Thoughts on the Word Spec in Rust

Summary

Tritium lives in the Word spec because to deliver great legal tech, we think we need to own the word processor. The Word spec is giant. It provides that a valid docx file may contain something like the below XML:

    <body>
      <tbl>
        ...
        <p>
          ...
          <tbl>...</tbl>
        </p>
      </tbl>
    </body>

In other words, it supports essentially unbounded nesting of paragraphs and tables. And since Word was written in C/C++, which is happy to work with multiple mutable ownership, such deeply nested structures are no problem there. But they're hard to do right in Rust.

So, where to start? An excellent first place was the docx_rs crate maintained by bokuweb. bokuweb's work seems to follow along the lines of python-docx in creating an excellent API for generating Word documents. From the repo:

    use docx_rs::*;

    pub fn hello() -> Result<(), DocxError> {
        let path = std::path::Path::new("./hello.docx");
        let file = std::fs::File::create(path).unwrap();
        Docx::new()
            .add_paragraph(Paragraph::new().add_run(Run::new().add_text("Hello")))
            .build()
            .pack(file)?;
        Ok(())
    }

It also supports reading. Ingesting a Word file with libtritium would look something like the below.

    pub fn main() {
        let bytes = libtritium::fs::slurp_path("./hello_world.docx").unwrap();
        let docx = docx_rs::read_docx(&bytes).unwrap();
        // The first child of the document body is expected to be a paragraph.
        let Some(docx_rs::DocumentChild::Paragraph(paragraph)) = docx.document.children.first() else {
            panic!("Expected a paragraph.");
        };
        println!("{}", paragraph.raw_text());
    }
    // Hello, World!

As a great Rust crate, it compiles to WASM and can be run on Web front ends. Amazing. It was instrumental in getting Tritium's first alpha versions off the ground.

But today, Tritium runs a custom docx module, written from scratch. Why? As with many other endeavours, if it's your core product, you need to own the stack or at least have control over its destiny. Tritium's core offering is making surgical edits to legacy legal documents.
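To make the nesting problem from the XML example concrete, here is a minimal sketch of one way such a structure could be modelled in Rust: a strictly owned tree, where every child has exactly one parent, walked recursively to edit paragraph text in place. All names here (`Block`, `Paragraph`, `Table`, `depth`, `edit_paragraphs`, `sample`) are hypothetical and chosen for illustration. This is not Tritium's docx module or the docx_rs data model, and it ignores almost everything the real spec requires.

```rust
// Hypothetical types for illustration only; not Tritium's or docx_rs's model.

/// A block-level node: either a paragraph or a table.
#[derive(Debug, Clone, PartialEq)]
pub enum Block {
    Paragraph(Paragraph),
    Table(Table),
}

/// A paragraph with its text and, as in the XML example, optional nested blocks.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Paragraph {
    pub text: String,
    pub children: Vec<Block>,
}

/// A table simplified to a list of cells, each cell owning a list of blocks.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Table {
    pub cells: Vec<Vec<Block>>,
}

/// Nesting depth of `block`, counting the block itself as one level.
pub fn depth(block: &Block) -> usize {
    match block {
        Block::Paragraph(p) => 1 + p.children.iter().map(depth).max().unwrap_or(0),
        Block::Table(t) => 1 + t.cells.iter().flatten().map(depth).max().unwrap_or(0),
    }
}

/// Applies `edit` to the text of every paragraph, however deeply it is nested.
/// Only one mutable borrow is live at any point in the walk.
pub fn edit_paragraphs(block: &mut Block, edit: &mut dyn FnMut(&mut String)) {
    match block {
        Block::Paragraph(p) => {
            edit(&mut p.text);
            for child in &mut p.children {
                edit_paragraphs(child, edit);
            }
        }
        Block::Table(t) => {
            for child in t.cells.iter_mut().flatten() {
                edit_paragraphs(child, edit);
            }
        }
    }
}

/// Builds the shape from the XML example: a table holding a paragraph
/// that itself holds another table with one paragraph.
pub fn sample() -> Block {
    let inner_table = Block::Table(Table {
        cells: vec![vec![Block::Paragraph(Paragraph {
            text: "inner".to_string(),
            children: Vec::new(),
        })]],
    });
    let paragraph = Block::Paragraph(Paragraph {
        text: "outer".to_string(),
        children: vec![inner_table],
    });
    Block::Table(Table {
        cells: vec![vec![paragraph]],
    })
}
```

Calling `edit_paragraphs(&mut doc, &mut |text| text.make_ascii_uppercase())` on the value returned by `sample()` would upper-case both paragraphs. The trade-off of an owned tree is that a node cannot hold a reference back to its parent or to a sibling, which is presumably part of what makes a full editor harder to write in Rust than this sketch suggests.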
While it doesn't have to implement the entire Word spec to be useful, Tritium needs to survive the below ro...

First seen: 2025-10-09 07:18

Last seen: 2025-10-09 13:19