Compressed filesystems à la language models

https://news.ycombinator.com/rss Hits: 8
Summary

Every systems engineer at some point in their journey yearns to write a filesystem. This sounds daunting at first - and writing a battle-tested filesystem is hard - but the minimal surface area for a “working” FS is surprisingly small, simple, and in-distribution for coding agents.

In fact, one of my smoke tests for new coding models is seeing how good a filesystem they can one-shot! At some point, I had quite a few filesystems lying around - and coding models were getting pretty good - which made me wonder whether the models were intelligent enough to actually model the filesystem engine itself.

A filesystem is the perfect black-box API to model with wacky backends (see “Harder drives”), and besides the joy of training an LLM for fun, there were a few deeper truths about language models that I wanted to explore.

Training a filesystem

So I set about training a filesystem. Building on top of one of my throwaway FUSEs, a few rounds with Claude repurposed it to loop back against the host with added logging, the two things I needed to generate reference fine-tuning data:

    class LoggingLoopbackFS(LoggingMixIn, Operations):
        """
        A loopback FUSE filesystem that logs all operations for training data.

        This implementation delegates all filesystem operations to a real directory
        on the host filesystem, ensuring perfect semantic correctness while logging
        every operation for LLM training data.
        """

I then wrote a filesystem interaction simulator, which sampled various operations against a sandboxed LoggingLoopbackFS to generate diverse FUSE prompt/completion pairs. Concretely, I captured only the minimal set of operations needed for R/W-ish capability (no open, xattrs, fsync, etc.).

Alongside the FUSE operation, I captured the full filesystem state at every turn.
I experimented with various formats, including an ASCII-art representation, but ultimately settled on XML since it enforced prompt boundaries clearly and had canonical parsers available. With prompts including the FUSE operation + XML fi...
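
To make the data-generation loop described above more concrete, here is a minimal sketch using only the Python standard library. It is not the author's simulator: it applies operations directly to a temporary directory rather than through a FUSE mount. The operation set (`mkdir`, `write`, `unlink`), the XML tag names, and the helpers `snapshot_xml`, `sample_op`, `apply_op`, and `generate_pairs` are all hypothetical, and it assumes the completion is simply the post-operation state.

```python
import os
import random
import tempfile
import xml.etree.ElementTree as ET


def snapshot_xml(root_dir: str) -> str:
    """Serialize the full tree under root_dir as one XML string (assumed schema)."""
    def build(path: str, name: str) -> ET.Element:
        if os.path.isdir(path):
            node = ET.Element("dir", name=name)
            # Sort children so the same state always yields the same XML.
            for child in sorted(os.listdir(path)):
                node.append(build(os.path.join(path, child), child))
        else:
            node = ET.Element("file", name=name)
            with open(path, "r", encoding="utf-8") as fh:
                node.text = fh.read()
        return node

    return ET.tostring(build(root_dir, "/"), encoding="unicode")


def sample_op(rng: random.Random, root_dir: str, step: int) -> dict:
    """Pick one hypothetical operation that is valid for the current state."""
    files = [n for n in sorted(os.listdir(root_dir))
             if os.path.isfile(os.path.join(root_dir, n))]
    name = rng.choice(["mkdir", "write"] + (["unlink"] if files else []))
    if name == "unlink":
        return {"name": "unlink", "path": rng.choice(files)}
    if name == "mkdir":
        # The step index keeps directory names unique across the run.
        return {"name": "mkdir", "path": f"d{step}"}
    return {"name": "write", "path": f"f{step}.txt",
            "data": rng.choice(["hello", "world", ""])}


def apply_op(root_dir: str, op: dict) -> None:
    """Apply the operation to the sandbox directory."""
    target = os.path.join(root_dir, op["path"])
    if op["name"] == "mkdir":
        os.mkdir(target)
    elif op["name"] == "write":
        with open(target, "w", encoding="utf-8") as fh:
            fh.write(op["data"])
    elif op["name"] == "unlink":
        os.remove(target)
    else:
        raise ValueError(f"unsupported operation: {op['name']}")


def generate_pairs(n_steps: int, seed: int = 0) -> list:
    """Return (prompt, completion) pairs: operation + state before, state after."""
    rng = random.Random(seed)
    pairs = []
    with tempfile.TemporaryDirectory() as sandbox:
        for step in range(n_steps):
            before = snapshot_xml(sandbox)
            op = sample_op(rng, sandbox, step)
            apply_op(sandbox, op)
            op_xml = ET.tostring(ET.Element("op", attrib=op), encoding="unicode")
            pairs.append((op_xml + "\n" + before, snapshot_xml(sandbox)))
    return pairs
```

Calling `generate_pairs(5, seed=1)` yields five pairs in which each prompt's state is the previous completion. Because both halves are XML, a standard parser can check that every sample is well-formed before it goes into a training set.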

First seen: 2025-11-26 22:32

Last seen: 2025-11-27 05:33