Compressed filesystems à la language models

AI runs a pretend hard drive; devs swoon, skeptics warn of GPU bills

TLDR: A developer fine-tuned an AI to mimic a filesystem, hitting ~98% test accuracy and sparking big reactions. Nostalgia and builder excitement collided with warnings about GPU costs, context-window limits, and text-only support, making this a flashy proof of concept that has the community split on real-world usefulness.

A coder trained a small AI to act like a “pretend hard drive,” answering file reads and writes like a chatty librarian, and the crowd went off. Old-school fans cheered the vibe, with one user quoting the line that every engineer secretly wants to build a filesystem and dropping a nostalgic tale of TRS‑80 storage woes. Builders got hyped too: one commenter said this was exactly the weekend project they were about to start, turning the thread into a virtual hackathon.

But the buzz wasn’t all rosy. The top skeptic listed brutal caveats: you need a big AI model, probably a pricey graphics card (GPU), the model’s short context window means it forgets things fast, and it only handles text. Translation: cool demo, questionable practicality. Meanwhile, nerd culture cameos arrived: someone linked Fabrice Bellard’s experimental ts_zip as a spiritual predecessor, and a glorious “mgddbsbdbd ddfk,d ,” comment became the meme of the day (“the filesystem compressed their sentence”).

For context: the author trained on simulated file actions (via FUSE, a way to build pretend filesystems in userspace) with neat XML snapshots, hitting ~98% test accuracy on a small Qwen model. The vibe? Hackers dream, skeptics simmer, memes flourish.

Key Points

  • A loopback FUSE filesystem with logging was built to generate reference training data.
  • A simulator produced diverse prompt–completion pairs, capturing minimal read/write operations and full filesystem state.
  • XML was selected to represent the filesystem tree in prompts for clarity and parsing reliability.
  • The model (Qwen3-4b) was fine-tuned via supervised fine-tuning (SFT) on Modal, achieving ~98% accuracy on a held-out eval after 8 epochs on ~15,000 examples.
  • A minimal FUSE filesystem was implemented where every operation is delegated to the LLM for responses.
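The prompt side of the pipeline described above (an XML snapshot of the filesystem tree paired with a file operation, fed to the model for a completion) can be sketched roughly like this. This is a minimal illustration, not the author’s actual format: the `tree_to_xml`/`build_prompt` names, the `<dir>`/`<file>`/`<state>`/`<op>` schema, and the stub model are all assumptions.

```python
import xml.etree.ElementTree as ET

def tree_to_xml(node, name="root"):
    """Serialize a nested dict (directories) / str (file contents)
    tree into an XML element. Schema is illustrative only."""
    if isinstance(node, dict):
        elem = ET.Element("dir", name=name)
        for child_name, child in sorted(node.items()):
            elem.append(tree_to_xml(child, child_name))
    else:
        elem = ET.Element("file", name=name)
        elem.text = node
    return elem

def build_prompt(tree, operation):
    """Pair an XML snapshot of the current filesystem state with the
    requested operation, forming the model's prompt."""
    snapshot = ET.tostring(tree_to_xml(tree), encoding="unicode")
    return f"<state>{snapshot}</state>\n<op>{operation}</op>"

def handle_op(tree, operation, model):
    """Delegate one filesystem operation to a language model.
    In the real project this would be the fine-tuned Qwen3-4b;
    here `model` is any callable taking a prompt string."""
    return model(build_prompt(tree, operation))

# Tiny in-memory filesystem and a stub model for demonstration.
fs = {"home": {"notes.txt": "hello"}}
prompt = build_prompt(fs, "read /home/notes.txt")
```

A FUSE wrapper (e.g. via fusepy) would then map each `read`/`write` callback onto `handle_op`, which is where the caveats in the thread bite: every operation ships the whole state through the context window.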

Hottest takes

"Every systems engineer at some point in their journey yearns to write a filesystem" — PaulHoule
"this was the first step that i was gonna work on this weekend" — endofreach
"you need an LLM, likely a GPU, all your data is in the context window (which we know scales poorly), and this only works on text data." — N_Lens
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.