My LSM tree was slower than a B-tree. Then I profiled it

He built a faster database, and the comments instantly put it on trial

TLDR: A developer built a database storage engine and, by profiling where time was being wasted, pushed it from about 250,000 writes per second to nearly 2 million. Commenters, however, fixated on the messy parts: whether the speed claims were safe, whether some mistakes were embarrassingly obvious, and whether AI wrote part of the code.

A programmer tried to build the kind of storage system used under the hood by big-name databases, only to discover his homemade version was slower than the old-school rival it was supposed to beat. After some serious measuring and tinkering, he pushed it from a limp 250,000 writes a second to nearly 2 million. Impressive? Yes. But the real fireworks happened in the comments, where readers treated the whole thing like a live courtroom drama.

The loudest reaction was basically: "Cool speed boost, but is it even safe?" One commenter immediately slammed the brakes, warning that if you delay writing data to disk, you risk losing it in a crash, joking that "/dev/null is a webscale database" if all you care about is fake speed. Ouch. Others piled on over a hilariously bad filter setting in the project, mocking the idea that this kind of bug could only be found by testing. One especially snarky reply sneered that in the "prehistoric days" developers could actually think before running code. Meanwhile, another commenter played internet detective and dropped a GitHub repo link, then roasted an allegedly AI-generated benchmark tool.
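
For readers who want to see what the safety complaint is about, here is a minimal Go sketch of the trade-off. It is not the author's code: the article describes a memory-mapped WAL, while this illustration uses a plain append-only file with explicit fsync calls. The `WAL` type, `syncEvery` setting, and `runDemo` helper are hypothetical names invented for the example. The point is only that the fewer times you force data to disk, the more recent writes sit in the "lost on crash" window.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// WAL is a hypothetical append-only log, used only for this illustration.
type WAL struct {
	f         *os.File
	syncEvery int // fsync after this many appends; 1 means after every append
	pending   int // appends written since the last fsync (at risk in a crash)
	syncs     int // number of fsync calls made so far
}

// OpenWAL opens (or creates) the log file in append mode.
func OpenWAL(path string, syncEvery int) (*WAL, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return nil, err
	}
	return &WAL{f: f, syncEvery: syncEvery}, nil
}

// Append writes one record and fsyncs once enough records have piled up.
func (w *WAL) Append(rec []byte) error {
	if _, err := w.f.Write(rec); err != nil {
		return err
	}
	w.pending++
	if w.pending >= w.syncEvery {
		return w.Sync()
	}
	return nil
}

// Sync forces buffered data to stable storage and resets the at-risk counter.
func (w *WAL) Sync() error {
	if err := w.f.Sync(); err != nil {
		return err
	}
	w.syncs++
	w.pending = 0
	return nil
}

// Close closes the underlying file without an extra fsync.
func (w *WAL) Close() error { return w.f.Close() }

// runDemo appends n records to a temporary log and reports how many fsyncs
// happened and how many records were still unsynced at the end.
func runDemo(n, syncEvery int) (syncs, pending int, err error) {
	dir, err := os.MkdirTemp("", "wal-demo")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)

	w, err := OpenWAL(filepath.Join(dir, "wal.log"), syncEvery)
	if err != nil {
		return 0, 0, err
	}
	defer w.Close()

	for i := 0; i < n; i++ {
		if err := w.Append([]byte("record\n")); err != nil {
			return 0, 0, err
		}
	}
	return w.syncs, w.pending, nil
}

func main() {
	for _, every := range []int{1, 100} {
		syncs, pending, err := runDemo(250, every)
		if err != nil {
			panic(err)
		}
		fmt.Printf("syncEvery=%d fsyncs=%d records at risk=%d\n", every, syncs, pending)
	}
}
```

With `syncEvery` set to 1, every record is forced to disk before the next one is accepted, which is safe but slow. Raising it cuts the number of fsync calls sharply, and the `pending` counter shows exactly how many acknowledged writes a crash at that moment could take with it.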

So while the article is a satisfying underdog story about finding bottlenecks and fixing them one by one, the crowd turned it into something juicier: a debate about speed vs safety, craftsmanship vs "vibecoding," and whether clever profiling can save you from basic mistakes. In other words, classic programmer comment-section chaos.

Key Points

• Aasheesh Rathour built an LSM-tree storage engine in Go to better understand how RocksDB-style systems work.
• His first implementation achieved about 250,000 sequential writes per second and 120,000 random writes per second, making it slower than BoltDB in his tests.
• The article explains that B-trees suffer on write-heavy workloads because inserts become random disk writes, while LSM trees convert writes into mostly sequential disk activity through MemTables and SSTables.
• Profiling with pprof identified write syscalls, garbage collection, and sorting as the main CPU bottlenecks in the initial implementation.
• Replacing per-batch WAL writes with a memory-mapped file reduced write-syscall CPU usage from 34% to 2.16%, after which compaction became the next major optimization target.
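
The MemTable-to-SSTable idea in the key points can be sketched in a few lines of Go. This is a toy model under loud assumptions, not the author's engine: the `LSM` type, its `limit` field, and the in-memory `runs` slice (standing in for SSTable files on disk) are all invented for illustration, and there is no WAL or compaction. It only shows the core trick: writes land in an unsorted in-memory buffer, and each flush sorts that buffer once and emits it as a single sorted run.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is one key/value pair inside a flushed run.
type entry struct {
	Key, Value string
}

// LSM is a toy model of the write path: a mutable MemTable plus a list of
// immutable sorted runs that stand in for SSTables on disk.
type LSM struct {
	memtable map[string]string
	limit    int       // flush once the MemTable holds this many keys
	runs     [][]entry // oldest run first, newest run last
}

// NewLSM returns an empty store that flushes every limit distinct keys.
func NewLSM(limit int) *LSM {
	return &LSM{memtable: make(map[string]string), limit: limit}
}

// Put buffers the write in memory; no sorting happens until a flush.
func (l *LSM) Put(key, value string) {
	l.memtable[key] = value
	if len(l.memtable) >= l.limit {
		l.flush()
	}
}

// flush sorts the MemTable once and appends it as a single run. In a real
// engine this is where one large sequential file write would happen.
func (l *LSM) flush() {
	run := make([]entry, 0, len(l.memtable))
	for k, v := range l.memtable {
		run = append(run, entry{Key: k, Value: v})
	}
	sort.Slice(run, func(i, j int) bool { return run[i].Key < run[j].Key })
	l.runs = append(l.runs, run)
	l.memtable = make(map[string]string)
}

// Get checks the MemTable first, then binary-searches runs newest to oldest,
// so the most recent value for a key always wins.
func (l *LSM) Get(key string) (string, bool) {
	if v, ok := l.memtable[key]; ok {
		return v, true
	}
	for i := len(l.runs) - 1; i >= 0; i-- {
		run := l.runs[i]
		j := sort.Search(len(run), func(j int) bool { return run[j].Key >= key })
		if j < len(run) && run[j].Key == key {
			return run[j].Value, true
		}
	}
	return "", false
}

func main() {
	db := NewLSM(3)
	for _, k := range []string{"m", "c", "x", "a"} {
		db.Put(k, "v-"+k)
	}
	v, ok := db.Get("c")
	fmt.Println(v, ok, "flushed runs:", len(db.runs))
}
```

The price of cheap writes shows up in `Get`, which may have to look through several runs before finding a key. That growing pile of runs is the reason compaction exists, and the reason it became the next optimization target in the article.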

Hottest takes

"/dev/null is a webscale database" — jmalicki

"maybe if you're vibecoding it" — Retr0id

"I'm very amused by this obviously AI-generated benchmark program" — teraflop

June 18, 2026

Fast code, furious comments
