June 25, 2026
Benchmarks? More like bench-sharks
Lies, Damn Lies and Database Benchmarks
Why “fastest database” claims have commenters crying “rigged race”
TLDR: QuestDB argues that popular “fastest database” tests can reward tools that are tailored for the contest instead of actually better overall. Commenters mostly reacted with cynical laughter, saying every benchmark gets gamed sooner or later — with several comparing it to today’s AI leaderboard drama.
A spicy new post from QuestDB basically says the tech world’s favorite flex — benchmark charts — can be wildly misleading, and the comments instantly turned into a group therapy session for people traumatized by “Number 1” performance claims. The article uses a gloriously goofy analogy: comparing databases can be less like a fair footrace and more like making runners sprint while whistling “Yellow Submarine.” In plain English, the company argues that when different kinds of data tools are judged with one test, the winner may just be the one best tuned for that exact contest, not the one that’s best in real life.
That was enough to set off a familiar online chorus: if there’s a benchmark, somebody will game it. One commenter said it reminded them of the recent Terminal Bench mess, while another sighed that this is now basically the story of AI chatbot scoreboards too. The strongest mood in the thread? Deep cynicism mixed with a weird amount of respect. People were nodding along to the claim that companies optimize for the test, not the truth, but also praising the post itself as unusually honest and well written.
The tiny drama twist: not everyone fully bought the complaint. One old-school veteran rolled in with a history lesson about the 1990s database wars, saying this kind of benchmark juicing is ancient news — and adding that if you’re mad about “cold start” tests, you also need to explain how often that really matters in the real world. Meanwhile, another commenter dropped a nerdy subtweet in the form of a DuckDB benchmarking essay basically saying: yes, fair testing is hard, and yes, this mess has receipts.
Key Points
- The article argues that database benchmarks can mislead when readers treat them as universal proof that one database is faster than another.
- It uses ClickBench as an example benchmark, describing a workload of roughly 100 million rows, 105 columns, and 43 analytical queries.
- ClickBench measures both cold and hot query runs, with hot runs based on repeated executions after an initial cache-warming run.
- The article states that cold-run conditions are asymmetric because self-hosted systems can have caches cleared and servers restarted, while managed cloud services generally cannot.
- The article says it will focus on hot-run overall scores and cites ClickBench's ratio formula for comparing each query against the fastest system.
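
For readers who want to see what a "ratio against the fastest system" score looks like in practice, here is a rough Python sketch. It is not taken from the QuestDB post or from ClickBench's code. The 10 ms shift constant, the geometric mean across queries, and taking the hot time as the fastest run after the warm-up are assumptions based on ClickBench's published methodology as commonly described. The function names and the sample timings are made up for illustration.

```python
from math import exp, log

# Assumed shift (in seconds) added to both times so that near-zero
# timings do not dominate the ratio. Treat this value as an assumption.
SHIFT = 0.01


def hot_time(runs):
    """Return the hot-run time for one query.

    `runs` lists the timings in execution order; the first entry is the
    cache-warming (cold) run and is ignored here.
    """
    return min(runs[1:])


def relative_scores(hot_times):
    """Map each system to the geometric mean of its per-query ratios.

    `hot_times` is {system_name: [hot time for query 0, query 1, ...]}.
    Each ratio compares a system to the fastest system on that query.
    """
    n_queries = len(next(iter(hot_times.values())))
    best = [min(times[q] for times in hot_times.values()) for q in range(n_queries)]

    scores = {}
    for system, times in hot_times.items():
        ratios = [(SHIFT + t) / (SHIFT + b) for t, b in zip(times, best)]
        # Geometric mean, computed via logs for numerical stability.
        scores[system] = exp(sum(log(r) for r in ratios) / len(ratios))
    return scores


# Hypothetical timings in seconds, two systems and two queries.
example = relative_scores({"a": [0.09, 0.19], "b": [0.19, 0.39]})
```

A system that is fastest on every query scores 1.0, and everything else is a multiple of that. This is also where the article's complaint bites: a tool tuned to win these specific queries sets the baseline that everyone else is measured against.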