April 2, 2026
SQL slapfight, AI delight
Enabling Codex to Analyze Two Decades of Hacker News Data
AI digs up 20 years of Hacker News — commenters demand receipts
TLDR: Codex, paired with a tool called Modolap, analyzed 20 years of Hacker News to compare language mentions and found that comments may be getting shorter. Commenters erupted over unclear methodology, asked whether the data is truly open, pushed for plain SQLite instead, and mocked ambiguities like counting the word “Go”. Methods matter.
An AI-fueled deep dive into two decades of Hacker News chatter lit up the comments more than the charts did. The post claims Codex, paired with a homegrown tool called Modolap, sifted a 10GB dataset to track shout-outs like Rust vs Go, Postgres vs MySQL, and whether comments are getting shorter. Early takeaway: a gradual shrink in comment length. The community’s takeaway: “Wait, what even is Modolap?”
Skeptics pounced. One top voice questioned why Modolap exists at all, asking how it’s different from just using any standard analytics engine. Another camp said: skip the fancy stuff and just use SQLite, with a commenter bragging it’s a one-prompt setup. Then came the “Go means go” problem: how do you count mentions of a programming language when it’s also, well, a normal word? A cheeky misread—“5% of comments mention Claude Code?!”—sparked laughs and side-eyes at the stats. Meanwhile, practical folks simply asked whether the data is actually open (it is, via Hugging Face).
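To make the “Go means go” complaint concrete, here is a minimal Python sketch of why a naive keyword match overcounts and what one possible fix looks like. The post does not say how mentions were actually counted, so everything here is an assumption for illustration: the function names are hypothetical, and the rule (accept "golang" anywhere, or a capitalized "Go" that is not the first word of a sentence) is just one plausible heuristic.

```python
import re

# Hypothetical heuristic, not the method used in the original analysis.
GOLANG = re.compile(r"\bgolang\b", re.IGNORECASE)
CAPITAL_GO = re.compile(r"\bGo\b")  # case-sensitive on purpose
NAIVE_GO = re.compile(r"\bgo\b", re.IGNORECASE)


def naive_mentions_go(text: str) -> bool:
    """Case-insensitive whole-word match: also catches the ordinary verb."""
    return NAIVE_GO.search(text) is not None


def mentions_go_language(text: str) -> bool:
    """Stricter guess: 'golang', or capitalized 'Go' in mid-sentence."""
    if GOLANG.search(text):
        return True
    for match in CAPITAL_GO.finditer(text):
        prefix = text[:match.start()].rstrip()
        # Skip "Go" at the very start of the text or right after
        # sentence-ending punctuation, where any word is capitalized.
        if prefix and prefix[-1] not in ".!?":
            return True
    return False


if __name__ == "__main__":
    samples = [
        "We rewrote the service in Go last year.",
        "Let's go with Postgres.",
        "Go ahead and try it.",
        "golang generics are fine",
    ]
    for sample in samples:
        print(naive_mentions_go(sample), mentions_go_language(sample), sample)
```

The stricter rule trades false positives for false negatives: a comment like "I agree. Go is great." is skipped because the word opens a sentence. Either way, the numbers depend heavily on which rule is chosen, which is the commenters' point.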
So the vibe? Tooling turf war meets word-nerd chaos. Fans of AI-assisted analysis are excited; purists want clean methods and clear docs. The real headline isn’t Rust vs Go—it’s AI vs SQL, hype vs homework, and the eternal Hacker News sport of calling out fuzzy methodology.
Key Points
- The full Hacker News dataset (~10GB) is stored in Apache Parquet files and available on Hugging Face.
- The author uses Codex with the Modolap skill (added via npx) to analyze the dataset.
- Codex generates queries for historical keyword-based topic mention analysis.
- Example comparisons include Rust vs Go, Codex vs Claude Code, and Postgres vs MySQL.
- An initial review suggests a gradual decline in median and average comment length over time.
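
For readers who want to sanity-check the comment-length claim themselves, the calculation can be sketched in a few lines of standard-library Python. This is not the query Codex generated. It assumes each comment has already been loaded as a hypothetical `(unix_timestamp, text)` pair, and it measures length in characters; the real dataset's column names and the post's exact definition of length are not given in the source.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean, median


def comment_length_by_year(comments):
    """Group comment lengths (in characters) by UTC year.

    `comments` is an iterable of (unix_timestamp, text) pairs; that shape
    is an assumption for this sketch, not the dataset's actual schema.
    """
    lengths = defaultdict(list)
    for timestamp, text in comments:
        year = datetime.fromtimestamp(timestamp, tz=timezone.utc).year
        lengths[year].append(len(text))
    # Return years in ascending order with both statistics side by side.
    return {
        year: {
            "median": median(values),
            "mean": mean(values),
            "count": len(values),
        }
        for year, values in sorted(lengths.items())
    }


if __name__ == "__main__":
    # Made-up sample data, purely to show the output shape.
    sample = [
        (1167609600, "a" * 100),  # 2007-01-01 UTC
        (1167609601, "b" * 300),  # 2007-01-01 UTC
        (1735689600, "c" * 50),   # 2025-01-01 UTC
    ]
    for year, stats in comment_length_by_year(sample).items():
        print(year, stats)
```

Reporting the median alongside the mean matters here because a handful of very long comments can pull the average up without changing what a typical comment looks like.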