Test, Don't (Just) Verify

AI vows bug-free code; commenters split between 'prove it' and 'teach us how'

TLDR: The piece argues AI will push math-backed code proofs into the mainstream to kill bugs. Commenters split between hype skeptics, safety-first pragmatists, and folks begging for tutorials. The upshot: to make software safer in the real world, you need clear specs and proofs, not just more tests.

AI’s latest promise: use math to prove software is correct. The article says proof tools like Lean are booming, startups are flush with cash, and even big-name academics (Terry Tao, Martin Kleppmann, Ilya Sergey) are cheering. But the comment section? Instant split-screen drama. One camp nods along—AI helps us write what we can verify, not just what we can type. Another camp slams the brakes: “This nonsense again. No, it isn’t.” Popcorn deployed.

The practical crowd chimes in: we can ship code fast, but we can’t ship it safely, so proofs could be the new seatbelt. Others demand receipts: “Cool pitch, now give us a tutorial on Verification‑Guided Development,” and they even flag broken links. Teachers-at-heart add that learning invariants (simple rules that must always be true) makes everyday testing and assert() checks better, even if you never do full-on math proofs. Fans point to the famous case where random testing turned up numerous bugs in the big-name compilers GCC and Clang but none in the verified passes of CompCert, a proved C compiler. Skeptics counter that most software doesn’t even have a clear spec, so what exactly are we proving?
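For anyone wondering what the invariants-plus-assert() advice looks like in practice, here is a minimal Python sketch. Nothing in it comes from the article or the thread: the Inventory class and its method names are hypothetical, picked only to show one rule being re-checked after every change.

```python
class Inventory:
    """Toy stock counter (hypothetical example, not from the article)."""

    def __init__(self):
        self.on_hand = {}  # item name -> quantity in stock
        self.total = 0     # cached sum of all quantities

    def _check_invariant(self):
        # Invariant: every stored quantity is positive, and the cached
        # total always equals the sum of the per-item quantities.
        assert all(qty > 0 for qty in self.on_hand.values())
        assert self.total == sum(self.on_hand.values())

    def add(self, item, qty):
        if qty <= 0:
            raise ValueError("qty must be positive")
        self.on_hand[item] = self.on_hand.get(item, 0) + qty
        self.total += qty
        self._check_invariant()

    def remove(self, item, qty):
        have = self.on_hand.get(item, 0)
        if qty <= 0 or qty > have:
            raise ValueError("cannot remove that many")
        if qty == have:
            del self.on_hand[item]  # never keep a zero entry around
        else:
            self.on_hand[item] = have - qty
        self.total -= qty
        self._check_invariant()
```

This proves nothing in the math sense. The point is that once the rule is written down, any edit that forgets to update the total fails loudly on the first call instead of surfacing later as a mystery bug.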

A link-dumper keeps the discourse spicy with a previous thread. The vibe: hype train vs. homework club. Memes fly—“Lean into Lean,” “AI will eat your bug report”—but the chorus is clear: show, don’t tell.

Key Points

  • The article claims AI is accelerating the mainstreaming of formal verification, with rising use of proof assistants like Lean.
  • Two major obstacles are highlighted: most software lacks formal specifications, and proof engineering is complex and domain-specific.
  • LLMs are presented as naturally fitting specification-driven development, encouraging executable specifications and iterative optimization loops.
  • Testing cannot prove absence of bugs; the article cites SQLite as an example of tests missing defects.
  • CompCert’s verified compilation passes showed no bugs under random testing, contrasting with numerous bugs found in GCC and Clang.
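To make the "executable specification" and "optimization loop" points concrete, here is a rough Python sketch under stated assumptions. The problem (largest sum of a contiguous run of numbers) and all three function names are hypothetical illustrations, not taken from the article. The slow function plays the spec, the fast one plays the optimized candidate, and random inputs compare the two.

```python
import random


def spec_max_subarray(xs):
    """Executable spec: try every non-empty contiguous slice, keep the best sum."""
    return max(
        sum(xs[i:j])
        for i in range(len(xs))
        for j in range(i + 1, len(xs) + 1)
    )


def fast_max_subarray(xs):
    """Optimized candidate (Kadane's algorithm); assumes xs is non-empty."""
    best = current = xs[0]
    for x in xs[1:]:
        # Either extend the running slice or start a new one at x.
        current = max(x, current + x)
        best = max(best, current)
    return best


def check_against_spec(trials=500, seed=0):
    """Compare the candidate with the spec on small random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-10, 10) for _ in range(rng.randint(1, 8))]
        assert fast_max_subarray(xs) == spec_max_subarray(xs), xs
    return trials
```

A clean run only says the two functions agreed on the inputs that were tried, which is exactly the gap the "testing cannot prove absence of bugs" point describes. A proof assistant such as Lean would instead aim to show they agree on every input.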

Hottest takes

"This nonsense again. No. No it isn’t." — badgersnake
"We can write code a lot faster than we can safely deploy it" — getregistered
"...write a follow-up... Verification‑Guided Development?" — esafak