April 13, 2026

Verified code, unverified chaos

Lean proved this program correct; then I found a bug

Commenters cry clickbait: the “bug” wasn’t in the app—it was the engine

TLDR: AI fuzzing found no flaws in the math-checked code, but did expose a crash in Lean’s runtime and a denial-of-service in an unverified parser. Commenters clapped back at the “gotcha” framing and argued the real lesson: proofs are powerful, but only as strong as the spec and the surrounding system.

A researcher pointed AI tools at a “proven-correct” compression tool built in the Lean proof system—and the internet immediately lit up. The headline teased a bug in the verified code, but commenters pounced to say the truth was way messier: the verified part held up, while the crash came from the Lean runtime itself. Translation for non-nerds: the app’s logic was fine; the engine under the hood coughed.

The crowd split into camps. One side rolled their eyes at the framing: the top-voted sentiment was “no bugs in the proven code,” and it accused the post of manufacturing drama. Another group shouted, “it’s always the parser!” after a denial-of-service was found in an unverified file reader. Meanwhile, spec philosophers turned up to warn that proving the wrong thing perfectly is still wrong, sparking think-pieces about the “spec gap” and how formal proofs can’t save you from bad assumptions or shaky runtimes.
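For anyone who wants the “spec gap” in concrete terms, here is a toy sketch in Python. It has nothing to do with lean-zip or Lean itself. The function names (`lazy_sort`, `meets_weak_spec`, `meets_full_spec`) are made up for illustration. It shows a "sort" that passes a weak spec (the output is in order) while being useless, because the spec never says the output must contain the same elements as the input.

```python
from collections import Counter


def is_sorted(xs):
    """Weak property: every element is <= the one after it."""
    return all(a <= b for a, b in zip(xs, xs[1:]))


def lazy_sort(xs):
    """Hypothetical 'sort' that ignores its input and returns nothing."""
    return []


def meets_weak_spec(sort_fn, xs):
    # Weak spec: the output is in order. An empty list trivially qualifies.
    return is_sorted(sort_fn(xs))


def meets_full_spec(sort_fn, xs):
    # Fuller spec: the output is in order AND is a permutation of the input.
    out = sort_fn(xs)
    return is_sorted(out) and Counter(out) == Counter(xs)


data = [3, 1, 2]
print(meets_weak_spec(lazy_sort, data))  # True: the weak spec is satisfied
print(meets_full_spec(lazy_sort, data))  # False: the elements went missing
print(meets_full_spec(sorted, data))     # True: the built-in passes both
```

A proof against the weak spec would be airtight and still tell you nothing useful, which is the commenters' point about proving the wrong thing.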

Drama bonus: the post name-dropped AI agents and hyped tales of unreleased “too dangerous” models, feeding the AI-doomer meme machine. Jokes flew fast—“you proved the lock, but the doorframe is rotten,” “proof assistant needs a proof-of-life,” and “Lean runtime got a little too lean.” Tools like AFL++ got shoutouts, but the community’s verdict was loud: proofs help—but only within their borders, and everything around them still needs hardening.

Key Points

  • A formally verified Lean-based zlib implementation (lean-zip) was fuzz-tested using an AI agent and multiple analysis tools.
  • No memory vulnerabilities were found in the verified Lean application code across more than 105 million executions.
  • A heap buffer overflow was discovered in the Lean 4 runtime (lean_alloc_sarray); a bug report has been filed and a fix is pending.
  • A denial-of-service issue was found in lean-zip’s archive parser, which was not part of the verified specification.
  • The experiment underscores that formal proofs held for the verified code while vulnerabilities existed in the runtime and unverified components.
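
To give a feel for what "fuzz-testing a zlib implementation" checks, here is a minimal sketch using Python's standard-library `zlib` module as a stand-in. It is not the researcher's setup: it does not touch lean-zip, it uses plain random mutation rather than a coverage-guided fuzzer like AFL++, and the function name `fuzz_roundtrip` is hypothetical. It checks two properties: compressing then decompressing returns the original bytes, and a corrupted stream fails cleanly instead of crashing.

```python
import random
import zlib


def fuzz_roundtrip(iterations=1000, seed=0):
    """Run round-trip and corrupted-input checks; return how many inputs were tried."""
    rng = random.Random(seed)  # fixed seed so any failure is reproducible
    for _ in range(iterations):
        size = rng.randrange(0, 512)
        data = bytes(rng.getrandbits(8) for _ in range(size))

        # Property 1: decompress(compress(x)) == x
        compressed = zlib.compress(data)
        assert zlib.decompress(compressed) == data

        # Property 2: flipping one byte may make the stream invalid, but the
        # decoder should reject it with zlib.error rather than crash.
        corrupted = bytearray(compressed)
        corrupted[rng.randrange(len(corrupted))] ^= 0xFF
        try:
            zlib.decompress(bytes(corrupted))
        except zlib.error:
            pass  # a clean rejection is the expected outcome
    return iterations


print(fuzz_roundtrip(200))
```

A harness like this only exercises the code paths it happens to reach, which is why a crash in the runtime or a hang in a separate parser can surface even when the verified core never misbehaves.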

Hottest takes

"The author, in fact, found no bugs or errors in the proven code." — ctmnt
"Not verifying the parser seems like a pretty big oversight." — lmm
"If your program was for the wrong thing, a proof of it is also wrong." — dchftcs