Toward automated verification of unreviewed AI-generated code

AI code with no human eyes? Devs clap back: Not on my watch

TLDR: An engineer proposes shipping AI-written code without human review by relying on heavy automated tests and checks. Commenters love the idea in theory but slam the FizzBuzz demo as too simple, warn about high costs, and insist code reviews and test reviews are still essential.

Engineer Peter Lavigne says he’s ready to ship AI-written code without reading it, as long as machines “verify” it first. His test run used a simplified FizzBuzz and a robot gauntlet: property-based tests (randomized checks), mutation tests (inject tiny bugs to see if the tests catch them), a ban on side effects, plus Python type checks and linting. He argues maintainability matters less here because this code should be treated like compiled output. Overhead is high for now, but he shared a Python repo and a post, and frames the setup as a baseline that can improve.
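For readers who haven't met property-based testing, here is a minimal sketch of the idea using only Python's standard library. It is not Lavigne's code: the `fizzbuzz` function, the `holds_for_random_inputs` helper, and the chosen properties are all assumptions made for illustration, and real projects typically reach for a dedicated library instead of a hand-rolled loop.

```python
import random

ALLOWED_WORDS = {"Fizz", "Buzz", "FizzBuzz"}


def fizzbuzz(n: int) -> str:
    """Classic single-number FizzBuzz; a stand-in for agent-written code."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


def holds_for_random_inputs(func, trials: int = 1000, seed: int = 0) -> bool:
    """Check general properties on random inputs instead of fixed examples."""
    rng = random.Random(seed)  # seeded so a failure can be reproduced
    for _ in range(trials):
        n = rng.randint(1, 1_000_000)
        out = func(n)
        # Property 1: "Fizz" appears exactly when n is divisible by 3.
        if ("Fizz" in out) != (n % 3 == 0):
            return False
        # Property 2: "Buzz" appears exactly when n is divisible by 5.
        if ("Buzz" in out) != (n % 5 == 0):
            return False
        if n % 3 == 0 or n % 5 == 0:
            # Property 3: multiples of 3 or 5 yield one of the known words.
            if out not in ALLOWED_WORDS:
                return False
        elif out != str(n):
            # Property 4: every other number is echoed back as text.
            return False
    return True


if __name__ == "__main__":
    print(holds_for_random_inputs(fizzbuzz))  # the stand-in passes
    print(holds_for_random_inputs(str))       # a "do nothing" version fails
```

The point of the exercise: the checks describe what any correct answer must look like, so nobody has to read the implementation to believe it.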

Cue the fireworks. One camp cheers the tooling—“this will turbocharge testing”—but the louder chorus is skeptical. “These tests are pricey,” warns jghn, calling mutation testing compute-hungry. “Skipping code review is an absolute mistake,” says tedivm, citing too many AI faceplants. Ancalagon throws a classic paradox: if code goes unreviewed, who reviews the tests? And phailhaus drags the demo: FizzBuzz is trivial and proves nothing about growing a real product; sharkjacobs can’t see it scaling.

Memes fly: “AI intern left unsupervised,” “trust but verify but who verifies the verifier,” and “FizzBuzz-driven confidence.” Fans point to JustHTML and “no-human” Software Factory experiments; critics demand real-world complexity and cost math. For now, the vibe is: cool demo, come back with something messier.

Key Points

  • The author generated a simplified FizzBuzz solution with a coding agent and verified it using automated constraints.
  • Verification included property-based testing, mutation testing, enforcing no side effects, and Python type checking and linting.
  • He found these checks sufficient to trust AI-generated code without manual line-by-line review.
  • Maintainability and readability are considered less relevant in this context, treating output like compiled code.
  • The setup has higher overhead than manual review today but provides a baseline, with a Python repo available demonstrating the approach.
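
Mutation testing is the least familiar item on that list, so here is a rough sketch of how it works, again in plain Python. Everything in it is hypothetical: the `SOURCE` text, the hand-picked `MUTATIONS`, and the two tiny test suites are illustrative assumptions, not taken from the author's repo. Real mutation tools generate mutants automatically and in far greater numbers, which is where the compute cost raised in the comments comes from.

```python
# Source text of the function under test, kept as a string so it can be mutated.
SOURCE = '''
def fizzbuzz(n):
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)
'''

# Hand-picked "tiny bugs": (original snippet, buggy replacement).
MUTATIONS = [
    ("% 15", "% 10"),
    ("% 3", "% 4"),
    ("% 5", "% 7"),
    ("== 0", "!= 0"),
    ('"Fizz"', '"Fuzz"'),
]

# Two example-based suites: one thorough, one lazy.
STRONG_SUITE = {1: "1", 3: "Fizz", 5: "Buzz", 7: "7", 15: "FizzBuzz"}
WEAK_SUITE = {1: "1"}


def load(source: str):
    """Compile source text and return the fizzbuzz function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["fizzbuzz"]


def passes(func, suite: dict) -> bool:
    """True if the function returns the expected text for every case."""
    return all(func(n) == expected for n, expected in suite.items())


def mutation_score(suite: dict) -> tuple:
    """Return (killed, total): how many mutants the suite catches."""
    killed = 0
    for old, new in MUTATIONS:
        mutant_source = SOURCE.replace(old, new, 1)  # mutate first match only
        assert mutant_source != SOURCE, "mutation must change the code"
        if not passes(load(mutant_source), suite):
            killed += 1  # the suite noticed the injected bug
    return killed, len(MUTATIONS)


if __name__ == "__main__":
    print(mutation_score(STRONG_SUITE))  # (5, 5)
    print(mutation_score(WEAK_SUITE))    # (1, 5)
```

Both suites pass on the unmodified code, but the lazy one lets four of the five injected bugs survive. That gap is what mutation testing measures: whether the tests would actually notice if the code were wrong.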

Hottest takes

"they can be computationally expensive, especially mutation testing." — jghn
"it is an absolute mistake at this point in time." — tedivm
"Using FizzBuzz as your proxy for 'unreviewed code' is extremely misleading." — phailhaus