6 Practices that turned AI from prototyper to workhorse (106 PRs in 14 days)

One dev, 106 pull requests: fans cheer, skeptics cry “AI wrote this”

TLDR: An open-source toolkit claims one dev shipped 106 changes in 14 days by tightly choreographing multiple AI bots, boosting quality but taking longer. Commenters split between excitement and suspicion, mocking the marketing, asking for proof, and debating whether strict “rails” make AI powerful or just overhyped packaging.

An engineer claims they turned AI from a flaky prototyper into a 24/7 factory hand using six strict rules, and the internet has Thoughts. The open-source toolkit, Codev, keeps specs and plans in the codebase, uses three separate AI “reviewers” to catch different mistakes (one even spotted a serious security issue that another missed), and enforces a step-by-step checklist so the bots can’t skip chores. The claimed result: 106 pull requests in 14 days, work equal to that of 3–4 people, at around $1.60 per change, though it takes longer to run. The author is in the thread waving receipts: a GitHub repository and a tour with raw results.
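
To illustrate why several reviewers can beat any single one, here is a minimal sketch that merges findings from three reviewers and reports how much of the combined set each one caught on its own. The reviewer names and issue labels are made up for illustration; this is not Codev's actual review code or data.

```python
from typing import Dict, Set


def merge_findings(findings_by_reviewer: Dict[str, Set[str]]) -> Set[str]:
    """Return the union of all issues reported by any reviewer."""
    merged: Set[str] = set()
    for issues in findings_by_reviewer.values():
        merged |= issues
    return merged


def coverage(findings_by_reviewer: Dict[str, Set[str]]) -> Dict[str, float]:
    """Fraction of the merged issue set that each reviewer found alone."""
    total = merge_findings(findings_by_reviewer)
    if not total:
        return {name: 0.0 for name in findings_by_reviewer}
    return {
        name: len(issues) / len(total)
        for name, issues in findings_by_reviewer.items()
    }


# Hypothetical findings, invented for this example only.
findings = {
    "reviewer_a": {"sql-injection", "off-by-one"},
    "reviewer_b": {"off-by-one", "missing-null-check"},
    "reviewer_c": {"race-condition"},
}

if __name__ == "__main__":
    print(sorted(merge_findings(findings)))
    print(coverage(findings))
```

In this toy data no reviewer covers more than half of the merged set, which is the shape of the claim in the post: the union is what provides the safety net.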

But the crowd is split. One commenter throws a grenade: “This original post looks AI-generated,” demanding the prompts. Another rolls their eyes at branding, quoting the “not a model, not an assistant, not an extension” spiel and snapping back, “Thanks for the clarification,” dripping sarcasm. Meanwhile, the meme machine kicks in: “Would you rather fight 100 AI workhorses or 1 workhorse AI?” becomes the day’s goofy poll. A veteran dev draws parallels to GitHub’s now-abandoned Spec Kit and asks for better command names and a friendlier cheat sheet. The vibe: hype vs. homework. Fans love the discipline (especially the three-reviewer safety net); skeptics see buzzword orchestration until proven otherwise. And everyone agrees: the rails are doing the real work — and they’re controversial.

Key Points

  • Specs and plans are stored in git with source code, ensuring context and traceability for AI agents.
  • A multi-model review using Claude, Gemini, and Codex caught 20 bugs pre-shipping; no single model found more than 55% of issues.
  • A state machine enforces the Spec → Plan → Implement → Review → PR process, requiring passing tests before advancement.
  • Agents coordinate other agents: an architect agent directs builder agents working in isolated git worktrees and communicating asynchronously.
  • The workflow managed the full lifecycle and produced 106 PRs in 14 days, with a reported code-quality score 1.2 points higher than Claude Code and a cost of about $1.60 per PR.
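
The Spec → Plan → Implement → Review → PR gating described in the key points can be sketched as a small state machine. This is a simplified illustration under stated assumptions, not Codev's implementation: the `Workflow` class and its `advance` method are hypothetical names, and "tests passed" is reduced to a boolean supplied by the caller.

```python
from enum import Enum


class Stage(Enum):
    SPEC = "spec"
    PLAN = "plan"
    IMPLEMENT = "implement"
    REVIEW = "review"
    PR = "pr"


# Enum members iterate in definition order, which is the order of the stages.
ORDER = list(Stage)


class Workflow:
    """Hypothetical gate: a stage can only be left when its tests pass."""

    def __init__(self) -> None:
        self.stage = Stage.SPEC

    def advance(self, tests_passed: bool) -> Stage:
        if self.stage is Stage.PR:
            raise RuntimeError("already at the final stage")
        if not tests_passed:
            raise RuntimeError(f"cannot leave {self.stage.value}: tests failing")
        # Move to the next stage only; skipping ahead is not possible.
        self.stage = ORDER[ORDER.index(self.stage) + 1]
        return self.stage


if __name__ == "__main__":
    wf = Workflow()
    wf.advance(tests_passed=True)       # spec -> plan
    try:
        wf.advance(tests_passed=False)  # refused; stays at plan
    except RuntimeError as err:
        print(err)
    print(wf.stage)
```

Because `advance` is the only way to change `stage`, an agent driving this object cannot jump from Spec straight to PR or move on with failing tests.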

Hottest takes

"This original post looks AI-generated." — trollbridge
"Thanks for the clarification, I couldn't have guessed otherwise." — skydhash
"Would you rather fight 100 AI workhorses or 1 workhorse AI?" — ddoottddoott