Ask HN: How are you LLM-coding in an established code base?

Startup claims $1k/mo buys '1.5 extra devs' — skeptics, tinkerers, and vibe coders clash

TL;DR: A startup says $1k a month in AI tools equals 1.5 extra engineers, thanks to bots cranking out tests and pull requests. The community fires back: skeptics say AI fails on real code, pragmatists demand better end-to-end testing, and old-school “vibe coders” just turn up the playlist.

A tiny startup just told Hacker News it pays about $1,000 a month for a swarm of AI coding tools and gets the equivalent of “1.5 extra junior/mid engineers” per developer. Cue the drama. The thread lit up with sharp divides: one camp says AI sidekicks are solid at writing tests and tidying code, the other says they crumble on real-world projects. The spiciest skeptic? dazamarquez, who uses AI for boring unit tests and then drops the hammer: “That aside, it’s pretty much useless.” Their gripe: AI can’t see enough of a big codebase at once, and costs stack up fast.

On the meme side, bitbasher declared they “vibe code” with a text editor and music player, turning the debate into AI interns vs coffee-fueled coders. Meanwhile, Sevii brought order to the chaos, arguing the startup’s slowdown is self-inflicted: build end-to-end tests and feed the results back to bots so they don’t keep guessing. Others, like jemiluv8, cheered the money math: if it cuts costs and raises quality, it’s a win. And weeksie shared a playbook where multiple AI tools, strict rule files, and stacked pull requests churn through reviews—though even they admitted some bots get “chatty.”
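Sevii's suggestion — run the end-to-end suite and feed the results back to the agent so it stops guessing — can be sketched as a tiny loop helper. This is a minimal illustration, not anything from the thread: the function name, the command, and the truncation limit are all made up for the example.

```python
import subprocess

def failure_report(cmd, max_chars=4000):
    """Run a verification command; return a trimmed failure report, or None if it passed.

    The tail of the output is kept because most test runners print their
    failure summary last -- that is the part an agent needs to see.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode == 0:
        return None  # green: the agent can stop iterating
    combined = (proc.stdout + "\n" + proc.stderr).strip()
    return combined[-max_chars:]

if __name__ == "__main__":
    # A deliberately failing command stands in for a real e2e suite.
    print(failure_report(["python", "-c", "raise SystemExit('2 tests failed')"]))
```

The point of the hedge-sized helper is the loop shape: pipe the report back into the agent's context instead of letting it re-guess, and stop when the report comes back empty.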

Bottom line: everyone agrees AI shines at grunt work, but the battle is over whether it can actually ship production-ready code without a grown-up testing pipeline.

Key Points

  • Startup uses LLMs across a monorepo with Python workflows and two Next.js apps, deploying via GitHub CI/CD to GCP and Vercel.
  • Engineers have access to Cursor Pro (Bugbot), Gemini Pro, OpenAI Pro, and optionally Claude Pro; model choice is flexible.
  • Issues are assigned to GitHub Copilot, which opens PRs; roughly 25% are mergeable as-is and another ~50% after review comments are addressed.
  • Coding standards are enforced via .cursor/rules and pre-commit hooks; Turborepo and uv are used for repo and Python management.
  • Pain points include complex local verification, lack of an end-to-end service to spin up infra and tests, model selection limits in Copilot, and friction with Cursor agents/worktrees.

Hottest takes

"That aside, it's pretty much useless." — dazamarquez
"I generally vibe code with vim and my playlist in Cmus." — bitbasher
"Not only is your lack of an integration testing pipeline slowing you down, it's also slowing your AI agents down." — Sevii
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.