April 28, 2026
Cheap first, fancy later
We decreased our LLM costs with Opus
They saved money with a smarter gatekeeper, but commenters say the title did all the heavy lifting
TL;DR: The company says it lowered costs by using a cheaper AI helper to weed out repeat problems before waking the expensive one. Commenters mostly agreed that’s the whole story—while roasting the headline, mocking the complexity, and asking why simple software rules weren’t enough.
A company says it cut its artificial intelligence bill by putting a cheaper helper in front of the pricey brain: most broken software tests get stopped early as repeats, and only the genuinely new messes are sent up the chain. Instead of dumping giant log files into the system, they let the tool go fetch what it needs. On paper, it’s a tidy money-saving story. In the comments, though, the crowd was far more interested in calling out what they saw as headline gymnastics and asking whether the whole thing could have been explained in one brutally simple sentence.
That one-liner became the unofficial meme of the thread. Multiple commenters basically rewrote the article as: “Let a cheap agent decide if the expensive one is needed.” Ouch. One reader flat-out called the title misleading, while another accused the post of being clickbait dressed up as architecture wisdom. Then came the practical skeptics: why use an AI middleman for tasks like “have we seen this before” when, as one commenter snarked, “re.match() is cheaper”? Others pushed back on the article’s claim that you shouldn’t guide the investigation too much, arguing that giving helpful clues isn’t bias, it’s just common sense.
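The skeptics' point, for what it's worth, is that many repeat failures can be caught by a plain rule before any model is invoked. A minimal sketch of that idea (the patterns are invented examples, not from the article):

```python
import re

# Hypothetical rule-based pre-filter: known flaky-failure signatures
# as regular expressions. Patterns here are illustrative only.
KNOWN_FLAKES = [
    re.compile(r"TimeoutError: .* exceeded \d+s"),
    re.compile(r"ConnectionResetError"),
]

def is_known_flake(error_message: str) -> bool:
    """True if the message matches any known flaky-failure pattern."""
    return any(p.search(error_message) for p in KNOWN_FLAKES)

print(is_known_flake("TimeoutError: test_login exceeded 30s"))  # True
print(is_known_flake("AssertionError: expected 3, got 4"))      # False
```

This is the "re.match() is cheaper" camp's position in code: no tokens spent at all on failures that a maintained pattern list already recognizes.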
So yes, the company’s trick is real: a low-cost screener filters out duplicates and the expensive model only handles the hard stuff. But the real show was the comment section, where readers turned a cost-cutting case study into a referendum on clickbait titles, overcomplicated AI workflows, and whether some problems still just want a plain old rule-based tool.
Key Points
- The article says the team reduced LLM costs by routing most CI failures through a cheaper Haiku triage agent and escalating only unresolved cases to Opus 4.6.
- In one reported week, the system analyzed about 4,000 CI failures: 818 were new problems and 3,187 were already-known issues.
- For duplicate detection, Haiku combines exact-match search over prior error messages with semantic search backed by pgvector.
- The system copes with very large logs by giving agents SQL access to ClickHouse rather than placing raw log payloads directly into prompts.
- Opus acts as the planner for deeper investigations, while Haiku sub-agents perform constrained evidence-gathering tasks, with delegation limited to one level deep.
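The cheap-first routing described above can be sketched in a few lines. This is an illustration of the pattern, not the company's code; the function names and the exact-match shortcut are assumptions (the article's real duplicate check is fuzzier):

```python
# Toy version of the triage flow: repeats are closed by the cheap tier,
# only genuinely new failures escalate to the expensive model.

def triage(failures: list[str]) -> dict[str, list[str]]:
    """Split failures into duplicates (cheap tier) and escalations."""
    seen: set[str] = set()
    routed: dict[str, list[str]] = {"duplicate": [], "escalate": []}
    for msg in failures:
        if msg in seen:
            routed["duplicate"].append(msg)   # cheap model closes it out
        else:
            seen.add(msg)
            routed["escalate"].append(msg)    # expensive model investigates
    return routed

failures = [
    "TimeoutError in test_checkout",
    "TimeoutError in test_checkout",
    "Segfault in test_render",
]
routed = triage(failures)
# The repeat is filtered; two distinct failures escalate.
```

With roughly 3,187 of 4,000 weekly failures being repeats, this kind of gate is where the reported savings come from: the expensive model only ever sees the `escalate` bucket.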
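The exact-match-plus-semantic-search bullet can also be sketched without a database. Here a toy character-bigram embedding and cosine similarity stand in for pgvector and a real embedding model; the names and the 0.9 threshold are invented for illustration:

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Toy embedding: character-bigram counts (stand-in for a real model)."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[bigram] for bigram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def is_duplicate(msg: str, seen: set[str], threshold: float = 0.9) -> bool:
    """Exact match first, then a semantic-similarity fallback."""
    if msg in seen:
        return True
    e = embed(msg)
    return any(cosine(e, embed(prior)) >= threshold for prior in seen)

seen = {"TimeoutError in test_checkout after 30s"}
```

The two-step shape mirrors the article's description: cheap exact lookup catches verbatim repeats, and the vector search catches near-repeats whose messages differ in incidental details like timestamps or durations.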
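The "give the agent query access instead of pasting logs" idea reduces, in miniature, to exposing a narrow fetch tool. In this runnable toy an in-memory dict stands in for ClickHouse, and the store, run id, and log lines are all invented:

```python
# Stand-in for a log backend (the article says the real one is ClickHouse,
# queried via SQL). The point: the model's prompt receives only the
# handful of lines the tool returns, never the full log.
LOG_STORE: dict[str, list[tuple[str, str]]] = {
    "run-42": [
        ("INFO", "starting suite"),
        ("ERROR", "test_checkout: TimeoutError after 30s"),
        ("INFO", "retrying"),
        ("ERROR", "test_checkout: TimeoutError after 30s"),
    ],
}

def fetch_errors(run_id: str, limit: int = 10) -> list[str]:
    """Hypothetical agent tool: return only ERROR lines for a run."""
    rows = LOG_STORE.get(run_id, [])
    return [line for level, line in rows if level == "ERROR"][:limit]

print(fetch_errors("run-42"))
```

Keeping the log out of the prompt and behind a tool call is what lets the system handle logs far larger than any context window, at the cost of the agent needing to decide what to query.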