Claude Code's binary reveals silent A/B tests on core features

Paying devs say they’re lab rats; defenders fire back with “read the fine print”

TLDR: A user claims Claude Code ran silent experiments that changed how its planning feature works, triggering outrage from paying customers who felt used as test subjects. The community split fast: pros demand transparency and opt‑outs, while others say evolving AI tools can change under the terms—raising trust stakes for work-critical software.

The internet is in full meltdown mode over a bombshell claim: a paying user says they decompiled the Claude Code app and found secret A/B tests silently changing a core feature called “plan mode.” In plain English: some customers allegedly got a “cap” version that strips context, blocks prose, and slams plans to 40 lines—no conversation, just a wall of bullets. Cue rage. One camp is fuming: “This isn’t Instagram—I use this to work,” as posts demand transparency, opt-outs, and “no mystery switches” in a $200/month tool.

But the backlash has a backlash. A loud chorus is waving the terms of service, with one commenter snapping, “Did you assume a $200 subscription meant a frozen product?” Meanwhile, the cynics are having a field day: “Professional tool? Large language models aren’t even consistent.” Others are memeing the codename—“tengu_pewter_ledger” sounds like a Dark Souls boss—and joking that “cap” is fitting because it feels like, well, “cap.” Razengan drops the “I knew it” receipt, and the thread turns into a tug-of-war over trust versus iteration.

Amid the jokes, a practical question cuts through: is the test tied to the app install or the user account? Either way, the community’s headline is clear: don’t A/B test our workflow in silence—tell us and let us toggle it.

Key Points

•A user decompiled the Claude Code binary and found a GrowthBook-managed A/B test called “tengu_pewter_ledger.”
•The test controls how plan mode writes its final plan, with four variants: null, trim, cut, and cap.
•The “cap” variant limits plans to 40 lines, forbids context and prose, and prioritizes keeping file paths over narrative text.
•The user reports being assigned the “cap” variant, resulting in an auto-generated, terse plan with no Q/A phase or steering.
•Telemetry logs at “tengu_plan_exit” record the assigned variant and metrics (e.g., plan length, outcome), enabling performance correlation.

Hottest takes

“I knew it” — Razengan

“LLMs offer none of this, and A/B testing is just further proof” — reconnecting

“Did you assume a $200 subscription meant a frozen product?” — handfuloflight

March 14, 2026

Plan Mode, Meet Panic Mode

Paying devs say they’re lab rats; defenders fire back with “read the fine print”

Key Points

Hottest takes

March 14, 2026

Plan Mode, Meet Panic Mode

Claude Code's binary reveals silent A/B tests on core features

Paying devs say they’re lab rats; defenders fire back with “read the fine print”

Key Points

Hottest takes

Save News