March 8, 2026
Docs vs Bots: Fight Night
New Research Reassesses the Value of AGENTS.md Files for AI Coding
Study says auto-made cheat sheets slow AI; devs split between “do it right” and “ditch it”
TL;DR: An ETH Zurich study finds AI-written project docs can slow coding bots and raise costs, while human-written notes help only a little. Commenters are split between “docs done right are vital,” “these docs cause fake busywork,” and “ditch them for task-specific guidance”—a real-time debate on how to handle AI assistants.
ETH Zurich just poked a hornet’s nest: their new study says those AGENTS.md files—project “cheat sheets” for code bots—often don’t help, and when auto-written by AI they may actually make things worse. The numbers weren’t huge but they were spicy: AI-written docs dropped success rates slightly and cranked up costs, while human-written notes gave a small bump but still made bots run more steps, like AI interns getting graded on looking busy.
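For readers who haven’t met one: an AGENTS.md is just a markdown file at the repo root that coding agents read before working. A minimal hypothetical example (contents invented for illustration, not taken from the study):

```markdown
# AGENTS.md — hypothetical example

## Setup
- Install dev dependencies: `pip install -e ".[dev]"`

## Conventions
- Run `pytest` before committing any change
- Lint with `ruff check .` and fix warnings

## Gotchas
- Tests assume Python 3.11+; older versions fail on syntax
```

Instructions like the “run `pytest` before committing” line are exactly the kind the study’s trace analysis flagged: agents dutifully obey them on every task, even trivial ones, which is where the extra steps and costs come from.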
Cue the comment-section riot. One camp, led by verdverm, says the headlines are pure clickbait and insists good docs are gold if “done well.” Another camp, like noemit, plays referee: the study mostly dunked on auto-written docs, not human-written ones—yet warned these files can push agents into “fake thinking” (extra tests, extra scans, extra everything) without producing better fixes. Meanwhile, stingraycharles looked around at all the downvotes and asked, basically, “Is this a glitch or a mood?”
Then came the futurists. nayroclade called AGENTS.md a temporary relic from the “treat your bot like a junior dev” era: once agents think like seniors, they’ll make their own calls and won’t need hand-holding. The pragmatists chimed in too: skip the big permanent doc, give the agent short, task-specific notes, and move on.
Bottom line? The study rattled the “docs fix everything” crowd, and the comments turned into a culture war: write better docs vs don’t over-script the bots. Somewhere in the middle, everyone agrees on one thing—busywork isn’t brilliance.
Key Points
- ETH Zurich researchers evaluated repository-level AGENTS.md files and found they often hinder AI coding agents.
- They introduced AGENTbench with 138 real-world Python tasks from niche repositories to avoid benchmark memorization.
- Across four agents, LLM-generated context files reduced success by ~3% and increased steps and inference costs by >20%.
- Human-written context files yielded a modest ~4% success gain but still increased steps and costs by up to 19%.
- Trace analysis showed agents follow AGENTS.md instructions, triggering unnecessary extra work; the authors recommend omitting LLM-generated files and limiting human-written content to details the agent cannot infer on its own.