Anthropic silently downgraded cache TTL from 1h → 5M on March 6th

Anthropic’s quiet cache switch has users fuming over higher bills and broken trust

TLDR: Users say Anthropic shrank its “prompt memory” from 1 hour to 5 minutes, raising costs and pushing people over limits. Comments split between cash‑grab accusations, a server‑strain theory, trust‑me pleas, and jokes about confusing “5m” with five months — all while quality concerns add fuel to the fire.

Plot twist: power users say Anthropic quietly shortened how long it keeps “prompt memory” from one hour to just five minutes — and then watched bills climb. In plain English: the system’s short-term savings jar got smaller, so people had to refill it more often, paying extra each time. The data hounds claim the shift started March 6 and by March 8 the five‑minute window dominated, triggering a 20–32% spike in costs and new quota hits for folks who’d never hit limits before. Cue the comment-section fireworks. Some are connecting it to Anthropic’s late‑March peak‑hour woes, arguing the mini‑memory caused more system load, which then justified throttling. Others are going full legal-thriller, saying the breadcrumb trail of logs makes “proving damages” look easy. The vibe? Trust is wobbling. One plea summed it up: we’ll pay more, just stop the stealthy switches. There’s even nerd comedy: a mini flame war over whether “5M” reads as five minutes or five months, while pedants chant “it’s ‘min’!” Meanwhile, frustration spills into product quality chatter — the infamous “car wash question” fails are back, with gripes about the system over‑dramatizing tasks and tapping out. If you’re counting pennies, the pricing link suddenly looks scarier — and Reddit’s bringing popcorn.

Key Points

  • Analysis of 119,866 Claude Code API calls (Jan 11–Apr 11, 2026) indicates Anthropic shifted default prompt cache TTL from 1 hour to 5 minutes in early March 2026.
  • Data from two machines (Linux and Windows) and different accounts showed identical timing: 33+ days of 1h-only TTL through Mar 5, then a transition Mar 6–7, and 5m dominance from Mar 8 onward.
  • Day-level logs show first 5m tokens reappearing Mar 6; by Mar 8 about 83% of cache writes were 5m TTL.
  • No client-side changes occurred during the period; TTL tier is set server-side by Anthropic, according to the analysis.
  • Using Anthropic pricing, the shift increased cache creation costs by an estimated 20–32% and produced $949.08 (17.1%) excess cost across the dataset, with March showing ~25.9% waste.

Hottest takes

“proving damages and a pattern is becoming trivial” — sscaryterry
“This coincides with [the] peak-hour announcement… throttling [a] response to load inflated by the TTL regression?” — Tarcroi
“The SI symbol for minutes is ‘min’, not ‘M’” — cassianoleal
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.