June 6, 2026

Cloudy with a chance of gaslighting

Lambda isn't leaking memory, your metrics are lying to you

Lambda’s not broken — the memory meter was gaslighting everyone, and the comments went feral

TLDR: The scary Lambda memory number wasn’t a per-request reading but a running all-time peak for the execution environment, which made a normal spike look like a leak. Commenters turned it into a three-way brawl over misleading dashboards, old memory-management baggage, and fatigue with AI-generated tech writing.

This story had all the ingredients for a classic tech meltdown: a customer’s cloud app kept ballooning from a few hundred megabytes to 9 gigabytes before getting unceremoniously killed. The team did what seemed obvious — shrink the cache, keep less stuff around — and somehow made the disaster way worse, triggering hundreds of crashes in hours. But the real plot twist was delicious: the memory number they were trusting apparently never resets, so the scary graph that looked like a leak was really more like a lifetime high score. In plain English, the dashboard was acting dramatic, and everyone believed it.
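
To see why a "lifetime high score" looks like a leak, here is a minimal sketch. The per-invocation numbers are made up for illustration, and the variable names are hypothetical; it only assumes the metric reports the largest value seen so far in an execution environment.

```python
from itertools import accumulate

# Hypothetical per-invocation peak memory in MB (made-up numbers, not from the article).
per_invocation_mb = [410, 395, 2300, 420, 405, 400]

# A high-water mark reports the largest value seen so far,
# so one spike sticks for every later invocation in the same environment.
high_water_mb = list(accumulate(per_invocation_mb, max))

for n, (actual, reported) in enumerate(zip(per_invocation_mb, high_water_mb), start=1):
    print(f"invocation {n}: actual peak {actual} MB, reported max {reported} MB")
```

After the single spike on the third invocation, the reported value stays at the spike level even though actual usage drops back to normal. Graph only the reported column and it looks like memory that never comes back.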

And oh, the comment section had thoughts. One camp zeroed in on the writing itself, with VulgarExigency roasting the now-familiar AI-polished postmortem style and joking that it read like “Claude write me a post-mortem” with a giant hero image slapped on top. That sparked a very 2026 side-drama: is the tech internet now trapped in one bland robot voice? Meanwhile, the nerd-fight crowd went straight for the deeper blame game. tpetry basically said, “Isn’t this just the same old memory-hoarding problem people have complained about for a decade?” and suggested different tools might have avoided the whole mess. Then sfink came in with the sharp correction: no, the memory isn’t fake just because the app code isn’t touching it — if your allocator is hoarding it, your process is still using it. So the vibes were split between “AWS metrics misled everyone”, “your software stack is the real villain”, and “can we please stop writing every post like an AI conference brochure?”

Key Points

  • A customer running ONNX inference on AWS Lambda saw occasional OOM failures, with some execution environments growing from about 400 MB to roughly 9 GB before being killed.
  • Reducing the `functools.lru_cache` size from 16 to 10 and then 8 increased instability, leading to more than 270 SIGKILLs in three hours.
  • The team reduced peak memory by removing duplicate in-memory model copies and loading ONNX Runtime sessions from temporary files, cutting p50 memory from about 7.5 GB to 5 GB and improving p99 latency.
  • The Lambda metric `@maxMemoryUsed` increased monotonically across 5,949 invocations spanning 3 customers and 3 regions, including workloads with zero ONNX models.
  • AWS confirmed that Max Memory Used behaves as an execution-environment high-water mark, not a per-invocation metric, making it unreliable as standalone evidence of a memory leak.
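
The summary does not say why shrinking the cache made things worse. The sketch below only illustrates the general effect of lowering `maxsize` on `functools.lru_cache`: more evictions and more reloads. The loader, the 12-model round-robin workload, and the helper name `count_loads` are all hypothetical stand-ins, not the customer's code.

```python
from functools import lru_cache


def count_loads(maxsize, requests):
    """Return how many times the hypothetical loader runs for a request sequence."""

    @lru_cache(maxsize=maxsize)
    def load_model(model_id):
        # Stand-in for an expensive model load (assumption for illustration).
        return f"model-{model_id}"

    for model_id in requests:
        load_model(model_id)
    # Each cache miss corresponds to one real call to the loader.
    return load_model.cache_info().misses


# Hypothetical workload: 12 distinct models requested round-robin, 5 passes.
requests = list(range(12)) * 5

for size in (16, 10, 8):
    print(f"maxsize={size}: {count_loads(size, requests)} loads")
```

With this access pattern, a cache of 16 loads each model once, while a cache of 10 or 8 evicts each entry just before it is needed again and reloads on every request. If each reload briefly needs extra memory, a smaller cache can mean more frequent spikes rather than a smaller footprint.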

Hottest takes

"god, I'm just so saturated of this writing style" — VulgarExigency
"the common solution is to use tcmalloc or jemalloc" — tpetry
"Your process is using that memory, for its allocator" — sfink