November 24, 2025
When memory gaslights you
Ask HN: Scheduling stateful nodes when MMAP makes memory accounting a lie
Server meltdown sparks 'just use Kubernetes' vs 'let nodes say no' brawl
TLDR: A coordinator misread “low rows” as low load and kept hammering a nearly full server, locking itself into a retry loop. Commenters split between Kubernetes-style memory reservations, latency-based backpressure, OS signals like PSI, and cheeky calls for ML to babysit metrics—because naive measurements can crash real systems.
A spicy Ask HN confessional lit up the crowd: a coordinator kept shoving data onto a “quiet” server because it had fewer rows, but the box was actually stuffed to the brim with chunky data and near out-of-memory. Cue chaos: the coordinator ignored the “I’m full” signals and basically DDoS’d its own node. The community’s verdict? Row count is a lie, memory is a drama queen, and your scheduler needs therapy.
Team Kubernetes swaggered in first: declare memory reservations so the scheduler treats your capacity like hard facts, regardless of lazy-loaded trickery. The performance purists snapped back: let latency be the truth—if a node gets slow or jittery, feed it less and close the loop to the balancer. The OS whisperers pulled out Pressure Stall Information (PSI), the Linux kernel’s measure of how much time tasks stall waiting on CPU, memory, or I/O, with a “look at active pages” wink. Then the chaos agents piled on with the hottest take: “NP-hard? Perfect for machine learning,” turning the “God Equation” meme into “let AI parent your cluster.”
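For readers who haven’t met PSI: on Linux it lives in `/proc/pressure/{cpu,memory,io}` as lines like `some avg10=12.50 ...`. A minimal Python sketch of parsing it and asking “is this node starved?” — the threshold and function names are illustrative, not from the thread:

```python
# Sketch of reading Linux PSI (Pressure Stall Information).
# A real node would read /proc/pressure/memory; here we parse a captured
# sample so the snippet is self-contained. The threshold is a made-up knob.

SAMPLE = """\
some avg10=12.50 avg60=3.10 avg300=0.80 total=123456
full avg10=4.20 avg60=1.00 avg300=0.20 total=65432
"""

def parse_psi(text):
    """Return {'some': {'avg10': ..., ...}, 'full': {...}} from PSI file text."""
    stats = {}
    for line in text.strip().splitlines():
        kind, rest = line.split(None, 1)
        stats[kind] = {k: float(v) for k, v in (kv.split("=") for kv in rest.split())}
    return stats

def memory_starved(text, threshold=10.0):
    # "some avg10" = % of the last 10 s in which at least one task was
    # stalled waiting for memory; above ~10% the node is hurting.
    return parse_psi(text)["some"]["avg10"] > threshold

print(memory_starved(SAMPLE))  # this sample node is past the threshold
```

Unlike RSS, PSI reports stall time rather than byte counts, so mmap and page-cache games can’t fake it: if tasks are waiting on memory, the number goes up.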
Meanwhile, pragmatists cheered a “dumb coordinator, smart nodes” plan: balance by disk space, let workers answer 429 (Too Many Requests) when stressed, and separate disk balancing from memory-heavy query work. Peak meme: “mmap is gaslighting your RAM.” Checkmate.
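The “let workers 429” idea is simple enough to sketch. Everything below is hypothetical naming, assuming a worker that knows its own memory budget and a coordinator that backs off exponentially when refused:

```python
# Hypothetical worker-side admission check: the worker, not the coordinator,
# decides whether a new segment fits, and answers with an HTTP status code.

def admit_segment(used_bytes, budget_bytes):
    """Return (status, headers): 429 tells the coordinator to back off."""
    if used_bytes > budget_bytes:
        return 429, {"Retry-After": "30"}
    return 204, {}  # segment accepted

def next_retry_delay(attempt, base=1.0, cap=60.0):
    """Coordinator side: capped exponential backoff so a full node
    isn't hammered in a tight retry loop."""
    return min(cap, base * 2 ** attempt)

# A ~197 GB box might keep headroom and budget only 190 GB:
status, headers = admit_segment(used_bytes=196 * 2**30, budget_bytes=190 * 2**30)
```

The point is the division of labor: the coordinator stays dumb and just honors the status code, while the node enforces its own limit using local knowledge that mmap can’t hide from it.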
Key Points
- A distributed stateful engine uses a Coordinator to assign data segments to Worker Nodes, with heavy reliance on mmap and lazy loading.
- A failure occurred when the Coordinator misread Node A’s low logical row count as underutilization and repeatedly tried to load new segments.
- Node A was near OOM (~197 GB RAM) due to very wide rows and large blobs, making row count a poor proxy for resource usage.
- OS page cache and lazy loading made application-level RSS and disk metrics unreliable for memory-aware scheduling.
- The author proposes options: rely on node-enforced backpressure (HTTP 429), build per-segment cost models, or decouple storage balancing from query/memory balancing, and seeks references.
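The “per-segment cost model” option amounts to scheduling on projected bytes instead of row counts. A sketch under invented assumptions — the resident-fraction knob and the node dicts are illustrative, not from the post:

```python
def segment_cost_bytes(row_count, avg_row_bytes, blob_bytes, resident_fraction=0.6):
    """Estimate the memory a segment will actually pin. resident_fraction is
    a guessed knob for how much of an mmap'd file stays hot; row count
    alone says nothing about wide rows or large blobs."""
    return (row_count * avg_row_bytes + blob_bytes) * resident_fraction

def pick_node(nodes, segment_bytes):
    """Place a segment on the node with the most projected headroom;
    refuse outright if nobody fits. Each node dict tracks 'capacity'
    and 'projected' bytes."""
    best = max(nodes, key=lambda n: n["capacity"] - n["projected"])
    if best["capacity"] - best["projected"] < segment_bytes:
        return None  # better to reject than to OOM a "quiet" node
    best["projected"] += segment_bytes
    return best["name"]
```

Under this scheme, Node A’s wide rows and blobs would have priced it out of new segments long before OOM, even while its row count looked low.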