November 1, 2025
Cache me outside, how ’bout dat
Myths Programmers Believe about CPU Caches
Old cache myths resurface: devs feud over what’s really broken
TLDR: A revived blog says modern CPUs keep caches in sync and that “volatile” isn’t a slow path to memory. Commenters clash over whether the real culprit is cache behavior or memory reordering, with x86-vs-ARM debates and “terrible docs” jokes proving why this still confuses developers.
A 2018 blog post on CPU "cache myths" just resurfaced, and the internet immediately dusted off its boxing gloves. The post argues that modern chips keep their small, fast per-core memories in sync (a property called cache coherency) and that "volatile" reads aren't always slow trips to main memory. That sounds calming, until the crowd roared. One commenter linked the old HN thread, sparking a chorus of "we've been here before," while others insisted the same confusion still misleads devs today.
Then came the split: critics argued the real monster isn't caches going out of sync, it's CPUs and compilers reordering reads and writes behind your back, meaning you still need "fences" (memory barriers, or constructs like volatile that imply them) to make shared data safe. Cue the coherency vs. ordering cage match. One voice snarked that without "meddling software" we'd live in a perfect, synchronized world, and dunked on C++'s notoriously confusing memory model as "nightmarish." Another asked whether things work differently on ARM chips (the ones in many phones, which have a weaker memory-ordering model than x86), igniting the classic x86 vs. ARM flame war. A bold soul even floated the idea of letting apps control cache policy directly, and was met with a collective "please don't." The vibe: fascinating lesson, sure, but the comments are where the real education (and chaos) happens.
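To see why the debate is about ordering rather than stale caches, here is the classic "message passing" pattern in Java. This is a minimal sketch (the class name `MessagePassing` and the value 42 are illustrative, not from the post): without `volatile` on the flag, the compiler or CPU may reorder the two writes, so a reader could see `ready == true` while `data` is still 0. Marking the flag `volatile` makes the write act as the fence, establishing a happens-before edge.

```java
// Sketch: volatile as the "fence" in a writer/reader handoff.
class MessagePassing {
    int data = 0;                     // plain field, no ordering guarantees on its own
    volatile boolean ready = false;   // volatile write/read pair supplies the ordering

    void writer() {
        data = 42;       // plain write
        ready = true;    // volatile write: everything before it is visible to a
                         // thread that subsequently reads ready == true
    }

    int reader() {
        while (!ready) {             // volatile read; without volatile, this loop
            Thread.onSpinWait();     // could also spin forever on a cached false
        }
        return data;                 // guaranteed to be 42 here
    }

    public static void main(String[] args) throws InterruptedException {
        MessagePassing m = new MessagePassing();
        Thread r = new Thread(() -> System.out.println("reader saw " + m.reader()));
        r.start();
        m.writer();
        r.join();
    }
}
```

Note that this works not because `volatile` "bypasses the cache" (coherency hardware already keeps caches in sync) but because it forbids the reordering, which is exactly the distinction the commenters were fighting over.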
Key Points
- Modern x86 CPUs maintain cache coherency across cores via hardware protocols, keeping caches in sync.
- The common belief that stale cache values across cores are the primary source of concurrency issues is misleading.
- Java's volatile does not force all reads/writes to main memory; volatile reads can be as fast as L1 cache accesses.
- Main-memory access is roughly 200 times slower than L1 cache, countering the notion that volatile routinely bypasses caches.
- Concurrency bugs can occur even on single-core systems if appropriate synchronization constructs are not used.
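The last point, that concurrency bugs don't need multiple caches or even multiple cores, can be shown with a lost-update sketch (class and field names here are illustrative). `plain++` is a read-modify-write, so two threads can interleave, even via preemption on a single core, and overwrite each other's increments; an `AtomicInteger` performs the same update atomically.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: lost updates from an unsynchronized counter vs. an atomic one.
class LostUpdateDemo {
    static int plain = 0;                                  // unsynchronized counter
    static final AtomicInteger atomic = new AtomicInteger(); // atomic counter

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                plain++;                    // read-modify-write; increments can be lost
                atomic.incrementAndGet();   // atomic read-modify-write; never loses one
            }
        };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join();  t2.join();

        // atomic is always 200000; plain is typically (though not always) smaller
        System.out.println("plain=" + plain + " atomic=" + atomic.get());
    }
}
```

The failure here has nothing to do with caches going stale: the interleaving of read, increment, and write is the bug, which is why synchronization constructs are needed regardless of core count.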