April 10, 2026
Lockpocalypse Now
Investigating Split Locks on x86-64
One tiny lock can stall your whole PC — gamers vs. sysadmins erupt
TLDR: A badly placed atomic operation that spans two memory lines can stall modern chips; Intel’s Arrow Lake and AMD’s Zen 5 both stumble under tests. Commenters clash over Linux’s safety net that slows offending code—gamers decry lost frames while admins demand guardrails to keep shared systems stable.
One tiny misaligned “split lock” can turn a modern chip into stop‑and‑go traffic. A new test pushes a shared counter across cache‑line boundaries, making Intel Arrow Lake and AMD Zen 5 hit the brakes: Arrow Lake’s core‑to‑core jumps to 7 microseconds; Zen 5 does 500 ns but sees its bigger caches crater by 10x. Geekbench’s photo filter face‑plants; asset compression wobbles.
The comments? Pure chaos. anematode asks why anyone would ever do an unaligned atomic, while strstr drops a bomb that Linux’s split‑lock detector once “massacred” some games. That sparks the gamer‑vs‑sysadmin brawl: desktop players hate the slowdown, cloud folks love punishing the offending thread to protect everyone else. sidkshatriya says this only matters if you turn off the guardrails and still ship buggy code. Cold_Miserable shares a horror stat — 30 ns to 2000 ns on Golden Cove — while lifis asks the million‑dollar hardware question: why can’t CPUs just lock two lines? Memes fly about “bus lock = global traffic jam” and “noisy neighbors.”
Bottom line: a tiny misalignment can stall the whole party, and the crowd is split between “ban it,” “fix your code,” and “make the hardware smarter.”
Key Points
- •Split locks occur when atomic operations straddle two cache lines, forcing a slow bus lock on x86-64 CPUs.
- •Linux on newer Intel/AMD CPUs traps split locks by default and adds delays to mitigate noisy-neighbor effects.
- •A modified lock cmpxchg test that misaligns a 64-bit value across cache lines produces severe latency increases.
- •On Intel Arrow Lake, split locks push core-to-core latency to ~7 µs and roughly halve performance past L2; shared E-core L2s were unaffected.
- •On AMD Zen 5, split-lock latency is ~500 ns but L2/L3 performance drops by ~10×, heavily impacting both Geekbench 6 workloads.