Linux Page Faults, MMAP, and userfaultfd for faster VM boots

Linux memory “magic trick” promises faster VM restarts — cue the Windows vs. Linux bickering

TLDR: A Linux deep-dive shows how apps can “catch” missing-memory moments to quickly revive VMs by loading memory on demand. Commenters split between a battle-scarred warning that it bogs down with many CPU cores and a cheeky “Windows did it first?” jab—making clear the speed is real, but scaling and OS pride are the plot.

A deep-dive on Linux’s memory “IOU” system—where apps get pretend memory that only turns real when they touch it—lit up the comments for its big promise: faster virtual machine (VM) restores. The post explains how Linux delays real memory until first use (that “page fault” moment), and how a tool called userfaultfd lets apps catch those faults to lazily stream memory back in. Translation: VMs can boot fast while memory fills in on demand.
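To make the "IOU" idea concrete, here is a minimal sketch (not taken from the original post) that maps anonymous memory and watches the process's resident set size before and after touching it. It assumes a Linux system with /proc mounted. The 32 MiB size and the helper names `resident_bytes` and `demo` are made up for illustration.

```python
import mmap

PAGE = mmap.PAGESIZE
SIZE = 32 * 1024 * 1024  # 32 MiB, an arbitrary size chosen for this demo


def resident_bytes():
    """Return this process's resident set size in bytes (Linux only)."""
    with open("/proc/self/statm") as f:
        # Second field of statm is the number of resident pages.
        resident_pages = int(f.read().split()[1])
    return resident_pages * PAGE


def demo():
    """Map SIZE bytes, then touch every page, sampling RSS at each step."""
    before = resident_bytes()
    # Anonymous private mapping: address space is reserved, nothing is backed yet.
    buf = mmap.mmap(-1, SIZE, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    after_map = resident_bytes()
    # Writing one byte per page forces a page fault for each page.
    for offset in range(0, SIZE, PAGE):
        buf[offset] = 1
    after_touch = resident_bytes()
    buf.close()
    return before, after_map, after_touch


if __name__ == "__main__":
    before, after_map, after_touch = demo()
    print("growth after mmap :", after_map - before, "bytes")
    print("growth after touch:", after_touch - after_map, "bytes")
```

On a typical system the first number should be close to zero and the second close to the mapping size, which is the lazy behavior the post builds on. userfaultfd goes one step further by letting a user-space handler, rather than the kernel, supply the contents of each faulting page.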

But the comments? Spicy. One engineer, fresh from Google Cloud’s live migration trenches, basically said: great idea, terrible at scale. Their take: once you throw lots of vCPUs at it, faults get very slow, which they attributed to locking in the kernel, so they worked to avoid userfaultfd entirely. Cue the meme: “more cores, more problems.” Another commenter lobbed a grenade: isn’t this just what Windows has had forever? And boom, classic OS rivalry vibes. Fans pointed out that userfaultfd has been in Linux since kernel 4.3 (with extra goodies in 4.11), while skeptics rolled their eyes at “catching up” claims.

Meanwhile, the crowd giggled at a “kennel” vs. “kernel” typo, because of course the Linux dogs are herding those page faults. Bottom line: the tech is clever, the speedups are real, but the peanut gallery says beware the scaling gremlins—and the Windows drive-by comparisons.

Key Points

  • The article aims to speed VM snapshot restore by lazily populating guest memory, focusing on Linux page faults, mmap, and userfaultfd.
  • mmap creates a VMA describing a mapping but does not allocate physical pages or install page table entries until first access.
  • On first access, a missing page table entry triggers a page fault; the kernel allocates or loads a page, installs the entry, and resumes execution (demand paging).
  • Anonymous private mappings (MAP_PRIVATE | MAP_ANONYMOUS) are backed by the page allocator, become copy-on-write across fork, and are what malloc uses for large allocations.
  • Anonymous shared mappings (MAP_SHARED | MAP_ANONYMOUS) are backed by tmpfs, allowing multiple processes (e.g., VMM and device backends) to share updates.
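The difference between the last two bullets can be seen with a small fork experiment. This is a sketch under assumptions, not code from the article: it assumes Linux (or another Unix with `os.fork`), and the helper name `child_write_visible` is hypothetical.

```python
import mmap
import os


def child_write_visible(flags):
    """Fork, let the child write one byte, report whether the parent sees it."""
    buf = mmap.mmap(-1, mmap.PAGESIZE, flags=flags | mmap.MAP_ANONYMOUS)
    buf[0] = 0
    pid = os.fork()
    if pid == 0:
        # Child: write to the mapping, then exit without running cleanup handlers.
        buf[0] = 42
        os._exit(0)
    os.waitpid(pid, 0)
    seen = buf[0] == 42
    buf.close()
    return seen


if __name__ == "__main__":
    # MAP_PRIVATE: the child's write lands on its own copy-on-write page.
    print("private mapping, parent sees write:", child_write_visible(mmap.MAP_PRIVATE))
    # MAP_SHARED: both processes reference the same underlying pages.
    print("shared mapping, parent sees write: ", child_write_visible(mmap.MAP_SHARED))
```

With a private mapping the parent should not see the child's write, and with a shared mapping it should. That sharing is the property that lets a VMM and separate device backend processes observe the same guest memory.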

Hottest takes

"faults become veeeerry slow as the number of vcpus scales" — anlsh
"worked... to avoid it entirely" — anlsh
"Is this the same feature Windows has had forever" — dataflow