April 18, 2026
Zero copies, max drama
Zero-Copy GPU Inference from WebAssembly on Apple Silicon
Macs get no‑copy AI boost — but devs shout “not in browsers” and “what about security”
TLDR: A dev showed that on Apple Silicon, WebAssembly code and the GPU can share the same memory for faster AI, no data copies needed. Commenters loved the speed but clashed over limits (it’s not in browsers), security worries, and whether you should just write native apps instead.
A solo dev claims a slick speed trick on Apple Silicon: WebAssembly (tiny, portable code that runs in a sandbox) and the GPU share the same memory, so data isn’t copied at all. The result? Zero‑copy AI inference with WebAssembly via Wasmtime and Apple’s Metal. It’s the backbone of an early project called Driftwood — and the internet immediately split into camps.
Skeptics pounced. User wmf threw cold water with a blunt reminder: this is native Wasmtime, not browsers, so don’t expect your web app to get free speed. Others, like saagarjha, pressed the practical angle: if this runs natively anyway, why not just write native code and skip WebAssembly? Meanwhile pjmlp lit up the thread with a spicy “Goodbye Wasm security” take, warning that sharing memory feels like a sandbox party foul. Defenders fired back that you still choose what to share — but the side‑eye was real.
Then the memes rolled in. Old‑school devs joked this is just “hello again” to how 8‑bit and 16‑bit consoles worked. One commenter ranted about AI‑generated prose (meta!), while others dubbed it “one pointer to rule them all.” Verdict: technically cool, browser‑limited, and drama‑rich — aka perfect internet news.
Key Points
- On Apple Silicon, a WebAssembly module’s linear memory can be directly shared with the GPU, enabling zero-copy data flow.
- The approach relies on Apple’s Unified Memory Architecture, where the CPU and GPU access the same physical memory.
- mmap on ARM64 macOS returns page-aligned memory, and pages there are 16 KB, which satisfies the page alignment Metal requires for no-copy buffers.
- Metal’s makeBuffer(bytesNoCopy:length:options:deallocator:) wraps an existing pointer without copying; matching pointer addresses and resident set size (RSS) measurements confirm that no copy occurs.
- Wasmtime’s MemoryCreator trait allows the Wasm linear memory to be backed by the same mmap region, enabling end-to-end in-place GPU computation.
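The mmap step in the list above can be sketched in C. This is a minimal illustration, not Driftwood's code: the helper names `round_up_to_page`, `alloc_shared_region`, and `is_page_aligned` are hypothetical, and the Metal and Wasmtime wiring is left out. It shows only that an anonymous mapping comes back page-aligned and that the length can be rounded up to a whole number of pages, the two properties a no-copy Metal buffer needs.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANON under strict glibc modes; ignored on macOS */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round `len` up to the next multiple of the system page size
 * (16 KB on Apple Silicon macOS, often 4 KB elsewhere). */
static size_t round_up_to_page(size_t len) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return (len + page - 1) / page * page;
}

/* Hypothetical helper: map an anonymous, zero-filled, read/write region
 * whose length is a whole number of pages. Returns NULL on failure
 * (including requested == 0, since mmap rejects a zero length). */
static void *alloc_shared_region(size_t requested, size_t *actual_len) {
    assert(actual_len != NULL);
    size_t len = round_up_to_page(requested);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED) {
        return NULL;
    }
    *actual_len = len;
    return p;
}

/* True if `p` sits on a page boundary, the alignment a no-copy
 * GPU buffer wrapper would check for. */
static int is_page_aligned(const void *p) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return ((uintptr_t)p % page) == 0;
}
```

In a full pipeline, the pointer and rounded length from `alloc_shared_region` would presumably be handed both to Metal's no-copy buffer call and to a Wasmtime MemoryCreator implementation, so that the Wasm module and the GPU address the same pages. Release the region with `munmap(p, len)` once neither side is using it.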