April 18, 2026
When math goes mini, the takes go maxi
4-bit floating point FP4
Tiny numbers, big feelings: GPUs rejoice, purists nitpick
TLDR: A post explains 4‑bit “FP4” floats—tiny numbers used to make AI models faster and smaller—highlighting the E2M1 format and emulation tools. The comments light up with GPU pragmatists cheering, history buffs correcting 80‑bit lore, and a mini‑debate over which 4‑bit format actually matters, from NF4 to MXFP4.
Meet FP4, the 4‑bit floating point format that packs math into a crumb of memory—and packs the comments with drama. The post walks through how FP4 works (think: a tiny sign, a tiny exponent, a tiny fraction) and even shows a table where, yes, there are two zeros (+0 and −0). It name‑drops Nvidia support for the E2M1 flavor and mentions Pychop, a Python library that pretends to be all these tiny formats.
But the community? Pure chaos. One camp is shouting “less is more,” arguing small numbers mean faster AI and happier GPUs. chrisjj fires the first shot with a spicy “someone didn’t try it on GPU,” dunking on the idea that more precision is always better. The history buffs charge in with ant6n’s “actually” about the old 80‑bit x87 days, because of course the past is complicated. Meanwhile, burnt‑resistor goes full deep‑dive with format anatomy, prompting half the thread to nod and the other half to blink slowly.
Then the plot twist: confusion over NF4, NVFP4, and MXFP4—different 4‑bit formats used in AI, some of which apply a shared scale across blocks of values—has readers asking for a guide they admit they won’t research. conaclos drops a minifloat link and the crowd discovers even 4‑bit floats can have infinities, NaNs, and memes about negative zero. Tiny floats, maximum feelings.
Key Points
- The article explains motivations for low-precision floating point in neural networks, emphasizing memory efficiency over precision.
- FP4 uses 1 sign bit and splits the remaining 3 bits between exponent and mantissa; variants include E3M0, E2M1, E1M2, and E0M3.
- E2M1 is the most common FP4 format and is supported in Nvidia hardware; a full enumeration of its 16 values is provided.
- Numeric mapping for normal values is (−1)^s · 2^(e−b) · (1 + m/2) with exponent bias b (b = 1 for E2M1); e = 0 is the subnormal case, where m = 0 encodes 0 and m = 1 encodes 1/2.
- Value spacing differs by format: E3M0 is log-uniform, E0M3 is linear-uniform, and E1M2/E2M1 are uneven; both +0 and −0 exist in FP4.
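The mapping above is compact enough to check by hand. Here is a minimal sketch that decodes all 16 E2M1 nibbles, assuming the common exponent bias of 1 and the subnormal rule for e = 0 (the function name and bit layout are illustrative, not from the article):

```python
def decode_e2m1(nibble: int) -> float:
    """Decode a 4-bit E2M1 value: 1 sign bit, 2 exponent bits, 1 mantissa bit.

    Assumes exponent bias b = 1. Normal values follow
    (-1)^s * 2^(e - b) * (1 + m/2); e == 0 is the subnormal case,
    where the magnitude is simply m/2 (so 0 or 0.5).
    """
    s = (nibble >> 3) & 1      # sign bit
    e = (nibble >> 1) & 0b11   # 2-bit exponent field
    m = nibble & 1             # 1-bit mantissa field
    bias = 1
    if e == 0:
        mag = m / 2                        # subnormal: 0 or 1/2
    else:
        mag = (1 + m / 2) * 2 ** (e - bias)  # normal values
    return -mag if s else mag

# All 16 codes: magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6, each with both signs
values = [decode_e2m1(n) for n in range(16)]
```

Note that the sign bit alone distinguishes codes 0b0000 and 0b1000, which is where the +0 and −0 pair in the article's table comes from.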