December 29, 2025
Byte-sized banter, mega nostalgia
Show HN: Z80-μLM, a 'Conversational AI' That Fits in 40KB
Retro chip gets sassy: a 40KB chatbot makes the internet nostalgic and noisy
TLDR: A cheeky chatbot squeezed into 40KB on a 1976-era chip has the crowd buzzing. People want a simulator and Game Boy port, advice is flying on safe training data, and skeptics quibble over whether it’s “real AI,” proving retro charm can ignite very modern debates.
Retro fans are losing it over Z80-μLM, a tiny “conversational AI” squeezed into a 40KB program for a mid‑70s chip. It types out replies one character at a time and mostly says things like “OK” or “MAYBE,” and yes, that’s part of the charm. The vibe in the Show HN thread is equal parts giggles and nostalgia: “An LLM in a .com file? Haha made my day,” cheered one commenter, while another dreamed of a Game Boy version and the sweet glow of a green screen. The practical crowd showed up too, begging for a simulator so they can try it without soldering anything: “Would love to see a Z80 simulator,” one wrote.
Cue the mini‑drama: a few readers question calling it “AI” when it can’t hold a full chat, but supporters clap back that personality beats paragraphs on 4MHz hardware. Data‑policy nerds jump in with advice: use permissive large language models (LLMs) to create training data and “don’t stress about breaking TOS” (terms of service) or “C&D” (cease and desist). Meanwhile, homebrew builders flex, promising to run it on their own DIY Z80 boards. Verdict from the crowd: tiny tech, big vibes, and a retro experiment that’s way more fun than it has any right to be.
Key Points
- Z80-μLM is a ~40KB .COM conversational micro language model running on a 4MHz Z80 with 64KB RAM.
- It uses trigram hash encoding into 128 buckets and 2-bit weight quantization for compact inference.
- Inference is integer-only (16-bit) with fixed-point scaling and no floating point, fitting within CP/M’s Transient Program Area (TPA).
- Two examples, tinychat and guess, demonstrate terse conversational and 20 Questions behaviors.
- Architecture includes 128 query and 128 context buckets, configurable hidden layers, ReLU activations, and character-level autoregressive output.