March 1, 2026
Autocomplete or actual brain?
Microgpt explained interactively
Tiny DIY AI explainer thrills newbies, sparks “Name‑gate” and a “not beginner-friendly” brawl
TLDR: A 200‑line “build a mini ChatGPT” explainer wowed readers with interactive basics, but comments exploded over “Name‑gate,” beginner‑unfriendly writing, and whether guesswork can become reasoning. It matters because it shows how AI works under the hood while exposing the gap between simple demos and real‑world expectations.
Andrej Karpathy’s bite-size “build a mini ChatGPT in 200 lines” got an interactive tour, and the crowd came READY. The explainer walks through simple building blocks—turning letters into numbers, predicting the next character, and a softmax step that turns scores into probabilities—like an autocomplete on steroids. It even trains on baby names to show how a tiny model can invent new ones.
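That softmax step is less scary than it sounds. Here’s a minimal Python sketch of the idea (an illustration of the technique, not the article’s exact code):

```python
import math

def softmax(logits):
    # Subtract the max logit before exponentiating for numerical stability
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Scores for three candidate next characters -> probabilities summing to 1
probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])  # → [0.659, 0.242, 0.099]
```

The model then samples the next character from those probabilities, which is how “guessing letters” produces names it has never seen.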
But then came Name‑gate: one reader claims the “made‑up” names (like “kamon” and “anna”) are actually in the dataset, calling for receipts. Another big thread asks the existential question: is this toy demo just smart guessing, or can it really turn into the code‑debugging whiz we use today? As one put it, “How does guessing letters become reasoning?” Meanwhile, the tone police arrived with the meme‑hammer—calling parts of it “draw the rest of the owl,” and saying the “beginner” tag stretches the truth with long, mathy paragraphs. Fans fought back by praising the slick interactive bits and linking the original post, insisting it demystifies the magic.
So yes, it’s a tiny AI demo—but the real show is the comment arena: reasoning vs. autocomplete, beginner‑friendly vs. brain‑dump, and a sprinkle of name drama. Press X to softmax, folks.
Key Points
- The article interactively explains Karpathy’s ~200-line microgpt Python script that trains and runs a minimal GPT from scratch.
- Training uses a dataset of ~32,000 names, learning character-level patterns to generate plausible new names.
- A simple character-level tokenizer maps letters to integers and uses a BOS token; production systems use tokenizers like tiktoken for efficiency.
- The training task is next-token prediction over sequences using a sliding window of context and targets.
- Softmax converts logits to probabilities, demonstrated with code that normalizes after subtracting the maximum for stability.
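The tokenizer and next-token setup in the points above can be sketched in a few lines of Python. This is a hedged illustration: the BOS id of 0 and the window size are assumptions for the sketch, not necessarily what the original script uses.

```python
# Character-level tokenizer over a toy name list, with id 0 reserved for BOS
names = ["anna", "kamon"]
chars = sorted(set("".join(names)))
stoi = {ch: i + 1 for i, ch in enumerate(chars)}  # char -> integer id
itos = {i: ch for ch, i in stoi.items()}          # integer id -> char

BOS = 0  # beginning-of-sequence token (illustrative choice)

def encode(name):
    return [BOS] + [stoi[ch] for ch in name]

# Next-token prediction: each position's target is the token that follows,
# with a sliding window of at most `context_size` preceding tokens as context.
def training_pairs(name, context_size=3):
    ids = encode(name)
    pairs = []
    for i in range(1, len(ids)):
        context = ids[max(0, i - context_size):i]
        target = ids[i]
        pairs.append((context, target))
    return pairs

print(training_pairs("anna"))
# → [([0], 1), ([0, 1], 4), ([0, 1, 4], 4), ([1, 4, 4], 1)]
```

Training on ~32,000 real names instead of two toy strings is the same recipe at scale: every name becomes a handful of (context, target) pairs, and the model learns which characters tend to follow which.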