April 5, 2026
Hook, line, and chatbot
Show HN: I built a tiny LLM to demystify how language models work
Build‑your‑own fishy chatbot in 5 minutes has HN hooked
TLDR: GuppyLM is a tiny fish‑themed chatbot you can train in minutes, meant to show how language models work. The community is charmed by its simplicity and humor, split between praising it as a perfect teaching tool and teasing it as a cute toy that sparks memes more than breakthroughs.
Meet GuppyLM, the tiny “talks-like-a-fish” chatbot that trains in about five minutes and swims straight into your browser—and the crowd is losing it. The vibe: adorable, educational, and hilariously weird. Fans say it proves you don’t need a PhD or pricey gear to peek inside how chatbots work.
The top cheer comes from a philosophy flex: one commenter praised the project’s “nod to Nagel,” celebrating how limiting Guppy to a fish’s world makes its brain easy to grasp—no money, no politics, just bubbles, food, and vibes. Meanwhile, the jokes and memes are schooling in: “Call it DORY!” shouted one user, while another proposed an emoji‑only fish personality. The line everyone’s quoting? “you’re my favorite big shape. my mouth are happy when you’re here.” Instant catchphrase.
Of course, it’s not all smooth waters. A few skeptics argue this is a cute toy that won’t teach the messy parts of big‑league AI. But defenders clap back: that’s the point. A tiny, simple model that exposes every moving part is the best first swim. Others chimed in with their own DIY projects, including a Milton‑themed model, and folks poked around the synthetic dataset on Hugging Face. Verdict: Guppy isn’t deep, but it’s delightfully clear, and very, very memeable.
Key Points
- GuppyLM is an 8.7M-parameter, fish-themed language model created to teach how LLMs work via a minimal, transparent pipeline.
- The model is a vanilla 6-layer transformer (384-dim hidden state, 6 attention heads, 768-dim ReLU feed-forward) with a 4,096-token BPE vocabulary and a 128-token context window.
- It trains from scratch on 60K synthetic conversation samples spanning 60 topics (57K train / 3K test), generated via template composition.
- A pre-trained model and the dataset are hosted on Hugging Face; users can chat immediately or train their own in about 5 minutes on a Colab T4 GPU.
- The codebase covers data generation, tokenizer training, the training loop (cosine LR schedule, AMP), and inference, enabling end-to-end understanding.
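For readers who want to map the architecture bullet to code, here is a minimal PyTorch sketch of a transformer with those hyperparameters. The class and variable names are illustrative, not GuppyLM's actual code, and tying the output head to the token embedding is an assumption on our part; it happens to land the parameter count near the stated 8.7M.

```python
import torch
import torch.nn as nn

# Hyperparameters as listed in the write-up (names here are illustrative).
VOCAB_SIZE = 4096   # BPE vocabulary
D_MODEL = 384       # hidden size
N_HEADS = 6         # attention heads
N_LAYERS = 6        # transformer blocks
D_FFN = 768         # ReLU feed-forward width
CONTEXT = 128       # context window in tokens

class Block(nn.Module):
    """One pre-norm transformer block: self-attention + ReLU feed-forward."""
    def __init__(self):
        super().__init__()
        self.ln1 = nn.LayerNorm(D_MODEL)
        self.attn = nn.MultiheadAttention(D_MODEL, N_HEADS, batch_first=True)
        self.ln2 = nn.LayerNorm(D_MODEL)
        self.ffn = nn.Sequential(
            nn.Linear(D_MODEL, D_FFN), nn.ReLU(), nn.Linear(D_FFN, D_MODEL)
        )

    def forward(self, x, mask):
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + a
        return x + self.ffn(self.ln2(x))

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok = nn.Embedding(VOCAB_SIZE, D_MODEL)
        self.pos = nn.Embedding(CONTEXT, D_MODEL)  # learned positions (assumed)
        self.blocks = nn.ModuleList(Block() for _ in range(N_LAYERS))
        self.ln_f = nn.LayerNorm(D_MODEL)
        self.head = nn.Linear(D_MODEL, VOCAB_SIZE, bias=False)
        self.head.weight = self.tok.weight  # weight tying (assumption)

    def forward(self, idx):
        T = idx.size(1)
        # Causal mask: each position may only attend to itself and the past.
        mask = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
        x = self.tok(idx) + self.pos(torch.arange(T, device=idx.device))
        for b in self.blocks:
            x = b(x, mask)
        return self.head(self.ln_f(x))

model = TinyLM()
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")  # → 8.7M parameters
```

At this scale, a forward pass over a full 128-token batch is cheap enough to run on a laptop CPU, which is much of the project's teaching appeal.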
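The training-loop ingredients named above (cosine learning-rate decay and automatic mixed precision) can be sketched as follows. The model, batch contents, learning rate, and step count here are placeholders, not GuppyLM's actual values.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"  # AMP matters on a T4; falls back cleanly on CPU

# Stand-in next-token model; the real transformer would go here.
model = nn.Sequential(nn.Embedding(4096, 384), nn.Linear(384, 4096)).to(device)

opt = torch.optim.AdamW(model.parameters(), lr=3e-4)  # lr is a guess
steps = 10  # toy run; real training uses far more steps
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=steps)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
loss_fn = nn.CrossEntropyLoss()

for step in range(steps):
    # Random tokens stand in for batches of the synthetic conversations.
    x = torch.randint(0, 4096, (8, 128), device=device)
    y = torch.roll(x, -1, dims=1)  # shift left for next-token targets
    opt.zero_grad(set_to_none=True)
    with torch.amp.autocast(device_type=device, enabled=use_amp):
        logits = model(x)  # (batch, seq, vocab)
        loss = loss_fn(logits.reshape(-1, 4096), y.reshape(-1))
    scaler.scale(loss).backward()  # no-ops gracefully when AMP is off
    scaler.step(opt)
    scaler.update()
    sched.step()  # cosine decay of the learning rate toward zero

print(f"final loss {loss.item():.2f}, final lr {sched.get_last_lr()[0]:.1e}")
```

The point of keeping the loop this bare is pedagogical: every piece (optimizer, schedule, mixed precision, loss) is one visible line, which is exactly the "end-to-end understanding" the project advertises.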