June 11, 2026

Victorian Bot, Modern Comment War

Making a vintage LLM from scratch

Guy builds a Victorian chatbot for $80 and the comments instantly turn into a vibe-coding trial

TLDR: A developer built a chatbot trained on old books for about $80 and shared the whole thing online. Commenters loved the obsessive DIY spirit, but at least one quickly argued that admitting to “vibe-coding” with AI helpers made the “from scratch” brag a lot less romantic.

A hobby coder just dropped a delightfully weird project: a homemade language model trained on pre-1900 writing, with the goal of making a chatbot that basically talks like it escaped a Victorian library. The creator says they worked on it every single day, even while sick, built the data pipeline and training setup themselves, and kept the total cost to about $80 in rented GPU time. The model is small by modern standards, open source, and proudly unpolished, which means it may confidently make things up and say things that are true to its era but feel wildly offensive today. Yes, the creator knows. That’s part of the point.

But the real action is in the replies, where the mood swings from applause to side-eye in record time. One camp is cheering the sheer DIY energy: this is the kind of project people call the best way to actually learn, with one commenter comparing it to the legendary rite of passage of building Linux "from scratch" and begging for an even nerdier sequel. Another crowd is already hungry for the next episode, asking what happens when the bot gets polished into a more chat-friendly version.

Then comes the tiny burst of drama: the post admits the code was "semi-vibe-coded" with help from other AI tools, and one commenter basically slammed the brakes, saying they appreciated the honesty but felt that took some of the magic out of the journey. Translation: people love a garage-build story—until the garage has AI assistants in it. The result? A charming open-source experiment, plus a classic internet argument over what "from scratch" even means anymore.

Key Points

  • The article documents the creation of a 340M-parameter historical English LLM trained on old texts with a target knowledge cutoff of 1900.
  • The author says they built their own dataset, data-processing pipeline, base-training scripts, and fine-tuning scripts, while using existing software tools and libraries.
  • The model is based on the Llama architecture, published on Hugging Face, and supported by open-source code on GitHub.
  • Local data processing and smaller training runs were done on a Linux PC, while the 340M model was trained using RunPod, ThunderCompute, and Vast.ai.
  • The author reports about $80 in GPU costs and states the model is unaligned, may hallucinate, and may produce historically accurate but offensive outputs by modern standards.
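
For a rough sense of what "340M parameters" means in a Llama-style model, here is a minimal back-of-the-envelope sketch. The vocabulary size, hidden size, layer count, and feed-forward width below are hypothetical values picked only because they land near 340M; the post summarized here does not state the author's actual configuration. The sketch also assumes plain multi-head attention with no biases and tied input/output embeddings.

```python
def llama_param_count(vocab_size, hidden_size, num_layers,
                      intermediate_size, tie_embeddings=True):
    """Estimate the parameter count of a Llama-style decoder.

    Assumes no bias terms and full multi-head attention (no grouped-query
    attention), so the q, k, v, and o projections are each hidden x hidden.
    """
    embed = vocab_size * hidden_size              # token embedding table
    attn = 4 * hidden_size * hidden_size          # q, k, v, o projections
    mlp = 3 * hidden_size * intermediate_size     # gate, up, down projections
    norms = 2 * hidden_size                       # two RMSNorm weights per layer
    per_layer = attn + mlp + norms

    total = embed + num_layers * per_layer + hidden_size  # + final RMSNorm
    if not tie_embeddings:
        total += vocab_size * hidden_size         # separate output head
    return total


if __name__ == "__main__":
    # Hypothetical configuration, NOT taken from the author's repository.
    n = llama_param_count(vocab_size=32000, hidden_size=1024,
                          num_layers=24, intermediate_size=2816)
    print(f"{n:,} parameters (~{n / 1e6:.0f}M)")
```

With these made-up settings the estimate comes out around 341M, and most of it sits in the 24 transformer layers rather than the embedding table. Untying the embeddings would add roughly another 33M.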

Hottest takes

"There are certain things you can only truly learn by doing" — cyberge99
"I’m curious to see how it writes after instruct" — rxm
"I appreciate the honesty, but now there's no journey" — mg794613