March 30, 2026
Tea, crumpets, and chaos
Mr. Chatterbox is a Victorian-era ethically trained model
Victorian AI you can run at home: charming accent, little sense
TLDR: A tiny, laptop-friendly AI trained only on Victorian books is out, and it's more charming than useful. Commenters split between "it's too small" and "the old books made it weird," while others mock the "ethically trained" phrasing. Still, it's a working proof of concept for copyright-clean training data, and it sparks a bigger debate about data quality vs. ethics.
Meet Mr. Chatterbox, a tiny do-it-yourself AI trained only on 19th‑century British books—no modern internet, just dusty library vibes. Creator Trip Venturella dropped the model and a demo, and even Simon Willison summed it up as “pretty terrible… but fun.” And the crowd? They’re here for the spectacle.
The top reaction: "Is it me, or does this thing not make sense?" One tester joked they either don't speak Victorian or 340 million "brain cells" just isn't enough. Others fixated on the phrase "ethically trained." Many assumed it meant the model was trained to behave ethically; it actually means the training data is public-domain, so no copyright drama. Cue the "ethics bait-and-switch" memes.
Then came the nerdfight: small model vs old-timey data. One camp says the model is simply too tiny; another insists the vibe is set by the bookish training data: more Dickens, fewer direct answers. The "prior art police" crashed in with TimeCapsuleLLM, declaring this a growing genre: time-travel AIs that talk fancy but dodge questions.
Still, people love that it's about 2GB and runs on a laptop via Simon Willison's LLM command-line tool. It's steampunk Clippy: useless at email, delightful at exclaiming "Good day, sir!", and proof you can build a copyright-clean chatbot, even if it occasionally sounds like a very polite word salad.
Key Points
- Mr. Chatterbox is a 340M-parameter language model trained solely on 1837–1899 British Library texts (28,035 books; ~2.93B tokens).
- The model is available on Hugging Face (2.05GB) with a Hugging Face Spaces demo for testing.
- Simon Willison found the model's responses weak and not very useful, likening them to a Markov chain.
- Citing the Chinchilla scaling rule (roughly 20 training tokens per parameter), the article notes ~7B tokens may be needed for a 340M model, exceeding the available corpus; see the arithmetic sketch after this list.
- Willison built an LLM plugin (llm-mrchatterbox) to run the model locally using nanochat, with setup commands provided; a usage sketch follows below.
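For the curious, here's the back-of-the-envelope math behind that Chinchilla point, as a minimal Python sketch. The 20-tokens-per-parameter figure is the standard Chinchilla rule of thumb; the parameter and token counts come from the bullets above.

```python
# Back-of-the-envelope Chinchilla check: Hoffmann et al. (2022) suggest
# roughly 20 training tokens per model parameter for compute-optimal training.
params = 340e6          # Mr. Chatterbox's parameter count
tokens_per_param = 20   # Chinchilla rule of thumb
corpus_tokens = 2.93e9  # the 1837-1899 British Library corpus

optimal_tokens = params * tokens_per_param  # ~6.8e9, i.e. the "~7B" the article cites
print(f"Chinchilla-optimal tokens: {optimal_tokens / 1e9:.1f}B")            # 6.8B
print(f"Available corpus:          {corpus_tokens / 1e9:.2f}B")             # 2.93B
print(f"Data shortfall:            {optimal_tokens / corpus_tokens:.1f}x")  # ~2.3x
```

In other words, the corpus covers well under half the data a 340M model would ideally see, which supports the "it's too small / too under-fed" camp without settling the fight about the data's old-timey flavor.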
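And for anyone who'd rather poke it from Python than the command line, here's a minimal sketch using the LLM library's Python API. One assumption flagged inline: the exact model ID that llm-mrchatterbox registers isn't given here, so "mrchatterbox" is a guess; run `llm models` after installing to see the real one.

```python
# Prerequisite (run in a shell): install the plugin into LLM's environment:
#   llm install llm-mrchatterbox
import llm

# Assumption: the plugin registers the model under the ID "mrchatterbox".
# That ID is a guess; `llm models` lists whatever it actually exposes.
model = llm.get_model("mrchatterbox")

# Ask it something period-appropriate and brace for florid non-answers.
response = model.prompt("Good day, sir! What news of the telegraph?")
print(response.text())
```

Since the plugin runs the model locally via nanochat, there's no API key to configure: it's all on your laptop, dusty library vibes included.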