June 21, 2026

Household AI, comment-section chaos

Good results fine tuning a local LLM like Qwen 3:0.6B to categorize questions

Tiny home AI shocks commenters as fans cheer and skeptics yell “just use old-school tools”

TLDR: A developer trained a very small local AI to sort home-related questions into categories before searching family records, after the base version performed terribly on its own. Commenters split fast: some loved the tiny model’s speed and potential, while others argued this is exactly the kind of job old-school software already does better.

A hobby project about a mini AI sorting household questions somehow turned into a full-on comments-section showdown. The setup is charmingly domestic: one larger local model answers questions about things like pool repairs and doctor appointments, while a teeny tiny model is trained to tag each question into buckets like “pool,” “car,” or “heating.” The surprise? The smallest model, Qwen 3:0.6B, was apparently awful at first, getting only 13 of 131 test questions right (about 10%) without extra training, which immediately gave the crowd something to pounce on.

That’s where the drama kicked in. One camp was delighted that such a small model could have a real job, with one commenter practically cheering that Qwen 0.6B is “super fast” and has a clear niche if tuned properly. Another crowd was far less impressed, basically saying: why bring in AI theater for a boring sorting task when a tiny traditional text classifier could probably do the same job faster, cheaper, and in under a minute? That sparked the classic internet fight: cute experimental AI project vs. boring practical tool that may actually work better.

Then came the peanut gallery of power users, tossing in side quests like Gemma 3 270M, grammar-constrained outputs that lock the model to a fixed set of labels, and deeper training tricks. Even the model’s habit of making up categories became part of the fun: the machine was apparently freelancing labels like an overconfident intern. In other words, the article is about fine-tuning a local helper, but the comments are really about a much juicier question: is tiny AI genuinely useful, or are people re-solving a problem we already solved years ago?
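The invented-category habit can be contained even without constraining generation itself. A minimal sketch, assuming a closed label set and a hypothetical `normalize_label` helper (the helper name, the exact category list, and the `unknown` fallback are illustrative, not taken from the article), checks the model's raw text after the fact:

```python
# Closed set of categories. The names echo the examples in the write-up,
# but the exact list used in the project is an assumption.
ALLOWED_CATEGORIES = {"pool", "car", "heating", "hvac", "cooking"}
FALLBACK = "unknown"  # hypothetical label for anything outside the set


def normalize_label(raw_output: str) -> str:
    """Map raw model text to an allowed category, or to FALLBACK."""
    # Small models often wrap the label in quotes, change its casing,
    # or append punctuation, so clean those up before comparing.
    cleaned = raw_output.strip().lower().strip("\"'`. ")
    return cleaned if cleaned in ALLOWED_CATEGORIES else FALLBACK


print(normalize_label(' "Pool".\n'))  # cleaned to an allowed label
print(normalize_label("garden"))      # not in the set, so falls back
```

A check like this only catches a bad label after it has been generated; grammar-constrained decoding, as the commenters suggest, would stop the model from producing it in the first place.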

Key Points

  • The article describes a personal project to build a household chatbot that uses retrieval-augmented generation (RAG) and metadata-aware retrieval.
  • A preprocessing classifier maps user questions to categories such as pool, car, hvac, and cooking before vector search.
  • The experiment tests whether Qwen 3:0.6B can be fine-tuned into a reliable classifier for household-related questions.
  • The training setup uses Unsloth and an initial dataset of about 850 labeled examples split into train, evaluation, and test sets.
  • The baseline prompted version of Qwen 3:0.6B, with no fine-tuning, achieved 13 correct answers out of 131 integration tests, or about 10% accuracy.
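The classify-then-search flow in the points above can be sketched roughly as follows. Everything here is illustrative: the `Record` type, the keyword table standing in for the fine-tuned model, and the word-overlap ranking standing in for vector search are all assumptions. Only the 13-of-131 baseline figure comes from the article.

```python
from dataclasses import dataclass


@dataclass
class Record:
    text: str
    category: str  # metadata tag attached when the record is indexed


# Hypothetical keyword table standing in for the fine-tuned classifier.
KEYWORDS = {
    "pool": {"pool", "chlorine", "pump"},
    "car": {"car", "tire", "oil"},
    "hvac": {"furnace", "thermostat", "filter"},
}


def tokenize(text: str) -> set[str]:
    """Lowercase the words and strip basic punctuation."""
    return {word.strip("?.,!").lower() for word in text.split()}


def classify(question: str) -> str:
    """Return the category whose keywords overlap the question most."""
    words = tokenize(question)
    best = max(KEYWORDS, key=lambda cat: len(KEYWORDS[cat] & words))
    return best if KEYWORDS[best] & words else "unknown"


def retrieve(question: str, records: list[Record], top_k: int = 2) -> list[Record]:
    """Narrow records by predicted category, then rank by word overlap."""
    category = classify(question)
    # If the classifier cannot decide, search every record instead.
    candidates = [
        r for r in records if category == "unknown" or r.category == category
    ]
    words = tokenize(question)
    ranked = sorted(
        candidates, key=lambda r: len(tokenize(r.text) & words), reverse=True
    )
    return ranked[:top_k]


def accuracy(correct: int, total: int) -> float:
    """Fraction of test questions labeled correctly."""
    return correct / total


records = [
    Record("Replaced the pool pump in May", "pool"),
    Record("Oil change for the car in March", "car"),
    Record("Pool chlorine delivery is monthly", "pool"),
]
hits = retrieve("When was the pool pump replaced?", records)
baseline = accuracy(13, 131)
print([r.text for r in hits])
print(f"baseline accuracy: {baseline:.1%}")
```

The category filter runs before ranking, so a wrong label hides the right record entirely. That is why classifier accuracy matters so much in a setup like this.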

Hottest takes

"Scikit Learn with a SGDClassifier on 2-grams will do probably just as well" — nl
"Qwen 0.6B is so cool... super fast" — jszymborski
"The model invents new categories" — nextaccountic
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.