Ask HN: Has anyone replaced Claude/GPT with a local model for daily coding?

Offline AI coding is tempting, but the comments say speed still wins

TLDR: A developer says offline AI can already help with daily coding, giving a real speed boost without sending private data to the cloud. But the comments turn it into a showdown: people love the privacy dream, yet many say local tools are still slower, fussier, and nowhere near as effortless as the big online services.

One developer strolled into the chat with a very bold claim: yes, they’ve actually swapped the big online AI coding helpers for a fully offline setup running on their own machines. The pitch is irresistible even to non-nerds: more privacy, no usage fees, and still a serious productivity boost. But the twist? Even the original poster admits the local assistant behaves less like a wise expert and more like an eager junior coworker who needs very clear instructions; without them, it starts taking messy shortcuts and going in circles.

That confession lit up the comments, where the mood was basically: love the dream, doubt the current reality. One camp said local models are still just too slow compared to cloud tools, with complaints that setting them up is a pain and choosing the right one feels like guesswork. Another commenter flexed hard with a monster dual-GPU rig and called their setup “blazing fast,” which brought the classic internet energy of: cool story, but that’s not exactly normal-person hardware. Then came the comedy gold: one user described leaving giant models running overnight for a single coding task at a glacial pace, while saying online tools handle it “like it’s nothing.” Ouch.

The hottest takeaway from the crowd is that privacy and freedom are winning hearts, but speed and convenience are still winning workflows. Even the practical advice had a chaotic vibe: try a model marketplace, wait for ds4 to mature, or accept that everyone’s needs are different. In other words, local AI is the scrappy underdog everyone wants to root for, but the comments are not ready to crown it king just yet.

Key Points

  • The developer replaced cloud coding models with a fully offline local setup to prioritize data privacy and avoid usage costs.
  • The setup runs the Pi coding harness in a containerized, sandboxed environment on a Mac Studio with 128GB RAM or a MacBook with 36GB RAM.
  • Qwen3.6 35b is the primary model used for coding, while Qwen3.5 122b is used for more complex tasks but is significantly slower.
  • The local workflow was used to complete a redesign of a website homepage and blog built with Django and Wagtail.
  • The author reports that local Qwen models require more precise prompting, can loop or mishandle edit calls, and deliver an estimated 5x productivity speedup, compared with an estimated 15x for Claude Opus.

Hottest takes

"blazing fast but it’s mostly habit that keeps me with CC and Code" — arjie
"the cloud models all play with that like its nothing" — HappySweeney
"Every person have different needs and expectations" — kertoip_1