April 5, 2026

Code, controversy, and Claude cosplay

Nanocode: The best Claude Code that $200 can buy in pure JAX on TPUs

$200 DIY “Claude Code” sparks big brain, bigger drama vibes

TLDR: Nanocode claims you can train a mini coding assistant for about $200 using Google’s chips, borrowing a simple method from bigger AI labs. The comments erupted over cost, a possibly wrong demo answer, and confusion about whether “Claude Code” is even trainable—making this both a how-to and a hot debate.

A scrappy open-source project called nanocode promises a DIY coding assistant you can train yourself for about $200 on Google's cloud chips. The pitch: follow a "teach-it-your-values" recipe, add code-heavy data, and boom—your own mini assistant. But the comments turned this into a full-on tech soap opera. The top vibe? "Why pay when free models exist?" One skeptic asked why anyone would drop $200 when plenty of coding bots don't cost a dime, and that question echoed loudly through the thread.

Then came the nitpicks. A sharp-eyed reader flagged that the demo answer for a “don’t make a new list” Python task… used a new list. Oops. Meanwhile, a terminology brawl erupted: what does “train your own Claude Code” even mean? One commenter insisted Claude Code is just a “harness” (a wrapper that calls other models), not something you literally train. Cue confusion, clarifications, and a link reminding everyone not to mix it up with nanocoder.
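The flagged demo isn't reproduced in the post, but the distinction the commenter raised is a classic one. As a hedged illustration (a hypothetical reversal task, not the actual demo code), here is what "don't make a new list" does and doesn't allow:

```python
def reverse_in_place(items: list) -> None:
    """Reverse `items` without allocating a new list (two-pointer swap)."""
    left, right = 0, len(items) - 1
    while left < right:
        items[left], items[right] = items[right], items[left]
        left += 1
        right -= 1


def reverse_with_new_list(items: list) -> list:
    """By contrast, slicing allocates a fresh list -- exactly the kind of
    answer that violates an 'in place, no new list' constraint."""
    return items[::-1]


nums = [1, 2, 3, 4]
reverse_in_place(nums)  # mutates nums itself; nums is now [4, 3, 2, 1]
```

The subtlety: `items[::-1]` returns a correct result, so a model's answer can look right while silently ignoring the stated constraint.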

Still, not all snark: a newcomer praised the write-up as “digestible,” and fans loved the Karpathy-inspired simplicity and TPU-friendly price tag. Bottom line: the project’s bold, the demo’s wobbly, and the crowd’s split between “awesome hands-on guide” and “marketing buzzword bingo.”

Key Points

  • nanocode is an open-source JAX-based library to train a Claude Code–style coding assistant using a simplified Constitutional AI approach, adapted from Karpathy’s nanochat.
  • The project is optimized for TPUs and provides a low-cost reproduction path: ~9 hours/$200 for a 1.3B model (nanocode-d24) on a TPU v6e-8, and ~1.5 hours/$34 for a 477M model (nanocode-d20).
  • Users can leverage Google’s TRC program for free pre-emptible TPUs and Google Cloud credits to reduce costs; NVIDIA GPUs are supported but TPUs are preferred.
  • Training mirrors nanochat’s workflow, with The Stack-V2 code data mixed in at a 1:5 ratio during pre-training and in the tokenizer training mixture; scripts are provided for dataset downloads and tokenizer training/evaluation.
  • Tokenizer comparisons show significant gains for code tokenization (about 50.9% improvement) with some trade-offs in general text categories.

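Nanocode's actual data pipeline isn't shown in the post, but the 1:5 code-to-text mixing idea is easy to sketch. Below is a minimal, illustrative sampler (function name, arguments, and ratio handling are assumptions, not nanocode's API) that interleaves two document streams so roughly one in six yielded documents is code:

```python
import random


def mix_streams(code_docs, text_docs, code_fraction=1 / 6, seed=0):
    """Yield training documents with ~1 code doc per 5 text docs (a 1:5 mix).

    `code_docs` / `text_docs` are any iterables of documents. A 1:5
    code:text ratio means code makes up 1/6 of the mixed stream.
    Stops when either stream runs out.
    """
    rng = random.Random(seed)
    code_it, text_it = iter(code_docs), iter(text_docs)
    while True:
        source = code_it if rng.random() < code_fraction else text_it
        try:
            yield next(source)
        except StopIteration:
            return
```

In a real pre-training setup the same idea is usually applied at the shard or token level rather than per document, but the ratio arithmetic is identical.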
Hottest takes

"Why would people want to spend $200 to train a coding model when there are free coding models?" — bdbdbdb
"Is the generated python code in the example wrong?" — wwfn
"It cannot be 'trained'." — vova_hn2
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.