MyTorch – Minimalist autograd in 450 lines of Python

Tiny DIY “MyTorch” drops, instantly sparks “Micrograd did it better” showdown

TLDR: A tiny DIY tool, MyTorch, recreates automatic differentiation (computing derivatives for you) in about 450 lines of Python, wowing learners while drawing instant comparisons to Karpathy’s popular “micrograd.” The comments are split between praising it as a teachable mini‑engine and dismissing it as wheel‑reinventing, with a cryptic “HmcKk” comment adding meme‑fuel to the debate.

A bite‑size math engine called MyTorch just dropped—only 450 lines of Python—and the internet did what it does best: compare it to the nearest celebrity cousin. MyTorch promises automatic math “slopes” (derivatives) like the big leagues, mimicking PyTorch’s style while using plain old NumPy under the hood. It even flexes higher‑order derivatives without extra setup and hints it could someday hit GPUs. That’s neat… but the crowd’s mood turned fast.
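For readers who want to see what “automatic slopes” looks like in code, here is the same idea expressed in stock PyTorch, which MyTorch reportedly imitates; the exact MyTorch calls may differ, so treat this as a reference point rather than its documented API. Note that plain PyTorch needs create_graph=True to take a second derivative, which is exactly the flag the post claims MyTorch can skip.

```python
import torch

# y = x**3, so dy/dx = 3*x**2 and d2y/dx2 = 6*x.
x = torch.tensor(4.0, requires_grad=True)
y = x ** 3
y.backward()
print(x.grad)  # tensor(48.) == 3 * 4**2

# Higher-order derivative in stock PyTorch: keep the graph alive with create_graph=True.
x = torch.tensor(4.0, requires_grad=True)
y = x ** 3
(dy_dx,) = torch.autograd.grad(y, x, create_graph=True)
(d2y_dx2,) = torch.autograd.grad(dy_dx, x)
print(dy_dx, d2y_dx2)  # 48.0 and 24.0 == 6 * 4
```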

The top vibe? “Karpathy already did this.” One commenter fired off the killer line: “micrograd did it first (and better),” pointing newbies to Karpathy’s course. Cue the classic split: Team Learning Project cheering “great for understanding how the magic works,” versus Team Why Bother rolling eyes at yet another remake. The author’s own aside—rewriting it in low‑level code would be “interesting (but useless)”—became a punchline, with readers leaning into the “useless but fun” energy. And then there’s the cryptic “HmcKk” comment, which instantly read like a secret code name for a new optimizer or just a keyboard face‑plant.

In short: MyTorch is a tidy teaching toy that reignited the eternal debate—build it yourself to learn, or stop reinventing wheels and just use the famous one. Drama served with derivatives, anyone?

Key Points

  • mytorch is a minimalist autograd library (~450 lines of Python) with a PyTorch-like API.
  • It uses NumPy for computations and implements graph-based reverse-mode autodiff similar to PyTorch (a toy sketch of the idea appears after this list).
  • It supports arbitrarily high-order derivatives for scalars and non-scalars.
  • Both torch.autograd.backward and torch.autograd.grad are supported; higher-order derivatives do not require create_graph=True.
  • Examples demonstrate scalar and broadcasting gradient calculations, with suggested future extensions (torch.nn, GPU via CuPy/Numba, low-level BLAS rewrite).
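
To make “graph-based reverse-mode autodiff” concrete, here is a minimal sketch of the technique over NumPy. This is not mytorch’s actual source, and the Tensor class below is hypothetical; it handles only first-order gradients for add and multiply, whereas mytorch’s higher-order support would additionally require the backward pass itself to build new graph nodes. The core idea: every operation records its inputs plus a small closure, and backward() replays those closures in reverse topological order, which is the chain rule in code.

```python
import numpy as np

class Tensor:
    """Toy autodiff node (hypothetical; not mytorch's real class)."""
    def __init__(self, data, parents=()):
        self.data = np.asarray(data, dtype=float)
        self.grad = np.zeros_like(self.data)
        self.parents = parents       # nodes this value was computed from
        self.backward_fn = None      # closure that pushes self.grad to the parents

    def __add__(self, other):
        out = Tensor(self.data + other.data, parents=(self, other))
        def backward_fn():
            self.grad += out.grad            # d(a+b)/da = 1
            other.grad += out.grad           # d(a+b)/db = 1
        out.backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Tensor(self.data * other.data, parents=(self, other))
        def backward_fn():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad   # d(a*b)/db = a
        out.backward_fn = backward_fn
        return out

    def backward(self):
        # Reverse mode: topologically sort the graph, then run each node's
        # backward closure from the output back toward the leaves.
        order, seen = [], set()
        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node.parents:
                    visit(parent)
                order.append(node)
        visit(self)
        self.grad = np.ones_like(self.data)      # d(out)/d(out) = 1
        for node in reversed(order):
            if node.backward_fn is not None:
                node.backward_fn()

# z = x*y + x  =>  dz/dx = y + 1,  dz/dy = x
x, y = Tensor(2.0), Tensor(3.0)
z = x * y + x
z.backward()
print(x.grad, y.grad)   # 4.0 2.0
```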

Hottest takes

HmcKk — jjzkkj
Karpathy’s micrograd did it first (and better) — jerkstate