Apple Silicon costs LESS than OpenRouter

Turns out the fancy Mac may beat the AI middleman—and commenters are fighting over the title

TLDR: A new cost breakdown says a high-end Apple laptop can be slightly cheaper than paying an AI service, and in some cases much cheaper. Commenters, however, were obsessed with the bigger scandal: whether the headline got flipped, turned into clickbait, or restarted yesterday’s exact same argument.

A humble laptop cost comparison somehow turned into a full-on comment-section cage match. The big claim: if you run artificial intelligence tools locally on a high-end Apple laptop, it can actually be cheaper than paying a service like OpenRouter, especially when you count realistic usage (the mix of input and output, batching, and concurrent requests), reuse of repeated prompts, and the fact that a MacBook can later be resold. In the author’s math, the gap is small for one model (about $0.14 versus $0.16 per million tokens) but dramatic for another, up to roughly 3x cheaper in the best case.

But the community barely made it past the headline before the drama started. One commenter flatly declared, “You managed to reverse the title somehow,” while another fired off the internet’s favorite accusation: “clickbait.” That set the tone fast. The original poster jumped in to clarify that the title was supposed to say Apple Silicon costs less than OpenRouter, suggesting the wording may have been altered because it referenced an earlier Hacker News debate. In other words: the spreadsheet war instantly became a headline war.

Then came the smug-but-not-wrong energy. One commenter shrugged that of course Apple’s chips are cheaper—they’re also cheaper than renting cloud computers for general work. Another dropped a link to a very similar thread from the day before, giving the whole thing a faint “we’re doing this again?” vibe. So yes, there’s a real pricing argument here—but the real spectacle was everyone arguing over whether the post proved something shocking, obvious, or just badly titled.

Key Points

  • The article argues that a fairer local-versus-hosted LLM cost comparison should include input/output token mix, batching, concurrency, caching, and hardware residual value.
  • A benchmark on an M4 Max 128GB system for Gemma 4 31B reported 157.3 total tokens per second and led to an estimated blended local cost of about $0.14 per million tokens versus about $0.16 on OpenRouter.
  • For Gemma 4 31B, the article lists local cost scenarios of about $0.15, $0.14, and $0.13 per million tokens over 3, 5, and 7 years respectively.
  • A second benchmark for Gemma 4 26B MoE reported 580.72 total tokens per second and an estimated blended local cost of about $0.038 per million tokens versus about $0.10 on OpenRouter.
  • The article says local LLM inference could become more important due to GPU supply constraints and also offers privacy benefits.
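For readers who want to poke at the arithmetic, here is a minimal sketch of how an amortized local cost per million tokens could be computed. It is not the article's actual spreadsheet: the function names, the utilization and electricity parameters, and every input in the example call are assumptions chosen for illustration. Only the throughput (157.3 tokens per second) and the per-million prices in the ratio check come from the key points above. Effects like batching, concurrency, and caching are assumed to be already baked into the measured throughput.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def local_cost_per_million_tokens(
    hardware_price,        # purchase price in dollars (hypothetical input)
    residual_value,        # expected resale value in dollars (hypothetical input)
    years,                 # amortization period
    tokens_per_second,     # measured total throughput
    utilization=1.0,       # fraction of time the machine is serving (assumption)
    power_watts=0.0,       # average draw under load (assumption)
    price_per_kwh=0.0,     # electricity price in dollars (assumption)
):
    """Amortized dollars per one million tokens for a locally run model."""
    active_seconds = years * SECONDS_PER_YEAR * utilization
    total_tokens = tokens_per_second * active_seconds
    depreciation = hardware_price - residual_value
    energy_kwh = (power_watts / 1000) * (active_seconds / 3600)
    total_cost = depreciation + energy_kwh * price_per_kwh
    return total_cost / total_tokens * 1_000_000


def times_cheaper(hosted_price, local_price):
    """How many times cheaper the local option is than the hosted one."""
    return hosted_price / local_price


if __name__ == "__main__":
    # Hypothetical hardware numbers; only the 157.3 tok/s figure is from the article.
    for years in (3, 5, 7):
        cost = local_cost_per_million_tokens(
            hardware_price=5000, residual_value=1000, years=years,
            tokens_per_second=157.3, power_watts=100, price_per_kwh=0.15,
        )
        print(f"{years} years: ${cost:.3f} per million tokens")

    # Ratios from the per-million prices quoted in the key points.
    print(f"Gemma 4 31B: {times_cheaper(0.16, 0.14):.2f}x")
    print(f"Gemma 4 26B MoE: {times_cheaper(0.10, 0.038):.2f}x")
```

The sketch shows why the amortization period matters: a longer period spreads the same depreciation over more tokens, so the per-token cost falls, while the electricity share stays constant per token. The results are very sensitive to the utilization assumption, since a machine that sits idle most of the day produces far fewer tokens for the same hardware outlay.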

Hottest Takes

"reverse the title somehow" — gavinsyancey
"clickbait" — est
"It is also cheaper than EC2 for general compute" — dnnddidiej