December 4, 2025
100T tokens, 1000 hot takes
State of AI: An Empirical 100T Token Study with OpenRouter
100T Tokens Spill the Tea: Roleplay Rules, Small Models Vanish, and “Who Uses Grok?”
TLDR: A huge OpenRouter study of 100 trillion tokens shows “reasoning” models are rising and open‑source models are heavily used for roleplay. Commenters praise the data but debate what’s actually being counted, argue small models may be hiding off‑platform via self‑hosting, and roast “Grok Code” with jokes about who uses it.
The internet is buzzing over a new 100 trillion–token usage study from OpenRouter, the hub that routes your AI chats to tons of different models. The big headline: after OpenAI’s o1 “reasoning” model (aka Strawberry) changed the game with step-by-step thinking, people’s real-world habits followed—and the data shows it.
But the comments are the main event. One camp is cheering: “Amazing data!” says typs. Another is grilling the methodology. themanmaran demands clarity on the “reasoning vs. non-reasoning” stat—are we counting secret “thinking” tokens, or just what users send and see? Translation for non-nerds: do the totals include the model’s private brain chatter, or only the words you typed and the answers you got? This matters, because those hidden steps can balloon costs.
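To make that cost concern concrete, here is a toy calculation. The prices and token counts below are invented for illustration, not OpenRouter’s actual rates; the only real point is that reasoning tokens are typically billed as output tokens even though the user never reads them.

```python
# Toy illustration of why hidden "thinking" tokens matter for cost.
# All numbers are hypothetical, not any provider's actual pricing.

PRICE_PER_1K_OUTPUT = 0.01  # hypothetical $ per 1K output tokens

def request_cost(visible_output_tokens: int, reasoning_tokens: int) -> float:
    """Reasoning tokens are usually billed as output tokens,
    even though they never appear in the visible answer."""
    billed = visible_output_tokens + reasoning_tokens
    return billed / 1000 * PRICE_PER_1K_OUTPUT

# A short visible answer backed by a long private chain of thought:
visible = 200    # tokens the user actually sees
hidden = 3_800   # hypothetical internal deliberation tokens

print(f"billed:       ${request_cost(visible, hidden):.4f}")
print(f"visible-only: ${request_cost(visible, 0):.4f}")
# In this made-up example, the hidden tokens account for 95% of the bill,
# which is exactly why "what gets counted" changes the study's totals.
```

So whether the 100T figure includes those private tokens isn’t a pedantic quibble: it can shift both the token totals and the apparent cost of “reasoning” traffic by an order of magnitude.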
Then comes the shocker: syspec highlights that 52% of open‑source model use is for roleplaying, supposedly thanks to looser filters and higher “creativity.” The thread oscillates between “lol that tracks” and “is OpenRouter just selecting for spicy users?” Meanwhile, lukev throws a wrench: the study shows small models are losing share, but he argues they might simply be self‑hosted off-platform, so they don’t show up in this data. And for comic relief, asadm deadpans: “Who is using Grok Code and why?”
Verdict: groundbreaking dataset, messy human reality. Reasoning models soar, roleplay reigns, and the small-model crowd says “check your sample.”
Key Points
- Before late 2024, state-of-the-art LLMs primarily used single-pass autoregressive inference, with reasoning approximated via instruction following and tool use.
- OpenAI’s o1 (Strawberry) introduced multi-step inference with internal deliberation, planning, and refinement, improving reasoning and decision-making.
- The study analyzes a 100 trillion token dataset from OpenRouter to provide large-scale evidence of real-world LLM usage.
- Methodology includes categorizing tasks and models, and examining how model selection varies by region, time, pricing, and model launches.
- Analyses focus on open vs. closed-source adoption, emergence of agentic (tool-assisted) inference, and a task category taxonomy (e.g., programming, roleplay, translation).
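The aggregation behind claims like “52% of open-source use is roleplay” can be sketched with a few lines of Python. The records, family labels, and category names below are made up for illustration; they are not the paper’s data or schema.

```python
# Hypothetical usage records: (model_family, task_category, tokens).
# Everything here is invented to illustrate the shape of the analysis.
records = [
    ("open",   "roleplay",    5_200),
    ("open",   "programming", 2_100),
    ("open",   "roleplay",    3_900),
    ("closed", "programming", 8_000),
    ("closed", "translation", 1_500),
]

def token_share(records, family: str, category: str) -> float:
    """Fraction of a model family's tokens spent on one task category."""
    family_tokens = sum(t for f, _, t in records if f == family)
    category_tokens = sum(t for f, c, t in records if f == family and c == category)
    return category_tokens / family_tokens

share = token_share(records, "open", "roleplay")
print(f"open-source roleplay share: {share:.0%}")
# → open-source roleplay share: 81%  (for these made-up records)
```

This also shows why lukev’s sampling objection bites: if self-hosted small-model traffic never produces a record in the first place, no amount of careful aggregation downstream can recover it.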