November 7, 2025
1000 tokens, 1000 takes
Cerebras Code now supports GLM 4.6 at 1000 tokens/sec
Cerebras touts ‘1000 tokens/sec’ coding—commenters ask if it’s real, worth $50, or just vibes
TLDR: Cerebras says its coding AI runs GLM‑4.6 at over 1,000 tokens per second, with plans from free to $200. The comments demand proof, question pricing and hidden tricks, spin up a SWE‑1.5 conspiracy, and meme the launch—asking if speed alone is worth $50 and whether quality keeps up.
Cerebras came sprinting into the chat claiming its code AI now runs GLM‑4.6 at “1,000+ tokens per second,” pitching it as the fastest way to code and pairing it with a Free tier, a $50 Pro plan, and a $200 Max plan. It even flexed fresh funding with a Series G raise. GLM‑4.6 is billed as top-tier—“#1 for tool calling” and comparable to Sonnet 4.5—but the internet’s reaction? Speed hype meets trust issues.
Skeptics immediately poked holes: is that just the rate at which it spits out text, not how fast it thinks? One user asked whether Cerebras is using “speculative decoding” (a speed trick where a small draft model guesses several tokens ahead and the big model checks them) or lossy quantization (storing the model’s weights at lower numeric precision) to hit those numbers. Another dragged the pricing: at $50/month, this had better be lightning, especially when rivals like Claude and ChatGPT are cheaper. The Groq comparison came up fast, along with the classic: “We have no way to prove it.”
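The "streaming rate versus real speed" objection can be put in numbers. Here is a minimal sketch, assuming the advertised 1,000 tokens/sec is a steady decode rate and using a made-up delay before the first token (the source gives no such figure; the function names and the 1.0 second value are purely illustrative):

```python
def end_to_end_seconds(output_tokens: int,
                       tokens_per_sec: float,
                       first_token_delay_s: float) -> float:
    """Wall-clock time for one response: wait for the first token,
    then stream the remaining output at the decode rate."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return first_token_delay_s + output_tokens / tokens_per_sec


def effective_tokens_per_sec(output_tokens: int,
                             tokens_per_sec: float,
                             first_token_delay_s: float) -> float:
    """Throughput the user actually experiences, delay included."""
    total = end_to_end_seconds(output_tokens, tokens_per_sec, first_token_delay_s)
    return output_tokens / total


if __name__ == "__main__":
    # Hypothetical request: 500 output tokens, 1.0 s before the first token.
    print(end_to_end_seconds(500, 1000.0, 1.0))
    print(effective_tokens_per_sec(500, 1000.0, 1.0))
```

With those illustrative inputs, a short 500-token answer takes 1.5 seconds end to end, so the experienced rate is roughly a third of the headline number. The shorter the response, the more any startup delay dominates.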
Then came the detective subplot: a claim that Cognition’s SWE‑1.5 might be a GLM‑4.6 finetune, sending model-spotters into conspiracy mode. And the meme machine revved up with the instant classic: “Vibe Slopping at 1000 tokens per second.” Meanwhile, fans like the bring-your-own-editor support (Cline, RooCode, and more) and the idea of “staying in flow.” But the room’s energy? Prove it, price it right, and don’t just go fast: be good.
Key Points
- Cerebras now runs GLM‑4.6 for code generation, advertising 1,000+ tokens per second.
- GLM‑4.6 is claimed to be #1 for tool calling on the Berkeley Function Calling Leaderboard and comparable to Sonnet 4.5 for web development.
- Cerebras Code Pro supports a BYO editor approach, with compatibility listed for Cline, RooCode, OpenCode, and Crush.
- Pricing tiers: Free ($0, limited usage), Pro ($50, up to 24M tokens/day), Max ($200, up to 120M tokens/day).
- The page links to a press release noting Cerebras’ $1.1B Series G at an $8.1B valuation and provides the company’s Sunnyvale, CA address.
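For the "is it worth $50" question, the pricing tiers above can be turned into rough arithmetic. This is a back-of-the-envelope sketch under loud assumptions that are not in the source: the daily cap is used in full every day, a month is 30 days, the $50 and $200 prices are monthly, and generation runs as one stream at exactly 1,000 tokens/sec. Whether the cap counts input tokens as well as output is not stated.

```python
# Plan figures from the key points: (price in USD, daily token cap).
PLANS = {
    "Pro": (50, 24_000_000),
    "Max": (200, 120_000_000),
}


def hours_to_reach_cap(daily_cap_tokens: int, tokens_per_sec: float = 1000.0) -> float:
    """Hours of continuous single-stream generation needed to hit the daily cap."""
    return daily_cap_tokens / tokens_per_sec / 3600


def price_per_million_at_full_use(monthly_price_usd: float,
                                  daily_cap_tokens: int,
                                  days_per_month: int = 30) -> float:
    """USD per million tokens if the cap were fully used every day (assumption)."""
    monthly_millions = daily_cap_tokens * days_per_month / 1_000_000
    return monthly_price_usd / monthly_millions


if __name__ == "__main__":
    for name, (price, cap) in PLANS.items():
        print(name,
              round(hours_to_reach_cap(cap), 1), "hours to cap,",
              round(price_per_million_at_full_use(price, cap), 3), "USD per 1M tokens")
```

Under these assumptions the Pro cap works out to about 6.7 hours of nonstop generation per day, while the Max cap would take about 33 hours, more than a day, so a single stream at exactly 1,000 tokens/sec could not exhaust it. The per-million price only looks low if the cap is actually used, which is the crux of the commenters' pricing complaint.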