GPT‑5.3‑Codex‑Spark

OpenAI’s turbo coder drops — fans cheer speed, skeptics want quality and price answers

TLDR: OpenAI launched an ultra-fast coding model preview that replies almost instantly, while also speeding up its whole serving stack. Commenters are split between cheering the speed race with Anthropic and demanding answers on price, accuracy, and why the top model isn’t faster—memes and meta‑banter included.

OpenAI just unveiled GPT‑5.3‑Codex‑Spark, a smaller, real-time coding sidekick built to feel instant — think more than 1,000 tokens per second. The community immediately turned it into a cage match: throwup238 yelled “Your move, Anthropic,” framing this as the latest round of speed one‑upmanship after Claude’s “/fast.” Others celebrated the end of the eternal compile wait, memeing the classic xkcd 303 as if coffee breaks are canceled.

But the party has a price tag mystery. OsrsNeedsf2P asked the question everyone’s thinking: is faster going to cost more, especially if accuracy dips? The spiciest pushback: behnamoh says OpenAI “solved the wrong problem,” arguing the best model is painfully slow at peak times — they want the top‑tier brain to be fast, not a smaller sprinter. Meanwhile, devs are poking at details: it’s a Pro‑only preview, text‑only, with a 128k context window, minimal auto‑changes, separate rate limits, and possible queues when crowds surge.

Under the hood drama: OpenAI claims a persistent WebSocket and stack rewrites chopped latency across the board, promising faster first words for all models soon — a sleeper win that some say matters more than Spark itself. Bonus gossip: mudkipdev wonders how one user posts HN threads at lightspeed, and cjbarber jokes about multitabling bots while you wait. Internet popcorn secured.

Key Points

  • OpenAI released a research preview of GPT‑5.3‑Codex‑Spark, a smaller model for real‑time coding, delivering over 1,000 tokens per second.
  • Codex‑Spark launches text‑only with a 128k context window and a lightweight, targeted‑edit workflow for interactive development.
  • On benchmarks (SWE‑Bench Pro, Terminal‑Bench 2.0), Codex‑Spark shows strong agentic software engineering performance while completing tasks faster than GPT‑5.3‑Codex.
  • OpenAI implemented end‑to‑end latency reductions (80% lower roundtrip overhead, 30% lower per‑token latency, 50% lower time‑to‑first‑token) via a persistent WebSocket path and Responses API optimizations.
  • Codex‑Spark runs on Cerebras’ Wafer Scale Engine 3; it is available to ChatGPT Pro users with separate rate limits and may queue during high demand.
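To see why both numbers matter, the throughput and latency figures above can be combined into a rough end‑to‑end estimate: total response time is time‑to‑first‑token plus streaming time. The baseline figures below (1.0 s TTFT, 250 tokens/s) are hypothetical illustrations, not published numbers; only the 50% TTFT reduction and the >1,000 tokens/s rate come from the post.

```python
def response_time(ttft_s: float, tokens: int, tokens_per_s: float) -> float:
    """Rough end-to-end latency: time to first token plus decode time."""
    return ttft_s + tokens / tokens_per_s

# Hypothetical baseline: 1.0 s TTFT, 250 tokens/s, 2,000-token reply.
baseline = response_time(1.0, 2000, 250)    # 9.0 s
# Spark-style figures from the post: 50% lower TTFT, >1,000 tokens/s.
spark = response_time(0.5, 2000, 1000)      # 2.5 s
print(f"baseline: {baseline:.1f}s, spark: {spark:.1f}s")
```

Under these (made-up) baseline assumptions, the same reply lands in roughly a third of the time, which is why commenters call the stack-wide latency work a sleeper win: it shrinks the TTFT term for every model, not just Spark.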

Hottest takes

"Your move, Anthropic." — throwup238
"I want a faster, better model" — behnamoh
"No hint on pricing." — OsrsNeedsf2P
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.