Subquadratic – Introducing SubQ 1.1 Small

AI startup says its model can read giant files fast — commenters want receipts

TLDR: Subquadratic says its new AI can read huge document piles far faster and cheaper than usual, which could matter for businesses drowning in paperwork and code. Commenters weren’t focused on the win so much as the mystery: exciting numbers, yes — but where are the details, and should anyone trust a black box?

Subquadratic rolled in with a big promise: its new AI model, SubQ 1.1 Small, can handle enormous amounts of text (think whole code projects, giant contract stacks, or mountains of financial paperwork) without melting the budget. The company says it can search through up to 12 million tokens of text with near-perfect accuracy on Needle-In-A-Haystack retrieval tests, while using dramatically less computing power than standard dense attention and running a single attention layer much faster than FlashAttention-2. In plain English: they’re pitching a cheaper, speedier way for AI to read the entire pile instead of chopping it into little pieces first.

But the real action was in the comments, where the vibe quickly turned into “cool story, now show us how it works.” One camp was impressed by the dream here: if this kind of efficiency is real, maybe powerful AI tools finally stop costing a fortune. One commenter basically said the future isn’t just smarter models — it’s getting today’s best ones to be way cheaper. The other camp was much louder and sharper: why so few details? More than one reader sounded openly suspicious, calling out the lack of technical specifics and saying that when other labs are sharing more openly, secrecy makes trust harder. That turned the launch into a mini-drama about hype versus proof.

Even the driest comment got meme energy: one user simply dropped the model card PDF, which felt a bit like tossing court evidence onto the table. The mood? Curious, hopeful, and side-eyeing hard.

Key Points

  • Subquadratic released the model card for SubQ 1.1 Small, the second iteration of its Subquadratic Sparse Attention (SSA) model at the smallest size.
  • The company says SubQ 1.1 Small achieves near-perfect Needle-In-A-Haystack retrieval at 1M, 2M, 6M, and 12M tokens.
  • On benchmarks cited in the article, the model scored 99.12% on RULER at 128K, 85.4% on GPQA Diamond, 89.7% pass@4 on LiveCodeBench, and 13% on AutomationBench Finance.
  • Subquadratic says SSA replaces quadratic dense attention with a sparse formulation that scales linearly with context length.
  • At 1M tokens, the article claims SubQ 1.1 Small uses 64.5x less compute than dense attention and runs 56x faster than FlashAttention-2 on a single attention layer.
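To see why the quadratic-versus-linear claim matters more as context grows, here is a minimal back-of-the-envelope sketch in Python. It is a toy cost model, not Subquadratic's method: the company has not published how SSA works, so the fixed per-token key budget (`keys_per_query`) and the head dimension are hypothetical values chosen only for illustration.

```python
def dense_attention_cost(n_tokens: int, head_dim: int) -> int:
    """Rough multiply-add count for dense attention scores.

    Every token is compared against every other token, so the cost
    grows with the square of the context length.
    """
    return n_tokens * n_tokens * head_dim


def sparse_attention_cost(n_tokens: int, keys_per_query: int, head_dim: int) -> int:
    """Toy sparse cost: each token attends to a fixed budget of keys.

    ASSUMPTION: a constant per-token budget is a stand-in for "scales
    linearly"; it is not a description of SSA itself.
    """
    budget = min(keys_per_query, n_tokens)  # cannot attend to more keys than exist
    return n_tokens * budget * head_dim


def compute_ratio(n_tokens: int, keys_per_query: int, head_dim: int = 128) -> float:
    """How many times more work dense attention does under this toy model."""
    dense = dense_attention_cost(n_tokens, head_dim)
    sparse = sparse_attention_cost(n_tokens, keys_per_query, head_dim)
    return dense / sparse


if __name__ == "__main__":
    hypothetical_budget = 10_000  # illustrative only, not a published figure
    for n in (128_000, 1_000_000, 12_000_000):
        ratio = compute_ratio(n, hypothetical_budget)
        print(f"{n:>12,} tokens: dense does {ratio:,.1f}x the work")
```

Under these assumptions the ratio is simply the context length divided by the per-token budget, so the gap between dense and linear-scaling attention keeps widening as the context gets longer. That is the general shape of the argument, independent of whatever SSA actually does internally.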

Hottest takes

"the next frontier for LLMs should really just be... cost drastically less" — giancarlostoro
"Disappointing they don’t actually say how their sparse attention mechanism works" — aesthesia
"the lack of details makes me default not trust this" — cmogni1