April 6, 2026
Agent wars: receipts or it didn’t parse
Reducto releases Deep Extract
Deep Extract drops: fans hype, skeptics yell 'prove it', Gemini struts
TLDR: Reducto launched Deep Extract, a self-checking tool promising near‑perfect accuracy on long documents. Comments split fast: comparisons to DataLab, one user claims Gemini 3 Flash performs better on 300‑page files, and others joke it’s open‑source in disguise—people want proof because this could replace tedious human checks.
Reducto just unveiled Deep Extract, a document-reading bot that basically checks its own homework. Instead of one quick pass, it loops: extract, verify against the source, fix the gaps, repeat. The company claims near-perfect accuracy (yes, 99–100% on key fields), says the production beta has already processed more than 28 million data fields, and says it survives monster PDFs up to 2,500 pages. It even leaves “receipts” (little bounding boxes showing exactly where each value came from) to help with audits. Translation: fewer people squinting at invoices and more bots doing the boring parts.
But the comment section turned into a product cage match in seconds. One user asked the inevitable: “How does this stack up to DataLab?” Another dev was hungry for war stories about running agents at massive scale. Then the spice hit: a former user claimed Reducto “struggled with long documents,” bragging Gemini 3 Flash is “super fast” and highly accurate on 300+ page financials. And the meme squad showed up with an XKCD-style guess-that-LLM joke, pointing at LayoutXLM like, “is this just open source in a trench coat?” The vibe: bold claims vs. bring receipts. Fans love the “agent-in-the-loop” idea; skeptics want benchmarks, side-by-sides, and fewer buzzwords, more proof.
Key Points
- Reducto launched Deep Extract, an agent-based system for structured extraction that iteratively verifies and corrects outputs.
- The approach replaces single-pass extraction with an agentic loop: extract, verify against source, identify gaps, and re-extract.
- In production beta, Deep Extract processed over 28 million fields on documents up to 2,500 pages long.
- The system reached 99–100% field accuracy for critical documents and reportedly outperformed expert human labelers.
- Deep Extract supports citations with granular bounding boxes and allows users to define correctness criteria in the system prompt.
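
For readers who want to picture the extract, verify, re-extract loop, here is a minimal Python sketch. It is not Reducto's code or API: every name in it (`extract_with_verification`, `toy_extract`, `toy_verify`, `Candidate`) is hypothetical, the "extractor" is a plain regex standing in for a model, and a page index stands in for the bounding-box citations the product reportedly returns.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Candidate:
    name: str   # field name from the schema
    value: str  # extracted value
    page: int   # index of the page the value is cited from (stand-in for a bounding box)


Extractor = Callable[[List[str], List[str], int], Dict[str, Candidate]]
Verifier = Callable[[List[str], Candidate], bool]


def extract_with_verification(
    pages: List[str],
    field_names: List[str],
    extract: Extractor,
    verify: Verifier,
    max_rounds: int = 3,
) -> Tuple[Dict[str, Candidate], List[str]]:
    """Loop: extract pending fields, keep only verified ones, retry the rest."""
    accepted: Dict[str, Candidate] = {}
    pending = list(field_names)
    for attempt in range(max_rounds):
        if not pending:
            break
        candidates = extract(pages, pending, attempt)
        for name in pending:
            cand = candidates.get(name)
            # A value is accepted only if it checks out against its cited source.
            if cand is not None and verify(pages, cand):
                accepted[name] = cand
        # Anything unverified or missing goes back into the next round.
        pending = [n for n in field_names if n not in accepted]
    return accepted, pending


def toy_extract(pages: List[str], names: List[str], attempt: int) -> Dict[str, Candidate]:
    """Hypothetical extractor: a quick first pass over page 0, then a full scan."""
    scope = pages[:1] if attempt == 0 else pages
    found: Dict[str, Candidate] = {}
    for page_index, text in enumerate(scope):
        for name in names:
            match = re.search(rf"{re.escape(name)}:\s*(\S+)", text)
            if match and name not in found:
                found[name] = Candidate(name, match.group(1), page_index)
    return found


def toy_verify(pages: List[str], cand: Candidate) -> bool:
    """Hypothetical verifier: the value must literally appear on the cited page."""
    return 0 <= cand.page < len(pages) and cand.value in pages[cand.page]


pages = ["Invoice number: INV-001", "Notes only", "Total: 42.50"]
accepted, missing = extract_with_verification(
    pages, ["Invoice number", "Total", "Due date"], toy_extract, toy_verify
)
```

In this toy run the first pass only finds the invoice number, the second pass picks up the total from the last page, and "Due date" is reported as missing rather than guessed. That last part is the point of the loop: unverified fields are surfaced instead of silently filled.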