Reducto releases Deep Extract

Deep Extract drops: fans hype, skeptics yell 'prove it', Gemini struts

TLDR: Reducto launched Deep Extract, a self-checking tool promising near‑perfect accuracy on long documents. Comments split fast: comparisons to DataLab, one user claims Gemini 3 Flash performs better on 300‑page files, and others joke it’s open‑source in disguise—people want proof because this could replace tedious human checks.

Reducto just unveiled Deep Extract, a document-reading bot that basically checks its own homework. Instead of one quick pass, it loops: extract, verify, fix, repeat. The company claims near-perfect accuracy (yes, 99–100% on key fields), says the production beta has already processed more than 28 million fields, and says it survives monster PDFs up to 2,500 pages. It even leaves “receipts” (little bounding boxes showing exactly where each value came from) to help with audits. Translation: fewer people squinting at invoices and more bots doing the boring parts.

But the comment section turned into a product cage match in seconds. One user asked the inevitable: “How does this stack up to DataLab?” Another dev was hungry for war stories about running agents at massive scale. Then the spice hit: a former user claimed Reducto “struggled with long documents,” bragging Gemini 3 Flash is “super fast” and highly accurate on 300+ page financials. And the meme squad showed up with an XKCD-style guess-that-LLM joke, pointing at LayoutXLM like, “is this just open source in a trench coat?” The vibe: bold claims vs. bring receipts. Fans love the “agent-in-the-loop” idea; skeptics want benchmarks, side-by-sides, and fewer buzzwords, more proof.

Key Points

• Reducto launched Deep Extract, an agent-based system for structured extraction that iteratively verifies and corrects outputs.
• The approach replaces single-pass extraction with an agentic loop: extract, verify against source, identify gaps, and re-extract.
• In production beta, Deep Extract processed over 28 million fields on documents up to 2,500 pages long.
• The system reached 99–100% field accuracy for critical documents and reportedly outperformed expert human labelers.
• Deep Extract supports citations with granular bounding boxes and allows users to define correctness criteria in the system prompt.
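
For the curious, the extract, verify, re-extract loop described in the points above can be sketched in a few lines of Python. This is a toy illustration only, not Reducto's code or API: every name here (`deep_extract_loop`, `toy_extract`, `toy_verify`, `FieldValue`, `Citation`) is hypothetical, the "model" is a stand-in that reads `key: value` lines, and the bounding box is a made-up placeholder derived from the line number.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Citation:
    """Where a value was found. Hypothetical shape, not Reducto's schema."""
    page: int
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1), placeholder units


@dataclass
class FieldValue:
    value: str
    citation: Optional[Citation] = None


def deep_extract_loop(pages, fields, extract, verify, max_rounds=3):
    """Extract, verify against the source, find gaps, re-extract the gaps."""
    result = {}
    pending = list(fields)
    for attempt in range(max_rounds):
        if not pending:
            break
        # Only re-extract the fields that are still missing or failed checks.
        result.update(extract(pages, pending, attempt))
        pending = [
            name for name in fields
            if name not in result or not verify(pages, name, result[name])
        ]
    return result, pending  # pending = fields still unresolved


def toy_extract(pages, fields, attempt):
    """Stand-in for a model call (assumption): the first attempt only reads
    page 0, later attempts read every page."""
    scope = pages[:1] if attempt == 0 else pages
    found = {}
    for page_no, text in enumerate(scope):
        for line_no, line in enumerate(text.splitlines()):
            key, sep, value = line.partition(":")
            key = key.strip()
            if sep and key in fields and key not in found:
                box = (0.0, float(line_no), 1.0, float(line_no + 1))
                found[key] = FieldValue(value.strip(), Citation(page_no, box))
    return found


def toy_verify(pages, name, field_value):
    """Accept a value only if it is non-empty and appears on the cited page."""
    cite = field_value.citation
    return bool(field_value.value) and cite is not None \
        and field_value.value in pages[cite.page]


pages = [
    "invoice_number: INV-001\nvendor: Acme",
    "notes: none",
    "total: 1250.00",
]
result, unresolved = deep_extract_loop(
    pages, ["invoice_number", "total"], toy_extract, toy_verify
)
for name, fv in result.items():
    print(name, fv.value, "page", fv.citation.page, "bbox", fv.citation.bbox)
```

In this toy run the first pass misses `total` because it sits on the last page, the gap check flags it, and the second pass picks it up with a citation pointing at that page. A real system would swap the stand-ins for model calls and far stricter checks, which is exactly the part skeptics want benchmarked.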

Hottest takes

"How does this compare to DataLab" — skadamat

"We used Reducto and it did struggle with long documents." — aleks5678

"I like to play guess which LLM open source package is that XKCD comic." — cyanydeez

April 6, 2026

Agent wars: receipts or it didn’t parse
