Show HN: Robust LLM Extractor for Websites in TypeScript

A new open‑source tool, Lightfeed Extractor, promises to let an AI read websites like a human and pull out clean data for your spreadsheets. It even runs a stealthy browser to dodge blocks and converts messy webpages into tidy markdown to save AI costs. Cool tech… until the comments exploded.

The mood? Split right down the middle. Privacy hawks are fuming over the “stealth mode” brag and the sense that robots.txt, the web's polite “no trespassing” sign, might be ignored. One user deadpanned, “Robots.txt anyone?” while another accused the project of not caring about it at all. Meanwhile, data‑hungry devs are drooling over the token savings from converting HTML to markdown, but worry that the cleanup could drop content: what happens to tables, ratings, or tiny details? “Do you lose info?” becomes the nervous refrain.

Then there’s the JSON drama. The tool claims it can recover broken AI‑generated JSON (those curly‑bracket meltdowns that crash pipelines). Commenters chimed in with a twist: some say that’s why other AI tools use XML, because closing tags keep the robot on track. Cue meme: “One bad bracket and your day is ruined.”
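
For the curious, here is roughly what "recovering" a bracket meltdown can look like. This is a minimal sketch, not Lightfeed Extractor's actual code: the function name `repairTruncatedJson` is made up for illustration, and it only handles the simplest failure, where the model's output gets cut off before the closing quotes and brackets arrive.

```typescript
// Hypothetical helper for illustration; not part of Lightfeed Extractor.
// Assumption: the only damage is truncation (missing closers at the end).
function repairTruncatedJson(raw: string): unknown {
  let text = raw.trim();

  // Fast path: the output may already be valid JSON.
  try {
    return JSON.parse(text);
  } catch {
    // Fall through to the repair attempt below.
  }

  // Walk the text once, tracking which closers are still owed.
  const closers: string[] = [];
  let inString = false;
  let escaped = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }

  // Close a string that was cut off mid-value.
  if (inString) text += '"';
  // Drop a dangling comma left at the cut-off point.
  text = text.replace(/,\s*$/, "");
  // Append the owed closers, innermost first.
  text += closers.reverse().join("");

  // Still throws if the damage is something other than truncation.
  return JSON.parse(text);
}

const recovered = repairTruncatedJson('{"name": "Widget", "tags": ["a", "b"');
console.log(recovered);
```

Anything worse than a cut-off tail (a missing value after a key, stray prose around the JSON) still throws, which is the honest outcome: the pipeline gets a loud failure instead of quietly invented data.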

Final act: anti‑bot patches. Curious devs ask how often sites actually block this thing. Critics smell a cat‑and‑mouse game; fans call it a superpower for tracking prices and products. Either way, the web’s new data vacuum just rolled into aisle 5.

March 25, 2026

Bots, brackets, and a bread aisle brawl

Dev tool or data heist? New AI scraper sparks robots.txt war

TLDR: Lightfeed Extractor uses AI and a stealthy browser to scrape websites and export clean data. The crowd is split: some cheer the token savings and “fix my broken JSON” features, while others slam the anti‑bot vibe and robots.txt concerns, sparking an ethics vs. efficiency showdown.

