Document poisoning in RAG systems: How attackers corrupt AI's sources

Three fake files fooled the AI—and the comments melted down

TLDR: A demo shows three planted documents can steer an AI helper into confidently reporting fake company numbers. Commenters split: some dismiss it as an insider-only failure of design, others warn real-world data and social media spam can poison models—fueling a loud call for provenance and tighter defenses.

An engineer slipped three fake “CFO-approved” docs into a local AI search-and-answer setup (called RAG: it retrieves files, then generates an answer), and the bot proudly announced a made-up revenue crash. The lab is fully reproducible with code, and the community went DEFCON-1. Skeptics like sidrag22 called it a nothingburger—if a bad actor already has write access and the bot shows no sources, “that’s just a flawed product.” Others shot back that this is exactly how it breaks in the wild: public databases, regulatory filings, and messy archives get polluted—then AIs swallow it whole. One commenter even flagged engagement-farm networks on X mass-posting “whitepaper-style” text to game future AI ingest, and yes, that set off conspiracy alarms.

Amid the chaos, pragmatic voices said every company’s doc pile is already a junk drawer—old truths, contradictions, and half-baked drafts. The real fix, they argue, is layered defenses: strong source scoring, quarantine loops, and forcing the bot to show receipts. Meanwhile, classicists waved the “nothing new here” flag: this is the same trick humans fall for—hand someone a convincing “CORRECTED” memo and watch them misreport. The drama crown? The idea that a few spicy keywords can shove real numbers out of context. AI, meet office politics, but automated.

Key Points

•A local RAG system was manipulated by injecting three fabricated documents into a ChromaDB knowledge base, leading to false financial answers.
•The legitimate Q4 2025 figures ($24.7M revenue, $6.5M profit) were displaced by fabricated values ($8.3M revenue, –47% YoY, layoffs, acquisition talks).
•The setup used LM Studio with Qwen2.5-7B-Instruct, all-MiniLM-L6-v2 embeddings via sentence-transformers, ChromaDB, and a custom Python pipeline.
•The attack leverages PoisonedRAG’s two conditions: poisoned documents must rank higher in retrieval and drive the LLM to generate the attacker’s answer.
•Reproducible code and commands are provided; success is defined over 20 runs at temperature 0.1 where only the fabricated figure appears.

Hottest takes

"Seems just like a flawed product at that point." — sidrag22

"needs many more dimensions with scoring to model true adversaries" — alan_sass

"This attack is not 'new', only the vector is new 'AI'." — altruios

March 12, 2026

Fake CFO memo, real chaos

Three fake files fooled the AI—and the comments melted down

Key Points

Hottest takes

March 12, 2026

Fake CFO memo, real chaos

Document poisoning in RAG systems: How attackers corrupt AI's sources

Three fake files fooled the AI—and the comments melted down

Key Points

Hottest takes

Save News