2% of ICML papers desk-rejected because the authors used LLMs in their reviews

Secret PDF trap snares AI copy‑paste reviews; commenters yell ‘tip of the iceberg’

TLDR: ICML desk-rejected 497 papers after hidden cues in its PDFs exposed reviewers who had pledged not to use AI but used chatbots anyway. Commenters applauded the sting and mocked copy-paste offenders, while others warned that many cheaters slipped through and pushed for tougher, clearer rules to protect research quality.

The AI world’s messiest group project just imploded: ICML, a top research conference, tossed 497 submissions (about 2%) after catching “reciprocal reviewers” using AI to write their reviews—despite choosing the no‑AI option. The twist? A stealthy “watermark” in the paper PDFs fed hidden instructions that made chatbots print two odd phrases in copy‑pasted reviews. No sketchy AI detectors here—humans verified the hits, and the ICML post insists only obvious copy‑paste offenders were nailed.
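The detection side of a trap like this is simple in principle: scan each submitted review for the planted phrases, then hand any hit to a human for verification. Below is a minimal sketch of that step. The actual ICML phrases were not disclosed; "lucid cadence" and "veiled scaffold" are hypothetical placeholders, and `flag_review` is an illustrative name, not anything from the ICML tooling.

```python
# Hypothetical canary-phrase detector. The real ICML watermark phrases were
# never published; the two phrases below are invented placeholders.
CANARY_PHRASES = ["lucid cadence", "veiled scaffold"]

def flag_review(review_text: str, canaries=CANARY_PHRASES) -> list[str]:
    """Return the canary phrases found in a review (case-insensitive).

    A non-empty result only flags the review for manual checking; per the
    ICML post, humans verified every hit before any paper was rejected.
    """
    lowered = review_text.lower()
    return [phrase for phrase in canaries if phrase.lower() in lowered]
```

Note that a match is evidence of verbatim copy-paste from a chatbot that ingested the hidden instructions, which is why (as commenters point out) the trap says nothing about lightly edited AI-assisted reviews.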

Commenters are living for the sting. One called the watermark breakdown “worth the read” while others roasted the “Ctrl+C, Ctrl+LLM” crowd: “I’m amazed a simple trap worked this well,” snarked one user, picturing reviewers tripping over invisible ink. There’s drama too. A chorus applauds: you picked Policy A (no AI), you play by the rules—period. Another camp throws gasoline on the fire: if this only catches full copy‑paste, how many slicker AI‑assisted reviews slid by? Cue hot takes claiming “30–40% didn’t get caught.”

Memes flew fast: “ICML just Rickrolled the reviewers,” “PDF playing 4D chess,” and “AI wrote my review… and my rejection.” Beneath the jokes, a real split simmers: zero‑tolerance enforcement versus a messy new normal where some want AI help—and others want the door slammed shut.

Key Points

  • ICML 2026 desk-rejected 497 submissions (~2% of all submissions) due to LLM-use violations by designated Reciprocal Reviewers under Policy A.
  • The conference adopted two reviewer policies: Policy A (no LLM use) and Policy B (LLMs allowed for understanding/polishing).
  • 506 reviewers assigned to Policy A were detected using LLMs, affecting 795 reviews (~1% of all reviews).
  • Generic AI-text detectors were not used; each flagged instance was manually verified by humans.
  • 51 Policy A reviewers, each with more than half of their reviews LLM-generated, were removed; ICML acknowledged the resulting disruptions and says it is managing them.

Hottest takes

"Worth reading for the discussion of the LLM watermark technique alone" — michaelbuckbee
"It only detects those who quite literally copied and pasted the LLM output" — hodgehog11
"Another 30-40% just didn't get caught" — coldtea
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.