April 9, 2026
Darkest game of tag
How Do You Find an Illegal Image Without Looking at It?
Internet hide-and-seek: bots hunt abuse images while admins melt down
TLDR: The piece explains how platforms can detect illegal images and videos using digital fingerprints without viewing them, just as AI-generated content explodes and swamps investigators. Commenters split three ways: panicked self-hosters urging immediate scanning, big-picture soul-searching about society's failures, and frustration that many platforms still aren't using free tools.
The article claims platforms can spot illegal images without looking at them using "digital fingerprints"—quick math summaries that match known criminal photos and videos—plus tools that scan millions of files while sparing human eyes. It ends with a gut punch: AI is now generating new material, some using real kids’ likenesses. That set the comments ablaze.
Self-hosting nerds sounded the alarm first. One Matrix admin warned that if you federate (let servers talk to each other), your cache will catch this stuff before you ever know it. Their fix? Kill image preloading and caching, like, yesterday. Meanwhile, the piece’s stark structure (“no X. no Y. just Z”) triggered a meta-backlash, with one reader rolling their eyes at “AI slop writing,” turning a deeply serious topic into a style war.
Then came the existential dread. A top comment asked what it says about "us" that the problem keeps growing, even as tools like PDQ for images and TMK for videos promise to find matches without showing moderators anything. Another commenter dropped the stat bomb: over 1.5 million reports now involve generative AI, which means fake-but-harmful content is flooding the same pipelines as real cases, overwhelming investigators. Cue the clash: practical sysadmins yelling "scan now" versus moral philosophers asking "why is this even happening," with a fringe calling for extreme punishments. If you wanted light weekend reading, this wasn't it, though a few tried gallows humor, dubbing it the internet's "darkest game of Where's Waldo." For platforms not scanning, the article's subtext is a siren: the tools are free, so what's your excuse?
Key Points
- The article reports a pipeline handling 61.8 million files in a year, underscoring the scale of scanning for illegal media.
- PDQ is presented as an image perceptual hashing method that represents images as compact fingerprints ("32 numbers") to enable matching without viewing content.
- TMK is introduced for video, capturing both overall appearance and temporal sequence, with noted limitations for short or clipped segments.
- When content is novel and not in hash databases, systems must move from matching to judging, using additional approaches to assess legality.
- Operationally, most effort is spent proving non-matches; a three-step workflow allows scanning without human viewing.
- Many platforms may not be scanning even though free tools exist, and AI-generated abusive media increases the complexity of the whole pipeline.
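The fingerprint-and-match idea behind these points can be sketched in a few lines. To be clear, this is not PDQ or TMK (PDQ is a 256-bit DCT-based hash; TMK adds temporal features for video): the `dhash` function, the `known_hashes` database, and the distance threshold below are invented for illustration. The core mechanics are real, though: reduce media to a compact bit-string, then flag near-duplicates by Hamming distance, with no human ever viewing the file.

```python
# Toy perceptual-hash matcher: NOT the real PDQ/TMK algorithms,
# just a minimal sketch of "fingerprint, then compare" matching.

def dhash(pixels):
    """Difference hash: one bit per horizontal neighbor comparison.
    `pixels` is a small grid (list of rows) of grayscale values."""
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

# Hypothetical database of known-bad fingerprints (values invented).
known_hashes = {0b1010_1100: "case-001"}

def check(pixels, threshold=2):
    """Return a case ID if the file's hash is near a known hash.
    Threshold 2 suits this toy 8-bit hash; real 256-bit systems
    use proportionally larger distance cutoffs."""
    h = dhash(pixels)
    for known, case_id in known_hashes.items():
        if hamming(h, known) <= threshold:
            return case_id  # match: flag without any human viewing
    return None
```

The same scheme scales up in the real tools: PDQ swaps the toy hash for a frequency-domain fingerprint robust to resizing and recompression, and TMK-style video matching compares sequences of per-frame fingerprints so reordered or trimmed clips can still be caught.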