February 23, 2026
Bots Gone Wild
Facebook's Fascination with My Robots.txt
Facebook keeps pinging one guy’s robots.txt 7,700 times an hour and commenters cry “broken bot”
TLDR: A small server says Facebook’s crawler hammered its robots.txt file 7,700 times an hour without touching anything else. Commenters split between “broken bot” jokes, corporate apathy rants, and DDoS fears, while a few propose fixes like caching headers to make the crawler chill.
Facebook’s crawler just went full goldfish brain on a tiny self‑hosted code site, hitting the same little file, robots.txt, several times per second and nothing else. That file is the site’s “house rules” for bots, and the requests are confirmed from Meta’s own IPs using the “facebookexternalhit” user agent (docs). The poster says it’s around 7,700 hits per hour, which averages out to a little over two per second, and the community didn’t hold back.
Cue the jokes: one commenter quipped Facebook decided to ignore every other site’s rules and “make up the average” by endlessly refreshing this one robots.txt. Cynics arrived with popcorn: veterans like Nextgrid said big companies run in a “degraded, somewhat broken mode,” dropping the term “error budget” like a mic. Conspiracy flair? Tananaev wondered if this is a stealth DDoS (overwhelming a site with traffic) to force an error, then barrel through the rest of the site. Meanwhile, practical folks suggested adding cache headers so the bot stops reloading like it’s stuck on F5.
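For the curious, the cache-header idea might look something like the sketch below. It is a minimal Python WSGI app that serves robots.txt with a `Cache-Control` header. The file contents, the one-day `max-age`, and the names `app` and `serve` are all assumptions for illustration, and nothing in the thread confirms that Facebook's crawler actually honors cache headers.

```python
from wsgiref.simple_server import make_server

# Placeholder rules (assumption): allow everything. Real contents are site-specific.
ROBOTS_BODY = b"User-agent: *\nDisallow:\n"

# Assumed cache lifetime of one day; the thread does not suggest a value.
MAX_AGE_SECONDS = 86400


def app(environ, start_response):
    """Serve /robots.txt with a Cache-Control header; answer 404 for anything else."""
    if environ.get("PATH_INFO") == "/robots.txt":
        headers = [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(ROBOTS_BODY))),
            # Tells well-behaved clients they may reuse the response for a day.
            ("Cache-Control", f"public, max-age={MAX_AGE_SECONDS}"),
        ]
        start_response("200 OK", headers)
        return [ROBOTS_BODY]

    start_response("404 Not Found", [("Content-Type", "text/plain"), ("Content-Length", "0")])
    return [b""]


def serve(host="127.0.0.1", port=8000):
    """Run the app locally for a quick manual check (hypothetical host and port)."""
    with make_server(host, port, app) as httpd:
        httpd.serve_forever()
```

Calling `serve()` and requesting `http://127.0.0.1:8000/robots.txt` should show the header in the response. On a real Forgejo setup the same header would more likely be added at the reverse proxy than in application code.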
The vibe: half “Meta broke a loop,” half “corporate shrug.” No one believes the site suddenly went viral; it’s more “Facebook fell in love with a very boring file.” After recent AI bot swarms elsewhere, this feels tame—but the crowd’s roasting is deluxe, and everyone’s watching to see if the bot speeds up or chills out.
Key Points
- A self-hosted Forgejo instance is receiving repeated requests to /robots.txt from facebookexternalhit/1.1.
- The requests occur several times per second and have persisted for at least four days.
- Traffic originates from Meta IP ranges, confirming the source of the crawler.
- The crawler is not accessing any other paths on the site, only robots.txt.
- Facebook’s documentation states FacebookExternalHit crawls shared links to gather metadata like titles, descriptions, and thumbnails.
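Anyone wanting to check for the same pattern on their own server could tally it from the access log. The sketch below is a rough Python example that assumes the common "combined" log format used by nginx and Apache. The regex and the function names `hits_per_hour` and `per_second` are illustrative assumptions, not the poster's method.

```python
import re
from collections import Counter

# Assumes the "combined" access log format; adjust the pattern for other formats.
LOG_PATTERN = re.compile(
    r'\[(?P<day>[^:]+):(?P<hour>\d{2}):\d{2}:\d{2} [^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def hits_per_hour(lines, path="/robots.txt", agent_substring="facebookexternalhit"):
    """Count matching requests, keyed by (day, hour) taken from the log timestamp."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        if match["path"] == path and agent_substring in match["agent"]:
            counts[(match["day"], match["hour"])] += 1
    return counts


def per_second(hourly_count):
    """Convert an hourly request count to an average per-second rate."""
    return hourly_count / 3600
```

Feeding it an open log file, for example `hits_per_hour(open("access.log"))` with a hypothetical path, gives one count per hour. Passing the reported figure to `per_second(7700)` gives an average of roughly 2.1 requests per second.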