Show HN: Self-host Reddit – 2.38B posts, works offline, yours forever

Reddit in a box: devs cheer, nostalgics beg Apollo, and Voat sparks a moral brawl

TLDR: A tool lets anyone self-host an offline Reddit archive of 2.38B posts (plus Voat/Ruqqus). Commenters celebrate preservation and Apollo nostalgia, clash over including Voat’s toxic history, and joke about training troll AIs—raising big questions about saving the internet’s past versus curating it.

Redd-Archiver just dropped a “Reddit-in-a-box” you can run yourself, turning 2.38 billion posts into your own offline museum. It also bundles archives from the shut-down Voat and Ruqqus. The pages use no JavaScript and are mobile-friendly, and full-text search works once you spin up the server: think of it as a permanent offline Reddit library you own. There’s AI integration so assistants can query posts, and yes, the data is available by torrent for the brave.

But the comments? Pure internet theater. Nostalgic users dream of hooking it to the beloved, now-dead Apollo app to recapture that lost vibe, while power users want a plugin to restore deleted or protest-bot-garbled comments so old threads make sense again. Archivists are applauding the “save everything” mission, but the moment Voat is mentioned, the mood flips: one commenter calls it “Reddit for neonazis,” igniting a heated ethics debate over preserving history versus giving toxic content a new home. Meanwhile, jokesters say Hacker News will use the dump to train “reddit troll” AIs, turning the project into meme fuel. The vibe is a chaotic cocktail of digital preservation, Apollo nostalgia, and moral panic, exactly the kind of drama that makes the internet irresistible.

Key Points

  • Redd-Archiver v1.0 converts Reddit, Voat, and Ruqqus data dumps into browsable HTML archives for offline use or server-backed full-text search.
  • It supports 2.38B Reddit posts (40,029 subreddits through Dec 31, 2024), 3.81M Voat posts and 24.1M comments, and 500K Ruqqus posts, totaling 2.384B posts across 68,883 communities.
  • Full-text search requires Docker and uses PostgreSQL with GIN indexing; unified search spans all supported platforms.
  • The tool provides a REST API v1 with 30+ endpoints and an MCP server offering 29 AI tools auto-generated via OpenAPI.
  • The design emphasizes a mobile-first, JavaScript-free UI, theme support, accessibility, Tor optimization, streaming processing, SEO features, and progress tracking.
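
For readers curious what GIN-backed full-text search means in practice: GIN is PostgreSQL's generalized inverted index, which maps each word to the rows that contain it. The toy Python sketch below illustrates that idea in memory. It is not Redd-Archiver's code; the sample posts, their ids, and the function names are all hypothetical, and a real PostgreSQL setup adds stemming, ranking, and on-disk storage.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(posts):
    """Build an inverted index: token -> set of post ids containing it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for token in tokenize(text):
            index[token].add(post_id)
    return index


def search(index, query):
    """Return sorted ids of posts that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    # Intersect the posting sets so only posts matching all tokens remain.
    matches = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(matches)


# Hypothetical sample posts; the "platform:id" keys are made up for
# illustration and mimic a search that spans all three platforms.
posts = {
    "reddit:1": "Apollo was the best Reddit client",
    "voat:2": "Archive everything before it disappears",
    "ruqqus:3": "Best way to archive a Reddit thread",
}
index = build_index(posts)

if __name__ == "__main__":
    print(search(index, "reddit best"))
    print(search(index, "archive"))
```

Because each lookup touches only the posting sets for the query words, the cost depends on how many posts contain those words rather than on the total archive size, which is the reason an inverted index suits an archive of this scale.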

Hottest takes

"Hacker News collectively grabs the dataset to train their models on how to become effective reddit trolls" — kylehotchkiss
"Gross. Why would anyone want to have an archive of Reddit For Neonazis?" — Jordan-117
"replaces deleted comments and those bot-overwritten comments with the original context" — Aurornis