November 16, 2025
Spam Jam: Go vs Perl nostalgia
Show HN: Spam classifier in Go using Naive Bayes
Go dev drops a spam detector, community yells: “License, please”
TLDR: A Go-based spam detector using simple word-counting math landed on Hacker News. The thread quickly pivoted to a license showdown, with a Paul Graham throwback and a Perl veteran chiming in—community loves the retro vibe but won’t touch it without a clear open-source license.
A developer just rolled out nspammer, a simple spam detector in Go that uses Naive Bayes (think “count the words, make a smart guess”) plus a tiny cushion called Laplace smoothing so new words don’t break it. It ships with tests against a real email dataset, a plug‑and‑play API, and a demo that screams “buy now” equals spam. But the post instantly turned into a nostalgia-and-drama cocktail.
First up, a commenter drops the classic Paul Graham essay like a mic on stage, summoning the godfather of spam filtering and setting the tone: old school rules still apply. Then cipherself strides in with a flex: they built the same thing in Perl “12 (13?) years ago,” reminiscing about log math tricks and a wish-list for vectorization. Translation: this is solid, but we’ve seen this movie before. And just when the code talk warms up, leetrout slams the brakes with a community wake-up call: where’s the license? Without it, can anyone use this at all?
Cue jokes about “ham vs spam” and people teasing that every message containing “buy” is doomed. The vibe? A wholesome throwback project, showered in classic references, but the loudest chorus is open source needs a license. Old-school wisdom, new-school Go, and a dash of drama—just how HN likes it.
Key Points
- A Naive Bayes spam classifier named nspammer is implemented in Go.
- It uses Laplace smoothing (default α=1.0) to handle unseen words and avoid zero probabilities.
- The API provides `NewSpamClassifier` for training and `Classify` for determining spam vs. non-spam.
- Classification uses log probabilities to prevent numerical underflow and compares class scores.
- The project supports the Kaggle Spam Mails Dataset via `./init.sh` and includes tests with accuracy evaluations.
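
For readers who want to see the technique in the Key Points spelled out, here is a minimal sketch of Naive Bayes with Laplace smoothing and log-probability scoring in Go. It is not nspammer's code: the `classifier` type and the `train`, `logScore`, `isSpam`, and `tokenize` names are made up for illustration, the whitespace tokenizer is an assumption, and the four training messages are toy data.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// classifier is a hypothetical type for illustration, not nspammer's API.
// Index 0 holds ham (non-spam) statistics, index 1 holds spam statistics.
type classifier struct {
	alpha      float64 // Laplace smoothing constant
	wordCounts [2]map[string]int
	totalWords [2]int
	docCounts  [2]int
	vocab      map[string]struct{}
}

// tokenize lowercases the text and splits on whitespace (an assumed, naive tokenizer).
func tokenize(s string) []string {
	return strings.Fields(strings.ToLower(s))
}

// train counts words per class over the given spam and ham messages.
func train(spam, ham []string, alpha float64) *classifier {
	c := &classifier{
		alpha:      alpha,
		wordCounts: [2]map[string]int{{}, {}},
		vocab:      map[string]struct{}{},
	}
	corpora := [2][]string{ham, spam}
	for class, docs := range corpora {
		for _, doc := range docs {
			c.docCounts[class]++
			for _, w := range tokenize(doc) {
				c.wordCounts[class][w]++
				c.totalWords[class]++
				c.vocab[w] = struct{}{}
			}
		}
	}
	return c
}

// logScore returns log P(class) + sum of log P(word | class).
// Adding alpha to every count keeps unseen words from producing log(0),
// and summing logs avoids the underflow of multiplying many small numbers.
func (c *classifier) logScore(class int, words []string) float64 {
	totalDocs := float64(c.docCounts[0] + c.docCounts[1])
	score := math.Log(float64(c.docCounts[class]) / totalDocs)
	denom := float64(c.totalWords[class]) + c.alpha*float64(len(c.vocab))
	for _, w := range words {
		score += math.Log((float64(c.wordCounts[class][w]) + c.alpha) / denom)
	}
	return score
}

// isSpam compares the two class scores and picks the larger one.
func (c *classifier) isSpam(text string) bool {
	words := tokenize(text)
	return c.logScore(1, words) > c.logScore(0, words)
}

func main() {
	c := train(
		[]string{"buy now cheap pills", "buy now limited offer"},
		[]string{"meeting at noon tomorrow", "lunch tomorrow with the team"},
		1.0, // alpha = 1.0, matching the default mentioned above
	)
	fmt.Println(c.isSpam("buy now"))               // true
	fmt.Println(c.isSpam("team meeting tomorrow")) // false
}
```

With so little training data, the joke from the thread holds: anything containing "buy" leans heavily toward spam. A real dataset such as the Kaggle one mentioned above would spread the word counts far more evenly.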