Full-Text Search with DuckDB

DuckDB learns to search like a pro, and the crowd is already arguing about what’s missing

TLDR: DuckDB’s search add-on can now comb through huge amounts of writing more intelligently, which could make it a surprisingly handy one-stop tool for digging through archives. The community reaction is equal parts hype and side-eye: people love the convenience, but they’re loudly roasting what it still can’t do.

DuckDB, the fast-growing data tool with a loyal fan club, just got a fresh spotlight for its full-text search feature — basically, a way to search huge piles of text more intelligently than a simple word match. The article walks through how it can sift through things like giant email archives, handle word variations, ignore filler words, and rank results by relevance. For many readers, that was enough to trigger the classic tech-comment-section split: “This is amazing, I can do serious search without spinning up a giant system” versus “Cool demo, but don’t pretend this replaces the big leagues.”
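For readers who want to see what "handle word variations" and "ignore filler words" mean in practice, here is a minimal sketch of that kind of text preparation in plain Python. It is not DuckDB's code: the stop-word list is a tiny illustrative sample, and `toy_stem` is a hypothetical, deliberately crude stand-in for a real stemmer.

```python
import unicodedata

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}


def strip_accents(text):
    """Decompose characters and drop combining marks, so 'café' becomes 'cafe'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def toy_stem(word):
    """Crude suffix stripping; a hypothetical stand-in, not a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, strip accents and punctuation, drop stop words, then stem."""
    tokens = []
    for raw in strip_accents(text).lower().split():
        word = "".join(ch for ch in raw if ch.isalnum())
        if word and word not in STOP_WORDS:
            tokens.append(toy_stem(word))
    return tokens
```

Under these assumptions, "searched" and "searches" collapse to the same token, while irregular pairs such as "mouse" and "mice" stay distinct, because stripping suffixes cannot connect them.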

And wow, the missing features became the real soap opera. The loudest grumbling was about the lack of match highlighting — in plain English, DuckDB can find the result, but it doesn’t clearly show you where the searched word appears. Commenters treated this like a personal betrayal, with jokes about doing “digital archaeology” in terminal windows and memorizing cursed keyboard shortcuts just to find the actual text. Others zoomed out and called the feature set a promising starter pack, not a finished search empire, especially compared with heavyweight options like Elasticsearch or PostgreSQL.
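To make the highlighting complaint concrete, here is a rough sketch of what a do-it-yourself workaround could look like on the client side, after the rows come back. The `highlight` helper and its bracket markers are hypothetical, not part of DuckDB, and it only matches the literal query words rather than the stemmed forms a search index would match.

```python
import re


def highlight(text, terms, start="[", end="]"):
    """Wrap whole-word, case-insensitive matches of any term in markers."""
    cleaned = [re.escape(t) for t in terms if t]
    if not cleaned:
        return text
    pattern = re.compile(r"\b(" + "|".join(cleaned) + r")\b", re.IGNORECASE)
    # Keep the original casing of each match inside the markers.
    return pattern.sub(lambda m: start + m.group(0) + end, text)
```

A proper implementation would also have to trim long documents down to a snippet around each match, which is part of what commenters were asking for.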

The funniest reactions? Plenty of duck puns, a mini-meltdown over irregular words like mouse versus mice, and the recurring meme that every useful new DuckDB feature inspires the same dangerous thought: “Wait, do I need fewer tools now?” That, more than anything, is what has people both excited and weirdly defensive.

Key Points

  • The article focuses on DuckDB’s current full-text search capabilities as a follow-up to an earlier post about DuckDB.
  • DuckDB’s FTS extension supports features including stemming, stop-word removal, accent stripping, and Okapi BM25 scoring with tunable parameters.
  • The article says DuckDB’s FTS feature set is a starting point and currently lacks some advanced capabilities found in more mature search systems.
  • A specific limitation noted is the lack of built-in highlighting of matched query terms, a feature PostgreSQL provides through `ts_headline`.
  • The article explains that DuckDB FTS must be enabled with the extension commands `INSTALL fts;` and `LOAD fts;`. It also illustrates the Snowball-based stemming behavior with a Python example.
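For the curious, the "Okapi BM25 scoring with tunable parameters" point can be sketched in a few lines of plain Python. This is one common textbook formulation, written from scratch for illustration; the function name `bm25_scores` is hypothetical, the defaults `k1=1.2` and `b=0.75` are commonly cited values rather than confirmed DuckDB settings, and DuckDB's exact formula may differ.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`.

    k1 controls how quickly repeated terms stop adding score;
    b controls how strongly long documents are penalized (0 disables it).
    """
    n_docs = len(docs)
    if n_docs == 0:
        return []
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term at least once.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        counts = Counter(d)
        score = 0.0
        for term in query_terms:
            tf = counts[term]
            if tf == 0:
                continue
            df = doc_freq[term]
            # Smoothed inverse document frequency; rarer terms weigh more.
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            norm = 1.0 - b + b * len(d) / avg_len
            score += idf * tf * (k1 + 1.0) / (tf + k1 * norm)
        scores.append(score)
    return scores
```

Documents that never mention a query term score zero, and with `b` above zero a shorter document outranks a longer one that contains the term the same number of times.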

Hottest takes

"Cool, but where’s the part that shows me the match without playing terminal bingo?" — data_drifter
"Every DuckDB post is someone quietly deleting three other tools" — bytebandit
"If mouse and mice still fight, the search drama is far from over" — sqlgremlin