April 15, 2026
Typo wars & trie‑hard takes
Fast and Easy Levenshtein distance using a Trie
Old trick resurfaces, devs split on the 'best' typo fixer
TLDR: A classic 2011 trick speeds up typo fixes by using a word tree (a trie) to avoid redoing work, even powering giant rhyme lookups. Commenters split three ways: straight praise, a "use a different metric" camp (Sørensen–Dice, Jaro–Winkler), and a cheeky "old but gold" crowd, with one dev boasting a faster Go/radix-tree spin.
A 2011 post on speeding up typo-fixing with a word tree (a "trie") is back, and the comments are on fire. The author uses edit distance (how many changes it takes to turn one word into another) and a trie to blast through a giant dictionary—even powering rhyme searches from a cheeky “sock‑drawer datacenter.” The vibe? Equal parts nostalgia and nitpicky showdown.
Some readers are delighted—“very cool and satisfying”—while others roll in with practical hot swaps. One dev says they ditched edit distance for Sørensen–Dice (a faster similarity score) after their Wikipedia plugin slowed to a crawl. Another says Jaro–Winkler (a name-matching favorite) was the winner for finding the right “John Smith.” In plain English: everyone agrees we need smart typo handling, but they disagree on the tool for the job.
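For readers curious what the metric-switchers mean, here is a minimal sketch of Sørensen–Dice similarity: score two strings by the overlap of their character bigrams. The function name and shape are illustrative, not taken from any commenter's plugin.

```python
from collections import Counter

def dice_coefficient(a: str, b: str) -> float:
    """Return a similarity score in [0, 1] based on shared character bigrams."""
    if len(a) < 2 or len(b) < 2:
        # Too short for bigrams; fall back to exact match.
        return 1.0 if a == b else 0.0
    bigrams_a = Counter(a[i:i + 2] for i in range(len(a) - 1))
    bigrams_b = Counter(b[i:i + 2] for i in range(len(b) - 1))
    # Multiset intersection counts each shared bigram once per occurrence.
    overlap = sum((bigrams_a & bigrams_b).values())
    return 2 * overlap / (sum(bigrams_a.values()) + sum(bigrams_b.values()))
```

Unlike edit distance, this needs no dynamic-programming table, which is why it can be the faster choice for fuzzy lookups over many candidates.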
Then there’s the comedy: a deadpan “2011” drops like a mic, reminding everyone this isn’t new—but it still slaps. Meanwhile, a Go hacker flexes a souped‑up version using a radix tree (think: even tighter word compression) claiming “faster, smaller, better.” The thread becomes a friendly turf war: trie purists vs. metric switchers vs. “it’s old, but gold.” If you love algorithms, or just enjoy devs lovingly roasting an antique, this is your popcorn moment. For the curious, here’s the edit distance explainer: Levenshtein distance.
Key Points
- The article contrasts a naive dictionary scan using Levenshtein distance with a trie-based optimization for approximate matching.
- A Python script computes the Levenshtein distance from a target word to each dictionary word and reports matches within a maximum cost.
- An example run ('goober', max distance 1) shows a search time of about 4.56 seconds on /usr/share/dict/words.
- The naive approach has an upper bound of O(number of words × (max word length)^2).
- Using a trie collapses shared prefixes so dynamic-programming rows can be reused, greatly reducing repeated work; this supports large-scale use on rhymebrain.com (2.6M words) without caching.
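The trie trick in the points above can be sketched compactly: compute one Levenshtein DP row per trie edge, so every word sharing a prefix reuses its parent's row, and prune any branch whose cheapest cell already exceeds the budget. This is a from-scratch illustration of the idea, not the article's exact code.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.word = None  # set when this node ends a dictionary word

def insert(root, word):
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.word = word

def search(root, target, max_cost):
    """Return (word, distance) pairs within max_cost edits of target."""
    results = []
    first_row = list(range(len(target) + 1))  # distance from "" to each prefix
    for ch, child in root.children.items():
        _walk(child, ch, target, first_row, results, max_cost)
    return results

def _walk(node, ch, target, prev_row, results, max_cost):
    # Build the next DP row for this trie edge from the parent's row.
    row = [prev_row[0] + 1]
    for i in range(1, len(target) + 1):
        insert_cost = row[i - 1] + 1
        delete_cost = prev_row[i] + 1
        replace_cost = prev_row[i - 1] + (target[i - 1] != ch)
        row.append(min(insert_cost, delete_cost, replace_cost))
    if node.word is not None and row[-1] <= max_cost:
        results.append((node.word, row[-1]))
    # Prune: if every cell exceeds the budget, no extension can recover.
    if min(row) <= max_cost:
        for next_ch, child in node.children.items():
            _walk(child, next_ch, target, row, results, max_cost)
```

Because a row is computed once per trie node rather than once per word, a dictionary full of shared prefixes (like /usr/share/dict/words) does far less repeated work than the naive per-word scan.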