November 1, 2025
Clippy brought a calculator
word2vec-style vector arithmetic on docs embeddings
Math magic for docs sparks a comment cage match
TLDR: A doc-embedding experiment swapped “Supabase” for “Angular” and landed near Angular testing pages, scoring 0.75 similarity. Comments split: some dismiss it as GPT wrapper fluff, others propose search boosting and redirects, while skeptics report nonsense matches and ask if this is reliable.
Docs just got a dash of algebra: the author tried the old word2vec trick (think “King − Man + Woman = Queen”) on full-document embeddings, swapping out “Supabase” for “Angular.” With custom task settings, the result leaned toward Angular’s testing pages, clocking a 0.75 similarity. Cue comment fireworks. aDyslecticCrow rolled their eyes at yet another “writing tool” that’s just a GPT wrapper, saying they’d rather stitch together rephrasings themselves. The pragmatists arrived fast: nostrebored pitched concrete wins like boosting search in the “direction” a reader is headed, which sounded like personalization without the creepy. Then the thread turned into a mini science fair when jdthedisciple ran the classic arithmetic across multiple models and posted numbers, instantly spawning leaderboard vibes and debate over what those distances even mean. Meanwhile thornton crashed the party with real-world scars: their doc2vec + cosine project produced “totally nonsensical” redirects, sparking a fight over whether this is magic, math cosplay, or just misconfiguration. The memes? Folks joked it’s Clippy with a calculator and “Google for vibes.” The mood: split between cool demo energy and show-me-the-results skepticism, exactly the kind of drama that keeps the comments spicy.
Key Points
- The article tests word2vec-style vector arithmetic on document embeddings using EmbeddingGemma.
- Two setups are defined: (1) shift a Supabase testing doc toward Angular by subtracting 'supabase' and adding 'angular'; (2) shift topic within Supabase by subtracting 'testing' and adding 'vectors'.
- Experiments are run with default and customized task types, acknowledging their impact on embedding behavior.
- Verification uses cosine similarity against embeddings of selected short docs from Angular, CockroachDB, Skylib, Playwright, and Supabase, constrained by a 2048-token input limit.
- With customized task types, the first experiment’s resultant vector aligns best with Angular testing docs, including 'Testing' (Angular) at a similarity of 0.75.
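
For readers who want to see the mechanics, here is a minimal sketch of the first setup: subtract one term vector from a document vector, add another, then rank candidate docs by cosine similarity. The 3-dimensional vectors are made up for illustration and are not EmbeddingGemma output, and the helper names (`cosine`, `shift`, `rank`) are hypothetical. A real run would replace the toy vectors with embeddings from the model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def shift(doc, minus, plus):
    """word2vec-style arithmetic: doc - minus + plus, element by element."""
    return [d - m + p for d, m, p in zip(doc, minus, plus)]


def rank(query, candidates):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, cosine(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors (assumption): axes are roughly
# [supabase-ness, angular-ness, testing-ness].
supabase_testing_doc = [0.9, 0.0, 0.8]
supabase_term = [1.0, 0.0, 0.0]
angular_term = [0.0, 1.0, 0.0]

candidates = {
    "Testing (Angular)": [0.0, 0.9, 0.8],
    "Testing (Supabase)": [0.9, 0.0, 0.8],
    "Vectors (Supabase)": [0.9, 0.0, 0.0],
}

# Setup (1): subtract 'supabase', add 'angular', then rank the candidates.
shifted = shift(supabase_testing_doc, supabase_term, angular_term)
ranking = rank(shifted, candidates)

for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

With these hand-picked vectors the Angular testing doc ranks first, which is the pattern the article reports. Real embeddings have hundreds of dimensions with no clean axes, so results like thornton's nonsensical matches are entirely possible.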