OpenData Vector: MIT-Licensed Vector Search on Object Storage

This new open search tool promises big savings, and commenters instantly asked if it can beat the cool kids

TLDR: OpenData Vector is a new open-source search engine that says it can search huge amounts of data cheaply by leaning on cloud storage instead of expensive always-on servers. Commenters were intrigued, but the big debate was whether it’s genuinely competitive with Turbopuffer or just borrowing the vibe.

A new project called OpenData Vector just entered the chat with a very specific promise: why pay a pricey company to run your search system when you could do it yourself, cheaper, using basic cloud storage? The pitch is almost cheeky in its simplicity. It says it can handle 100 million pieces of data for about $350 a month, stay durable and highly available, and run anywhere that can connect to object storage like S3. In plain English: the software tries to turn cheap cloud storage into the main brain of the system, so the servers doing the work keep no data of their own and can stay disposable and easy to replace.
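To put the headline number in perspective, here is a quick back-of-the-envelope calculation in Python. The $350 and 100 million figures are the ones quoted above; the function name and the per-million breakdown are just illustrative, and nothing here says what that monthly bill actually consists of.

```python
# Figures quoted in the announcement; everything else is derived arithmetic.
MONTHLY_COST_USD = 350
ITEM_COUNT = 100_000_000


def cost_per_million(monthly_cost_usd: float, item_count: int) -> float:
    """Return the monthly cost in USD per one million stored items."""
    return monthly_cost_usd / (item_count / 1_000_000)


per_million = cost_per_million(MONTHLY_COST_USD, ITEM_COUNT)
print(f"${per_million:.2f} per million items per month")
```

Taken at face value, the claim works out to $3.50 per million items per month, which is the kind of number commenters will want to see backed by benchmarks.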

But the real fun is in the reaction. The first wave of commenters didn’t just clap politely — they immediately compared it to Turbopuffer, one of the buzziest names in this corner of tech. That set the tone fast: admiration, curiosity, and a tiny whiff of "okay, but can it actually keep up?" One commenter called it "very interesting" and then basically asked the question hanging over the whole launch: how close is this thing to Turbopuffer on speed, and where are the painful scaling cliffs?

That’s the community drama in a nutshell. People love the MIT license and the anti-vendor energy — nobody enjoys feeling trapped into expensive hosted tools — but they also smell a showdown coming. The vibe is part optimism, part benchmark hunger, with a side of startup soap opera: open-source underdog drops a bold cost claim, and the crowd instantly demands receipts.

Key Points

  • OpenData Vector is announced as an MIT-licensed vector search engine built on SlateDB.
  • The article says the system is stateless, durable, highly available, and runs anywhere with object storage access.
  • OpenData Vector is positioned as a middle ground between self-hosting pgvector and using managed vector database vendors.
  • The article states that object storage offers durability, lower storage cost, zero cross-AZ networking cost, and strong consistency.
  • The article describes an architectural evolution from tiered storage to disaggregated systems and then to stateless systems, placing OpenData Vector in the stateless generation.
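The "stateless" idea in the points above can be sketched in a few lines of Python. This is a toy illustration, not OpenData Vector's API or design: `ObjectStore` is a hypothetical in-memory stand-in for something like S3, `SearchNode` is a made-up name, and the search is a brute-force scan where a real engine would use an index. The only thing it aims to show is that a node holding no data of its own can be replaced by a fresh one without losing anything.

```python
import json
import math


class ObjectStore:
    """Hypothetical in-memory stand-in for an object store such as S3."""

    def __init__(self):
        self._objects = {}

    def put(self, key, value):
        self._objects[key] = value

    def get(self, key):
        return self._objects[key]

    def list(self, prefix):
        return sorted(k for k in self._objects if k.startswith(prefix))


class SearchNode:
    """Hypothetical stateless node: keeps a store handle, no data of its own."""

    def __init__(self, store):
        self.store = store

    def insert(self, vec_id, vector):
        # Every write goes straight to the shared store.
        self.store.put(f"vectors/{vec_id}", json.dumps(vector).encode())

    def nearest(self, query):
        # Brute-force scan by Euclidean distance (assumption: no index).
        best_id, best_dist = None, math.inf
        for key in self.store.list("vectors/"):
            vector = json.loads(self.store.get(key))
            dist = math.dist(query, vector)
            if dist < best_dist:
                best_id, best_dist = key[len("vectors/"):], dist
        return best_id


store = ObjectStore()
node_a = SearchNode(store)
node_a.insert("a", [0.0, 0.0])
node_a.insert("b", [1.0, 1.0])

# Simulate replacing the node: a brand-new node sees the same data.
node_b = SearchNode(store)
print(node_b.nearest([0.9, 0.9]))
```

Because `node_b` is built from nothing but a handle to the store, throwing away `node_a` costs no data, which is the property the "disposable servers" pitch relies on.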

Hottest takes

"Very interesting, thanks for sharing" — oliverio
"a lot of nods to Turbopuffer's architecture" — oliverio
"how ~close is OpenData Vector to Turbopuffer in terms of performance today" — oliverio