May 30, 2026
Compression? More like commentpression
Kore: Binary File Format Optimized for Modern Data Systems (Open Source)
New data file tries to dethrone Parquet, but commenters want receipts
TLDR: Kore is a new open-source file format claiming faster reads and smaller storage for massive datasets, with Spark support and big launch-day bravado. Commenters weren’t ready to clap just yet: they want clear comparisons, real-world proof, and an explanation of why anyone should switch from Parquet.
A shiny new open-source project called Kore just strutted onto the big-data stage promising eye-popping speed, smaller files, and easy use with Spark, a popular tool for processing huge piles of information. The pitch is pure swagger: way faster than old-school text formats, tighter compression, and battle-tested before release. One booster even arrived like a hype man at a product launch, bragging about three years of production testing, support across a bunch of programming languages, and even a VS Code viewer for .kore files. In other words: this isn’t just a file, it’s a whole vibe.
But the comments? Oh, they immediately turned into a trust issues support group. The biggest reaction was basically: cool story, now show the benchmarks. Several readers zeroed in on the awkward compression claim, asking why Kore boasts a lower percentage than Parquet when, to normal humans, a bigger number sounds better. Others wanted to know the real tradeoff: what do you gain, what do you lose, and can this thing handle truly giant datasets without falling over? One commenter called the whole thing a little “vibe-codey,” which is internet-speak for “this looks slick, but would I trust it with my paycheck?” Another piled on by asking for comparisons to rivals like Vortex and wishing for a DuckDB extension.
So yes, Kore launched with bold promises, but the real headline from the crowd is: prove it, compare it, and explain it like we’re not already sold.
Key Points
- The article presents Kore v0.1.0 as an open-source binary file format optimized for analytical workloads.
- It claims a 38% compression ratio versus 63% for Parquet and a 131x query speedup through column pruning and predicate pushdown.
- The article says Kore achieved zero data loss verification across more than 400,000 tested cells.
- A Rust API is provided with sample functions for writing, reading, reading individual columns, and retrieving file information.
- The release includes new PySpark and Spark SQL integration, along with project publishing and testing guidance.
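
For anyone still squinting at the 38% versus 63% claim: the article does not say which convention those numbers use, so the sketch below is only an illustration, not Kore's own math. It assumes the percentages mean "compressed size as a share of the original size" (where lower is better) and contrasts that with the "space saved" convention (where higher is better). The 1,000,000-byte original and both function names are hypothetical.

```python
def size_percent(original_bytes: int, compressed_bytes: int) -> float:
    """Compressed size as a percentage of the original (lower is better)."""
    return 100.0 * compressed_bytes / original_bytes


def space_saved_percent(original_bytes: int, compressed_bytes: int) -> float:
    """Share of the original size that compression removed (higher is better)."""
    return 100.0 * (original_bytes - compressed_bytes) / original_bytes


# Hypothetical input size, chosen only to make the arithmetic easy to follow.
ORIGINAL_BYTES = 1_000_000

# ASSUMPTION: the article's figures are compressed-size percentages.
claimed = {"Kore": 38, "Parquet": 63}

for name, pct in claimed.items():
    compressed = ORIGINAL_BYTES * pct // 100
    print(
        f"{name}: {compressed} bytes on disk, "
        f"size = {size_percent(ORIGINAL_BYTES, compressed):.0f}% of original, "
        f"saved = {space_saved_percent(ORIGINAL_BYTES, compressed):.0f}%"
    )
```

Under that assumption, the smaller number is the better one, which would explain the bragging. If the figures were instead meant as space saved, the comparison would flip, and that ambiguity is exactly what commenters were asking the project to clear up.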