March 1, 2026

Compression wars, serve the cake

Have your cake and decompress it too

Vortex drops “smaller, faster” claim; commenters yell speed over size and “seen it before”

TL;DR: Vortex claims smaller files and much faster reads by stacking lightweight, random-access compressors instead of heavy zip tools. Comments split: veterans say speed trumps size, and others argue the idea echoes ORC, BtrBlocks, and OpenZL—so it’s clever, but not entirely new.

Vortex just bragged it can make data files 38% smaller and read them 10–25x faster than the old guard, all by chaining a bunch of lightweight tricks instead of slapping on heavy-duty compressors like ZSTD. Think: specialized “mini-compressors” for numbers and strings, stacked smartly so you can still jump to any value instantly. The crowd loved the bold claim but immediately split into camps. One veteran waved a caution flag: “scan rate is more important than size”, arguing speed wins every time even if your files aren’t the tiniest. Cue jokes about “decompressing a whole cake just for one bite” and cheers for random-access reads.
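To make the "jump to any value instantly" point concrete, here's a minimal sketch of one of those mini-compressors: dictionary encoding for a string column. This is purely illustrative (it is not Vortex's actual API or layout); the point is that fetching one row is a single array lookup, with no bulk decompression of the page it lives in.

```python
# Hypothetical sketch, not Vortex's real code: dictionary-encode a string
# column so any single row can be read without touching its neighbors.
def dict_encode(values):
    # Map each distinct string to a small integer code.
    dictionary = sorted(set(values))
    index = {v: i for i, v in enumerate(dictionary)}
    codes = [index[v] for v in values]
    return dictionary, codes

def get_row(dictionary, codes, row):
    # Random access: one lookup, no "decompressing the whole cake".
    return dictionary[codes[row]]

col = ["GOLD", "SILVER", "GOLD", "BRONZE", "GOLD", "SILVER"]
dictionary, codes = dict_encode(col)
print(get_row(dictionary, codes, 3))  # -> BRONZE
```

With a general-purpose compressor like ZSTD wrapped around the page, that same single-row read would first require decompressing the entire page.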

The real drama? A few folks shrugged, saying this is basically a remix of ideas from the academic BtrBlocks paper and older formats like ORC. Another chimed in that it looks a lot like OpenZL, which auto-builds a custom compressor for your data—translation: cool, but not brand new. Fans teased Parquet for relying on a big final squeeze that kills quick lookups, while skeptics tossed eye-rolls at “yet another format.” The hot take meter spiked: speed-first pragmatists vs. size-obsessed minimalists, with memes about cake, frosting, and who’s actually reinventing the wheel. Delicious data drama!

Key Points

  • Vortex reports TPC-H SF10 files that are 38% smaller and decompress 10–25x faster than Parquet with ZSTD, without using general-purpose compression.
  • The approach is to try multiple lightweight encodings and compose them per column, inspired by the BtrBlocks framework.
  • Parquet employs per-page lightweight encodings followed by a general-purpose compressor (e.g., ZSTD, LZ4, Snappy) per column chunk.
  • General-purpose compression forces full-page decompression to read even one value, which destroys random access and hampers sparse lookups and late materialization.
  • Parquet’s hard-coded encoding cascade and limited repertoire hinder extensibility; discussions are underway to add encodings like ALP, while BtrBlocks advocates recursive chaining of lightweight encodings.
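The "compose them per column" idea can be sketched as a two-stage cascade: frame-of-reference (subtract a base) followed by fixed-width bit-packing. This is a toy in the spirit of BtrBlocks-style chaining, not Vortex's or BtrBlocks' actual implementation; note that decoding one value is pure arithmetic, so random access survives both stages.

```python
# Hypothetical cascade of two lightweight encodings: frame-of-reference
# (FOR) then bit-packing. Neither stage needs block-level decompression.
def for_bitpack(values):
    base = min(values)                      # stage 1: frame of reference
    deltas = [v - base for v in values]
    width = max(deltas).bit_length() or 1   # bits per packed delta
    packed = 0
    for i, d in enumerate(deltas):          # stage 2: fixed-width packing
        packed |= d << (i * width)
    return base, width, packed

def get(base, width, packed, i):
    # Random access to element i: shift, mask, add the base back.
    mask = (1 << width) - 1
    return base + ((packed >> (i * width)) & mask)

col = [1000, 1007, 1003, 1012, 1001]
base, width, packed = for_bitpack(col)
assert [get(base, width, packed, i) for i in range(5)] == col
```

Because every packed value has the same bit width, the decoder can compute where element i lives; a variable-length scheme like ZSTD has no such shortcut, which is exactly the trade-off the Key Points describe.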

Hottest takes

"scan rate is more important than size" — gopalv
"It was easier to beat Parquet's defaults" — gopalv
"Looks similar to OpenZL" — pella
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.