October 30, 2025
Let It Go… Of Your Database Server
Frozen DuckLakes for Multi-User, Serverless Data Access
Community cheers a “Frozen” data lake that skips servers, is easy to share, and supports time travel
TLDR: DuckDB’s Frozen DuckLake lets teams query shared cloud data without running a database server. Commenters cheered the “git for data” feel and time‑travel trick, while skeptics asked about updates—fans answered you can “virtually” update by adding new files, keeping things simple and cheap.
DuckDB just dropped a “Frozen DuckLake” — a read‑only snapshot of cloud files you can query like a database, minus the actual database. The thread went wild over the “no moving parts” vibe and zero servers to babysit. User gopalv likened it to Git’s simple HTTP hosting, praising how DuckDB acts as both a slick client and a library you can point at storage. Old‑guard veterans nodded at the comparison to Apache Iceberg, saying this feels like the original dream: manifests, Parquet files, and freedom from the dreaded Hive metastore. Cue the memes: yes, people hummed “Do You Want to Build a Snowman?” while building data lakes.
But drama? Of course. Some side‑eyed the “frozen” part — what if you need edits? Enter ryanschneider’s mic drop: you can “virtually” update by adding new files, keep the old ones, and even do time travel (“go back one week”) without touching the originals. That soothed the skeptics and energized the “Git‑for‑data” crowd. Others simply loved the minimalism: store files in the cloud, share a DuckDB file, and everyone can query it over HTTP or S3. mjhay deadpanned that the Data Engineering world keeps inventing “lake” metaphors, but admitted this one nails simplicity. There’s even a tiny space‑missions demo on GitHub to try. TL;DR: less ops, more vibes — and a Frozen theme song on repeat.
Key Points
- Frozen DuckLakes are read-only DuckLake snapshots published to cloud storage, eliminating the need for a catalog database server.
- They provide near-zero cost overhead beyond storing Parquet files and support public access via cloud/HTTP.
- Data remains in Parquet files across potentially multiple cloud environments, referenced by a DuckDB-formatted DuckLake file.
- Updates are performed by creating new Frozen DuckLake snapshots; older versions are accessible via retained revisions or time-travel queries.
- The workflow uses ducklake_add_data_files() to ingest Parquet metadata on a single machine, then publishes the DuckLake file to S3/HTTPS for multi-user read-only access.
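The snapshot-and-time-travel idea in the points above can be sketched as a toy model: a catalog where data files are immutable and every "update" just publishes a new numbered snapshot listing the files visible at that version. This is a minimal illustration in plain Python — the `FrozenCatalog` class and its method names are hypothetical, not DuckLake's actual on-disk format or API.

```python
# Toy sketch of the Frozen DuckLake idea: immutable data files plus an
# append-only list of numbered snapshots. Illustrative only -- this is
# NOT DuckLake's real format or API.
from dataclasses import dataclass, field


@dataclass
class FrozenCatalog:
    """Each snapshot is simply the list of files visible at that version."""
    snapshots: list = field(default_factory=list)

    def publish(self, new_files):
        """'Virtual update': keep old files, add new ones, emit a new snapshot."""
        current = self.snapshots[-1] if self.snapshots else []
        self.snapshots.append(current + list(new_files))
        return len(self.snapshots) - 1  # id of the new snapshot

    def files_at(self, snapshot_id=None):
        """Time travel: read the file list as of any retained snapshot."""
        if snapshot_id is None:
            snapshot_id = len(self.snapshots) - 1  # latest by default
        return self.snapshots[snapshot_id]


# Publish v0, then "update" by adding a file; v0 is never touched.
cat = FrozenCatalog()
v0 = cat.publish(["missions_2024.parquet"])
v1 = cat.publish(["missions_2025.parquet"])
print(cat.files_at(v1))  # ['missions_2024.parquet', 'missions_2025.parquet']
print(cat.files_at(v0))  # ['missions_2024.parquet']  <- "go back one week"
```

In the real system, readers would attach the published DuckLake file over S3/HTTPS read-only, so many users can query the same snapshot concurrently while a single writer occasionally publishes a new one.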