Our data pipeline produces .duckdb files that our app downloads (it watches the asset in S3 and pulls when the ETag changes). Makes it easy to get BQ/ClickHouse-like performance without running or paying for that infrastructure. Not perfect for all cases, but it handles a lot more than you would expect.
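The ETag-watch pattern can be sketched roughly like this (a minimal sketch assuming a boto3-style client; the function and variable names here are illustrative, not from the actual pipeline):

```python
def sync_if_changed(s3, bucket, key, local_path, last_etag):
    # A HEAD request is cheap: fetch only the object's metadata,
    # compare its ETag to the one seen last time, and download the
    # file only when it differs.
    etag = s3.head_object(Bucket=bucket, Key=key)["ETag"]
    if etag != last_etag:
        s3.download_file(bucket, key, local_path)
    return etag

# Hypothetical usage with boto3, polling once a minute:
#   import time, boto3
#   s3 = boto3.client("s3")
#   etag = None
#   while True:
#       etag = sync_if_changed(s3, "my-bucket", "data.duckdb",
#                              "/tmp/data.duckdb", etag)
#       time.sleep(60)
```

Since the app only ever reads the downloaded copy, swapping in the new file atomically (download to a temp path, then rename) avoids serving a half-written database.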

This is a great use case for DuckDB, but I'm not sure how it maps to the use of this protocol?

Roughly how big are the datasets?

~30GB .duckdb file