OK, this is really neat:
- S3 is really cheap static storage for files.
- DuckDB is a database that can use S3 for its storage.
- WASM lets you run binary (non-JS) code in your browser.
- DuckDB-Wasm allows you to run a database in your browser.
Put all of that together, and you get a website that queries S3 with no backend at all. Amazing.
S3 might be relatively cheap for storing files, but with bandwidth you could easily be paying $230/mo. If you make it public-facing and want to use their cloud reporting, metrics, etc. to prevent people from running up your bandwidth, your "really cheap" static hosting could easily cost you more than $500/mo.
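For a rough sense of where a number like $230/mo can come from, here is a back-of-the-envelope sketch. The $0.09/GB rate is an assumption (roughly S3's first-tier internet egress price; check current pricing for your region), and `egress_cost` and `gb_for_budget` are hypothetical helpers, not part of any AWS tooling.

```python
# Back-of-the-envelope S3 egress estimate.
# ASSUMPTION: a flat $0.09 per GB transferred out to the internet.
# Real pricing is tiered and varies by region, so treat this as a sketch.
ASSUMED_EGRESS_USD_PER_GB = 0.09


def egress_cost(gb_per_month: float,
                usd_per_gb: float = ASSUMED_EGRESS_USD_PER_GB) -> float:
    """Estimated monthly egress bill in USD for a given transfer volume."""
    if gb_per_month < 0:
        raise ValueError("gb_per_month must be non-negative")
    return gb_per_month * usd_per_gb


def gb_for_budget(budget_usd: float,
                  usd_per_gb: float = ASSUMED_EGRESS_USD_PER_GB) -> float:
    """How many GB of egress a given monthly budget buys at the assumed rate."""
    return budget_usd / usd_per_gb


if __name__ == "__main__":
    # At the assumed rate, $230 corresponds to roughly 2.5 TB of egress.
    print(f"${230} buys about {gb_for_budget(230):.0f} GB/month of egress")
    print(f"5 TB/month would cost about ${egress_cost(5000):.2f}")
```

Under that assumed rate, $230/mo is only about 2.5 TB of traffic, which a popular public dataset could plausibly hit.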
R2 is S3 compatible with no egress fees.
Cloudflare actually has built-in Iceberg support for R2 buckets. It's quite nice.
Combine that with their Pipelines and it's a simple HTTP request to ingest; then just point DuckDB at the Iceberg-enabled R2 bucket to analyze.
> R2 is S3 compatible with no egress fees.
There are no egress data transfer fees, but you still pay for the GET request operations. Lots of little range requests can add up quickly.
Can't believe that this is what the industry has come down to. Kind of like clipping coupons to get the best deal from different pricing overlords.
It is times like this that make self-hosting a lot more attractive.
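To put a rough number on "add up", here is a small sketch. The $0.36 per million read operations is an assumption (approximately R2's listed Class B rate; check current pricing), the per-query request count is purely illustrative, and `request_cost` is a hypothetical helper. It also ignores any free tier.

```python
# Rough per-request cost estimate for range-request-heavy workloads.
# ASSUMPTION: $0.36 per million read (GET) operations, no free tier applied.
ASSUMED_USD_PER_MILLION_READS = 0.36


def request_cost(queries_per_month: int,
                 range_requests_per_query: int,
                 usd_per_million: float = ASSUMED_USD_PER_MILLION_READS) -> float:
    """Estimated monthly bill in USD for the GET operations alone."""
    total_requests = queries_per_month * range_requests_per_query
    return total_requests / 1_000_000 * usd_per_million


if __name__ == "__main__":
    # Illustrative only: 1M queries/month, each issuing 50 small range requests.
    print(f"~${request_cost(1_000_000, 50):.2f}/month in read operations")
```

The point of the sketch is that the bill scales with the number of requests per query, not with bytes moved, so a query pattern that issues many tiny reads costs more than one that issues a few large ones.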
Luckily it's just static files. You can use whatever host you want.
For a demo of this (although I'm not sure DuckDB-Wasm works with Iceberg): https://andrewpwheeler.com/2025/06/29/using-duckdb-wasm-clou...
Was about to jump in to say the same thing. R2 is a much cheaper alternative to S3 that just works. I have used it with DuckDB and it works smoothly.
I think this approach makes sense for services with a small number of users relative to the data they are searching. That just isn't a good fit for a lot of hosted services. Think how much TBs of data would cost on Algolia or similar services.
You have to store the data somehow anyway, and you have to retrieve some of it to service a query. If egress costs too much you could always change later to put the browser code on a server. Also it would presumably be possible to quantify the trade-off between processing the data client side and on the server.
Stick it behind Cloudflare and it should be effectively free.
Until it isn't.
S3 is doing quite a lot of sophisticated lifting to qualify as no backend at all.
But yeah - this is pretty neat. Easily seems like the future of static datasets should wind up in something like this. Just data, with some well chosen indices.
I believe all S3 has to do here is respond to HTTP Range queries, which are supported by almost every static server out there - Apache, Nginx etc should all support the same trick.
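As a minimal sketch of what "respond to HTTP Range queries" involves on the server side, the snippet below resolves a single `Range: bytes=...` header against an in-memory file. `parse_byte_range` and `serve_range` are hypothetical names for illustration, not how S3, Apache, or Nginx implement it. It handles only single ranges, and for simplicity it answers 416 for anything it cannot parse, whereas a real server would typically ignore a malformed header and send the whole file.

```python
from typing import Dict, Optional, Tuple


def parse_byte_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Resolve 'bytes=START-END', 'bytes=START-' or 'bytes=-SUFFIX'.

    Returns inclusive (start, end) offsets within a file of `size` bytes,
    or None if the range is unsupported or cannot be satisfied.
    """
    unit, _, spec = header.partition("=")
    if unit.strip().lower() != "bytes" or "," in spec:
        return None  # this sketch only supports a single byte range
    first, sep, last = spec.strip().partition("-")
    if not sep:
        return None
    try:
        if first == "":
            # Suffix form: the last N bytes of the file.
            n = int(last)
            if n <= 0 or size == 0:
                return None
            return (max(size - n, 0), size - 1)
        start = int(first)
        end = int(last) if last else size - 1
    except ValueError:
        return None
    if start >= size or end < start:
        return None
    return (start, min(end, size - 1))  # clamp END to the last byte


def serve_range(data: bytes, header: str) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) the way a static file server might."""
    rng = parse_byte_range(header, len(data))
    if rng is None:
        return 416, {"Content-Range": f"bytes */{len(data)}"}, b""
    start, end = rng
    headers = {"Content-Range": f"bytes {start}-{end}/{len(data)}"}
    return 206, headers, data[start:end + 1]


if __name__ == "__main__":
    blob = bytes(range(100))
    # A client reading only the tail of a file, e.g. a footer with metadata.
    print(serve_range(blob, "bytes=-10")[:2])
```

The suffix form (`bytes=-N`) is the one that lets a client read just the end of a file without knowing its length in advance, which is why plain static servers are enough for this trick.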
100%. I’m with y’all - this is what I would also call a “no-backend” solution and I’m all in on this type of approach for static data sets - this is the future, and could be served with a very simple web server.
I’m just bemused that we all refer to one of the larger, more sophisticated storage systems on the planet, composed of dozens of subsystems and thousands of servers, as “no backend at all.” Kind of a “draw the rest of the owl” situation.
Still qualifies imo. Everything is static and on a CDN.
Lack of server/dynamic code qualifies as no backend.
Can you replace S3 with a directory and nginx and save a lot of money?
Yes. Especially if you use Storage Combinators.
They let you easily abstract over storage.
https://2019.splashcon.org/details/splash-2019-Onward-papers...
Yes, IIRC it's not S3-specific, it's just URLs.
Or use R2 instead. It’s even easier.