This is awesome! With all of the projects/teams working on improving SQLite, it feels like it's just a matter of time before it becomes a better default than Postgres for serious projects.

I do wonder, though: for projects that ultimately enforce a single-writer SQLite setup, it still feels to me like it would always be better to keep the SQLite db local (and then rsync/stream backups to whatever S3 storage one prefers).
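For the "keep it local, stream backups out" half of that, a minimal sketch using Python's stdlib `sqlite3` online backup API (the function name and paths here are my own; the resulting snapshot file would then be rsynced or uploaded to object storage by whatever mechanism you prefer):

```python
import sqlite3

def snapshot(db_path: str, out_path: str) -> None:
    """Take an online, consistent copy of a live SQLite db.

    Safe to run against a database under active use (including WAL mode);
    the stdlib backup API copies pages without blocking writers.
    """
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(out_path)
    try:
        src.backup(dst)  # stdlib online backup; readers/writers stay live
    finally:
        src.close()
        dst.close()
```

Litestream does something much finer-grained (continuous WAL shipping rather than periodic snapshots), but even a cron job around a snapshot like this gets you a durable off-box copy.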

The nut I've yet to see anyone crack on such a setup is achieving zero-downtime deploys. For instance, adding a persistent disk to VMs on Render prevents zero-downtime deploys (see https://render.com/docs/disks#disk-limitations-and-considera...), which is a really unfortunate side effect. I understand the reason for this is that a VM instance is attached to the volume and needs to be swapped with the new version of said instance...

There are so many applications where merely scaling up a single VM as your product grows simplifies devops / product maintenance so much that it's a very compelling choice vs managing a cluster/separate db server. But getting forced downtime between releases to achieve that isn't acceptable in a lot of cases.

Not sure if it's truly a cheaply solvable problem. One potential option is to use a tool like turbolite as a parallel data store and, only during deployments, use it to keep the application running for the 10 to 60 seconds of a release swap. During this window, writes to the db are slower than usual but stay entirely online. Then, when the new release is live, it can sync the data written to S3 in the interim back into the local db. That way, during regular operation you get the performance of local IO, and you fall back to S3-backed SQLite during upgrades for persistent uptime.

Sounds like a fraught thing to build. But man it really is hard/impossible to beat the speed of local reads!

Yeah, I mostly agree, and another comment brought up a similar idea. If you can keep SQLite local, that's usually the better answer. Local reads are insanely hard to beat, and Litestream is the canonical "keep SQLite local and ship durability elsewhere" model.

That’s actually a big part of why I started another project, haqlite: https://github.com/russellromney/haqlite which is much more on the “keep SQLite local” side: leader election via S3 conditional PUTs, WAL replication to S3, follower catchup, write forwarding to the leader, and graceful leader handoff. So it can get pretty close to zero-downtime deploys with 2+ nodes.
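To make the leader-election piece concrete (this is a toy sketch of the conditional-PUT idea, not haqlite's actual code): S3 now supports conditional writes, where a `PutObject` with `If-None-Match: *` fails with a 412 if the key already exists, so exactly one candidate's PUT for a given term can succeed. The `FakeBucket` below is an in-memory stand-in for that semantic; the key scheme and function names are my own.

```python
class FakeBucket:
    """In-memory stand-in for S3 conditional writes (If-None-Match: *)."""

    def __init__(self):
        self.objects = {}

    def put_if_absent(self, key: str, body: bytes) -> bool:
        # Against real S3 this would be roughly:
        #   s3.put_object(Bucket=..., Key=key, Body=body, IfNoneMatch="*")
        # which raises a 412 PreconditionFailed if the key already exists.
        if key in self.objects:
            return False
        self.objects[key] = body
        return True


def try_become_leader(bucket, node_id: str, term: int) -> bool:
    """One lock object per term; only one node's conditional PUT wins."""
    return bucket.put_if_absent(f"lock/term-{term}", node_id.encode())
```

In practice the lock object would also carry a lease/expiry, and followers would watch for the lease to lapse before contesting a new term; this sketch only shows the mutual-exclusion core.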

But not true zero downtime on a single server — if there isn’t already another warm follower alive, there is nowhere for leadership to go. So in my head:

- Litestream: local SQLite is primary; object storage is for durability / replication / failover

- haqlite: kind of like Litestream, but with a focus on embedding it into any application.

- turbolite: object storage is the actual backing store, and the question is how plausible cold reads can get. No HA story.

Keeping the DB local cuts the worst latency spikes, but you trade away the whole pitch of ephemeral compute and just-in-time scaling: you end up glued to old-school infra patterns in disguise, plus node affinity and warm-cache babysitting that look a lot like the stuff SQLite was supposed to let you dodge. Add a few readers on volatile nodes and it gets ugly fast.

The tradeoff is real, but it's workload-specific. Turbolite is for read-mostly analytics on truly ephemeral compute -- Lambda or spot instances hitting a dataset that refreshes once a day. There's no persistent node to pin, so local-with-replication doesn't apply. You're comparing apples to a different fruit entirely.

What makes 250ms achievable isn't just range requests. It's the B-tree-aware page grouping. Standard S3 VFS pays 15-20ms per random page fetch. Group the pages a query actually touches into contiguous segments and you turn dozens of round trips into a few sequential reads. For cold JOIN queries on small-to-medium tables, that's the difference between 4 seconds and usable.
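The grouping idea can be illustrated with a small helper (my own illustration, not turbolite's code): given the set of pages a query plan touches, coalesce them into byte ranges, merging pages that are adjacent or separated by a small gap, so dozens of per-page GETs collapse into a few S3 range requests. Page numbers are 1-based, SQLite-style; `max_gap_pages` trades a little wasted bandwidth for fewer round trips.

```python
def coalesce_pages(pages, page_size=4096, max_gap_pages=2):
    """Turn a set of 1-based page numbers into (offset, length) byte ranges.

    Pages that are adjacent, or within max_gap_pages of the previous range,
    are absorbed into it -- fetching a few unneeded bytes is far cheaper
    than paying another 15-20ms round trip.
    """
    ranges = []
    for p in sorted(set(pages)):
        start = (p - 1) * page_size
        if ranges and start - (ranges[-1][0] + ranges[-1][1]) <= max_gap_pages * page_size:
            # Extend the previous range, swallowing any small gap.
            ranges[-1] = (ranges[-1][0], start + page_size - ranges[-1][0])
        else:
            ranges.append((start, page_size))
    return ranges
```

E.g. pages 1-3 become a single 12 KiB range request instead of three GETs, while a page far from the rest still gets its own fetch. The real win comes from laying out B-tree pages so that the pages a query touches actually are near each other, which is what makes this coalescing pay off.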

For OLTP or anything with meaningful write throughput, local WAL replication (LiteFS, haqlite, Litestream) is clearly right. These are two different problems that happen to both use SQLite.

No. Turbolite is explicitly read-write, and it’s genuinely hard to balance availability against local query speed. Those tradeoffs are real. I don’t think your response is describing this project very accurately.

Couldn't have said it better myself.

This is LLM-speak. Am I the only one who notices?

the comment history speaks for itself

[deleted]