Hacker News

> Isn't concurrency also limited by your machines disk speed for writes

Yes, in theory: given a large enough database, and a disk that can only do one operation at a time, and a large enough operation that touches enough of the database. In practice, in a SQLite single tenant scenario? No, not at all.

> what difference does it make if you write sequentially vs concurrently. Why does concurrency even matter for databases?

As soon as your codebase involves reacting to events independently of a user taking action it becomes a practical concern. Generally, this is a broad question and has 1,000,000 answers.

EDIT: Originally I had "I think you understand generally, no?" appended but realized that's not helpful at all, if you did, you wouldn't be asking.

Something that may help is imagining what'd happen if a DB wasn't thread safe / didn't allow multiple writers. Ex. in SQLite's case, it allows multiple write operations to take place but there's a one-at-a-time queue. If we didn't have databases that were able to execute multiple writes simultaneously, you'd need a separate database for each concurrent writer you expect, and you'd effectively have a global lock. Orderly scaling would be ~impossible unless you did something crazy like have a single server per user

sevenzero a day ago [ - ]

I guess I need to dive deeper into this as I do not understand the implications you gave me, but I appreciate the attempt. Generally I understand why concurrency is good in many cases, I just dont get why its important for database stuff too.

Edit: thanks for clarifying in the edit, makes a lot more sense.

strbean a day ago [ - ]

Imagine if every tweet had to go through a one-at-a-time queue before being persisted. There's about 6000 tweets per second, so you would have to be able to save them at <0.17ms per tweet or else you would become backlogged. If you are getting backlogged, you have to buffer those incoming tweets somewhere until they can be writted, and eventually that buffer gets full and you start losing tweets.

goobatrooba a day ago [ - ]

Maybe that too is a native question, but there's a large scale between single user and 6000 tweets per second - most of our apps will never reach anything approaching even one save a second. So where to draw the line? I do far have gone the sqlite route for my hobby apps as it's so easy to handle and doesn't require setting up two docker containers for a single app. Am I drawing myself in a corner in case my apps ever do become relevant?

refulgentis 20 hours ago [ - ]

Excellent question, and I spent so many years asking myself it, this over and over. You asking it made me realize I just...don't anymore. So allow me to blather a bit / free associate because I won't be sure why myself until I've written it out.

TL;DR: whatever works for you is the right decision. (which isn't helpful, I heard this so many times and as the recipient, I thought "That's nice. Now how do I choose what works for me?")

I finally had to use Postgres a couple years ago after a career of only SQLite - startup founder & iOS app developer using SQLite, turned Googler on Android, turned doing-my-own-thing.

In retrospect, I have made only one bad decision:

I went way out of my way to make SQLite work at my 2009-iOS-startup. It was a restaurant point of sale system, and to allow a networked system, one of the iOS devices would act as a server. This was a really cool trick, even an advantage in marketing that was appreciated by users. It meant the restaurant could continue to operate if the internet went down. But it eventually became clear owners loved having internet-based access too, ex. to do reporting/financial analysis over the data. And I kept contorting, instead of moving past my fear of getting into things I didn’t know, I instead did some like rudimentary thing over port forwarding. The bad decision here was riding one horse for so long and letting it affect the product, having a real server database would have allowed for a lot more features, think, first party gift cards, and a 100 others.

After leaving Google I needed server-side storage and fought and fought to avoid it. Then it turned out Postgres is easy and, just like SQLite, 99.999% of the time I don’t even know I’m using it.

In retrospect, there’s ~0 switching cost to these, particularly in age of LLMs. If you do need something more one day, it’ll be easy to do, and if you have to do it in a rush because you’re successful, you’re in Good Problem territory.

Hope that helped, after writing it out, dunno how convincing it is. Feel free to follow up, I appreciate the curiosity/framing because I had the same thought for so long.

password4321 10 hours ago [ - ]

Thank you for sharing a detailed anecdote from production; there's not many of those around here.

AlotOfReading 21 hours ago [ - ]

If we imagine 1 tweet = 1 transaction, that's only 6k tps. 6k tps is completely achievable, dare I say even pedestrian for an optimized database. And most systems are operating far below the scale of Twitter/X.

Scaevolus 21 hours ago [ - ]

Sqlite can quite easily do 5000+ insert+commits per second on typical NVMe drives.

Speed is rarely the constraint that makes it unsuitable for an application.

klabb3 18 hours ago [ - ]

Round trip time is actually much faster than Postgres, since there’s no need to touch the network. You can get massive single threaded throughput. In order to achieve comparable throughput in Postgres you need a large amount of concurrent connections, since each conn spends most of its time passing messages, deserializing etc (with a much larger total amount of overhead). There are a surprising amount of bottlenecks and misconfiguration that can tank performance of networked systems, particularly DBs.

Like you suggest, the reason for not picking SQLite is not reliability, speed, etc. Networked DBs allow decoupling between app and db servers, which have operationally different characteristics. But most importantly, you can have multiple apps access the same DB at the same time. Eg analytics, one off queries, any 3p app that interacts with your data directly.

sevenzero a day ago [ - ]

While I understand your point and like the explanation, I gotta make the joke that some Tweets should be lost