Can’t believe they needed this investigation to realize they need a connection pooler. It’s a fundamental component of every large-scale Postgres deployment, especially for serverless environments.

Pooling connections somewhere has been fundamental for several decades now.

Fun quick anecdote: a friend of mine worked at an EA subsidiary when SimCity (2013) was released, to great disaster, as the online features failed under load. He got shifted over to the game a day after release to firefight their server problems. He was responsible for the most dramatic initial improvement when he discovered the servers weren't using connection pooling and were instead opening a new connection for almost every single query, using up all the connections on the back-end DB. EA's approach had been "you're programmers, you can build the back end", not accepting the game devs accurately telling them it was a distinct skill set.

No? It sounds like they rejected the need for a connection pooler and took an alternative approach. I imagine they were aware of connection poolers and just didn't add one until they had to.

can't believe postgres still uses a process-per-connection model that leads to endless problems like this one.

You can't process significantly more queries than you've got CPU cores at the same time anyway.

Much of the time in a transaction can reasonably be non-db-CPU time, be it I/O wait or client CPU processing between queries. Note I'm not talking about transactions that run >10 seconds, just ones where the individual queries are quite cheap. At 10% db-CPU usage, you get a 1-second transaction from just 100ms of CPU.
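To make that arithmetic concrete, here's a back-of-envelope sketch; the core count is an illustrative assumption, not something from the thread:

```python
# Back-of-envelope: how many concurrent transactions does it take to
# saturate the database's CPU when each transaction is mostly idle?
# All numbers are illustrative assumptions.

db_cpu_per_txn = 0.100   # seconds of actual DB CPU work per transaction
txn_duration   = 1.0     # wall-clock seconds the transaction stays open
cpu_cores      = 16      # cores on the database server (assumed)

# Fraction of its lifetime a transaction spends on DB CPU.
cpu_fraction = db_cpu_per_txn / txn_duration   # 0.10, i.e. 10%

# Concurrent transactions needed to keep every core busy.
concurrency_to_saturate = cpu_cores / cpu_fraction

print(concurrency_to_saturate)   # 160.0
```

In other words, with transactions that only spend 10% of their lifetime on DB CPU, you need roughly 10x as many concurrent transactions as cores before CPU is the limit.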

That many long-running transactions seems like a pretty unusual workload to me, and one potentially running into isolation issues. I can see running a few of these, but not a lot, especially at the same time?

A few (single digits, maybe low double digits) per database CPU core isn't necessarily too much. Of course the pattern can be conflict-heavy, in which case it would be problematic, but this is not about absolute concurrency but about the concurrency needed to saturate the CPU.

In a properly optimized database, the vast majority of queries will hit indexes and most data will be in the memory cache, so the majority of transactions will be CPU- or RAM-bound. So increasing the number of concurrent transactions will reduce throughput. There will be a few transactions waiting on I/O, but if the majority of transactions are waiting on I/O, it's either a horrifically inefficient database or very non-standard usage.

Your arguments make sense for concurrent queries (though high-latency storage like S3 is becoming increasingly popular, especially for analytic loads).

But transactions aren't processing queries all the time. Often the application will do processing between sending queries to the database. During that time a transaction is open, but doesn't do any work on the database server.

It is bad application architecture. Database work should be concentrated in minimal transactional units, and the connection should be released between these units. All data should be prepared before the unit starts, and additional processing should take place after the transaction has ended. Using long transactions will cause locks, even deadlocks, and should generally be avoided. That's my experience at least. Sometimes a business transaction should be split into several database transactions.
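The pattern being described might look like the sketch below; it uses Python's stdlib sqlite3 purely as a stand-in for Postgres, and the table and values are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER)")

# 1) Prepare all data BEFORE the transactional unit starts
#    (no transaction open while doing application-side work).
item_cents = [1250, 399, 800]
total = sum(item_cents)           # application-side processing

# 2) Minimal transactional unit: open, write, commit, done.
with conn:                        # sqlite3's context manager wraps one transaction
    conn.execute("INSERT INTO orders (total_cents) VALUES (?)", (total,))

# 3) Additional processing AFTER the transaction has ended.
row = conn.execute("SELECT total_cents FROM orders").fetchone()
print(row[0])   # 2449
```

The point is that the transaction spans only the insert, so nothing application-side keeps locks or a connection busy.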

Your database usage should not involve application-focused locks; MVCC will restart your transaction if needed to resolve concurrency conflicts.

If you aren't hitting I/O (I don't mean HDDs) on a large fraction of queries, you either skipped a cache in front of the DB, or your data is very small, or you've spent too much on RAM and too little on NVMe that would otherwise not be a bottleneck.

A process that is blocked on I/O, whether network or disk, will get taken off the CPU and another process put on. It doesn't just waste the CPU until the quantum is gone.

redis is single-threaded but handles lots of connections (i.e. > 500) with much better performance than postgres. There's zero chance someone building postgres in 2025 would do one process per connection; I don't think there's any argument that it's a good design for performance. It's just a long-ago design choice that would be difficult to change now.

I disagree. If that was the case, pgBouncer wouldn't need to exist.

The problem of resource usage for many connections is real.

It's about queueing work, not running all these queries at the same time. You can run pgbouncer, or you can have a pool on your backend. Having more connections won't make it go faster, so that really seems like a low-priority thing for postgres to me. Even if you integrated pooling into postgres, the overhead of auth would still take time for small queries anyway.
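The "queueing work" idea is what a backend-side pool does: a fixed set of reusable connections, with extra callers waiting their turn instead of opening new server connections. A toy sketch (the FakeConn class is a stand-in, not a real driver):

```python
import queue
import threading

class TinyPool:
    """Toy connection pool: a fixed number of connections shared by many
    workers. When all are in use, callers queue up instead of opening more."""

    def __init__(self, make_conn, size):
        self._q = queue.Queue()
        for _ in range(size):
            self._q.put(make_conn())

    def acquire(self):
        return self._q.get()      # blocks (queues) when the pool is exhausted

    def release(self, conn):
        self._q.put(conn)

class FakeConn:
    """Stand-in for a real database connection."""
    def query(self, sql):
        return f"ok: {sql}"

pool = TinyPool(FakeConn, size=4)   # only 4 "server connections" ever exist

results = []
def worker(i):
    conn = pool.acquire()
    try:
        results.append(conn.query(f"SELECT {i}"))
    finally:
        pool.release(conn)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(100)]
for t in threads: t.start()
for t in threads: t.join()
print(len(results))   # 100 queries served through 4 connections
```

Real pools (HikariCP, psycopg_pool, pgbouncer itself) add health checks, timeouts, and transaction-aware checkout, but the core queueing mechanism is this simple.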

that's too simplistic.

there are many reasons to need something like pgbouncer.

1) not all workloads lend themselves to backend pools

2) not all backends can afford pooling

3) you don't always control your backend

4) you might have a lot of different backends connecting to the database. it's much simpler to just tell them to connect to pgbouncer.

5) you don't want to trust the backend team not to bring postgresql down with many simultaneous connections.

6) backend pools often misbehave

people don't wake up one day and just add another cog to their infrastructure randomly.

I was surprised too to need it in front of RDS (but not on vanilla, as you pointed out).

In the serverless world, for sure, but in old-school architecture it's common to use persistent connections to a database, which makes a connection pooler less essential. Also, the last time I checked (many years ago, admittedly), connection poolers didn't play well with server-side prepared statements and transactions.

pgbouncer added support for prepared statements a couple years back.