Decent article, but some remarks:

1) It seems these two statements conflict with each other:

> The oldest such transaction sets the cutoff—referred to as the "MVCC horizon." Until that transaction completes, every dead tuple newer than its snapshot is retained.

and

> For example, imagine three analytics queries, each running for 40 seconds, staggered 20 seconds apart. No individual query would trigger a timeout for running too long. But because one is always active, the horizon never advances, and the effect on vacuum is the same as one transaction that never ends.

If the three analytics *transactions* (it's transactions that matter, not queries, although there is some subtlety around deferred transactions not acquiring a snapshot until the first query) are started at different times, they will have staggered snapshots and so once the first completes, this should allow the vacuum to advance.

2) Although the problem about this query:

    SELECT * FROM jobs
    WHERE status = 'pending'
    ORDER BY run_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED;
having to consider dead tuples is a genuine concern and performance problem, this can also be mitigated by adding a monotonically increasing column and adding a `WHERE column < ?` clause, provided you have also added an index to make that pagination efficient. This way you don't need to consider dead tuples and they 'only' waste space whilst waiting to be vacuumed, rather than also bogging down read perf.

There is a little subtlety around how you guarantee that the column is monotonically increasing, given concurrent writers, but the answer to that depends on what tricks you can fit into your application.

3) I almost want to say that the one-line summary is 'Don't combine (very) long-running transactions with (very) high transaction rates in Postgres'

(Is this a fair representation?)

Yo! Author here, I’ll be around if anyone’s got questions!

If I understood correctly, the queue implementation in the blog post holds a transaction while an operation is progress.

I see the advice to make it as short as possible, but why can’t we update the status column to, say, “processing” and avoid potentially long transactions at all?

Did you test with fillfactor < 100 on the queue table? With HOT updates, status changes can reuse dead space without creating new index entries, which seems like it could significantly delay the onset of the death spiral?

If the status column changes, and an index depends on the contents of that status column (be it by referencing it in its columns, or in the index's WHERE filter) then an update of the status column will prevent the HOT optimization from being applied.

[dead]

Postgres can do so much. I see people choose Kafka and SQS for things that Graphile Worker could do all day long.

“Use Postgres for everything” is a great philosophy at low/medium scale to keep things simple, but there comes a scaling point where I want my SQL database doing as little possible.

It’s basically always the bottleneck/problem source in a lot of systems.

Yes. For example you'll typically have a "budget" of 1-10k writes/sec. And a single heavy join can essentially take you offline. Even relatively modest enterprises typically need to shift some query patterns to OLAP/nosql/redis/etc. before very long.

> I see people choose Kafka and SQS

SQS is dead simple, and if your in AWS (forever) it is "in the stack" with some easy to use features that may make sense to you (delay queue is a great one).

Kafka is... a lot. If you need what it provides, then it's great. You just have to be able to support it, and thats non trivial.

I can point to more than a hand full of Kafka project that exist because it was clear that someone wanted it on their resume. I dont think any one is doing that with SQS, it is just (a fairly good utility). However if you want to leave (or branch out from) AWS and you're reliant on it, good luck.

It would be nice if this ad at least explained a little bit of the technical side of the solution.

We’ve got a separate blog post all about how traffic control works: https://planetscale.com/blog/behind-the-scenes-how-traffic-c...

[deleted]

It sounds vaguely like InnoDB’s concurrency control solution which uses tokens [0] as a unit of maximum work a query can perform.

0: https://dev.mysql.com/doc/refman/8.4/en/innodb-performance-t...

[dead]