In a properly optimized database, the vast majority of queries will hit indices and most data will be in the memory cache, so the majority of transactions will be CPU- or RAM-bound. Increasing the number of concurrent transactions past what the CPU can serve will therefore reduce throughput. A few transactions will be waiting for I/O, but if the majority of them are waiting for I/O, it's either a horrifically inefficient database or very non-standard usage.

Your arguments make sense for concurrent queries (though high-latency storage like S3 is becoming increasingly popular, especially for analytic loads).

But transactions aren't processing queries all the time. Often the application will do processing between sending queries to the database. During that time a transaction is open, but doesn't do any work on the database server.

That is bad application architecture. Database work should be concentrated in minimal transactional units, and the connection should be released between these units. All data should be prepared before a unit starts, and any additional processing should take place after the transaction has ended. Long transactions cause lock contention, even deadlocks, and should generally be avoided. That's my experience, at least. Sometimes a business transaction should be split into several database transactions.
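A minimal sketch of that shape, using Python's built-in sqlite3 purely for illustration. The `stock` and `orders` tables and the `make_db`, `expensive_prepare`, and `place_order` helpers are made up for this example; with a real server you would also hand the connection back to a pool between units, which an in-memory SQLite database cannot show.

```python
import sqlite3


def make_db():
    # Hypothetical schema, only here so the example runs on its own.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stock (sku TEXT PRIMARY KEY, qty INTEGER NOT NULL)")
    conn.execute("CREATE TABLE orders (sku TEXT, qty INTEGER, total REAL)")
    conn.execute("INSERT INTO stock VALUES ('widget', 10)")
    conn.commit()
    return conn


def expensive_prepare(order):
    # Stand-in for application-side work (validation, pricing, remote calls).
    # It runs before any transaction is open.
    return (order["sku"], order["qty"], order["qty"] * order["unit_price"])


def place_order(conn, order):
    # 1. Prepare all data before the transactional unit starts.
    sku, qty, total = expensive_prepare(order)

    # 2. Keep the transaction minimal: only the statements that must be atomic.
    with conn:  # commits on success, rolls back if an exception escapes
        cur = conn.execute(
            "UPDATE stock SET qty = qty - ? WHERE sku = ? AND qty >= ?",
            (qty, sku, qty),
        )
        if cur.rowcount != 1:
            raise ValueError("insufficient stock")
        conn.execute(
            "INSERT INTO orders (sku, qty, total) VALUES (?, ?, ?)",
            (sku, qty, total),
        )

    # 3. Any further processing happens after the transaction has ended.
    return {"sku": sku, "total": total}
```

The point is only the ordering: nothing slow sits between the first statement and the commit, so locks are held for as short a time as possible.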

Your database usage should not involve application-focused locks; MVCC will restart your transaction if needed to resolve concurrency conflicts.
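How that restart surfaces depends on the database: in PostgreSQL, for example, a serialization failure aborts the transaction and the client is expected to run it again. A rough sketch of such a client-side retry loop follows, assuming the transaction is wrapped in a callable. `SerializationFailure` and `run_with_retry` are hypothetical names for this example, not any driver's API; a real driver raises its own exception type.

```python
import random
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver's 'could not serialize access' error."""


def run_with_retry(txn, attempts=5, base_delay=0.01):
    # txn is a callable that runs one complete, short transaction
    # (begin, statements, commit) and raises SerializationFailure on conflict.
    for attempt in range(attempts):
        try:
            return txn()
        except SerializationFailure:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            # Exponential backoff with jitter, so conflicting clients
            # do not all retry at the same moment.
            time.sleep(base_delay * (2 ** attempt) * random.random())
```

This only works well when the transactional unit is small and has no side effects outside the database, since the whole callable may run more than once.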

If you aren't hitting I/O (and I don't mean HDDs) on a large fraction of queries, then either you skipped a cache in front of the DB, or your data is very small, or you spent too much on RAM and too little on NVMe fast enough not to be a bottleneck.