The 10x claim reminds me of various scaling laws in anthropology/sociology, especially the much-debated Haire cube-root scaling law (eg https://gwern.net/doc/sociology/1959-haire.pdf https://gwern.net/doc/economics/1983-stephan.pdf ). I wonder if one could derive it from queueing theory, with an argument along the lines of: 'in indefinitely many layers of queues of queues, total delay is minimized at ~90% load per queue, which yields a 10x scale-free slowdown'?
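
For the first step of that argument, the standard M/M/1 formula already gives a clean '10x at 90%': mean time in system is 1/(μ−λ), i.e. a slowdown of 1/(1−ρ) relative to the bare service time, which is exactly 10x at ρ = 0.9. A minimal sketch (pure arithmetic, nothing assumed beyond the textbook formula):

```python
# M/M/1: mean time in system W = 1/(mu - lambda) = (1/mu) * 1/(1 - rho),
# so the slowdown over the bare service time 1/mu is 1/(1 - rho).
for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
    slowdown = 1 / (1 - rho)  # hits exactly 10x at 90% utilization
    print(f"utilization {rho:.2f}: {slowdown:6.1f}x slowdown")
```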

After some chatting with GPT-5.4 Pro, I think the 10x claim might simply be because in the >90% load region, you hit the famous blowup in delays as utilization approaches 100%. And with realistic loss functions like absolute or squared error (consider that in a queue of queues, delays would add) and distributions like Poisson, you'd incur massively increased losses if you pushed too far towards 100% load. Without extremely detailed, accurate statistics of the sort almost no one has, it'd be hard to distinguish 90% from 95% 'by eye', say, and you'd just see '10x' everywhere you looked.
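
To illustrate how hard the by-eye distinction is, here is a toy simulation of my own (a single M/M/1 queue with unit service rate, iterated via the Lindley recursion; the function name and sample sizes are just illustrative). The true mean waits at 90% vs 95% load are ~9 vs ~19 mean service times, but with only a few hundred observations per run, the noise and warm-up bias near saturation are large enough that different seeds can disagree wildly:

```python
import random

def mm1_waits(rho, n=500, seed=0):
    """Waiting times of n customers in an M/M/1 queue (service rate 1,
    arrival rate rho), via the Lindley recursion W' = max(0, W + S - A)."""
    rng = random.Random(seed)
    w, waits = 0.0, []
    for _ in range(n):
        s = rng.expovariate(1.0)   # service time, mean 1
        a = rng.expovariate(rho)   # interarrival time, mean 1/rho
        w = max(0.0, w + s - a)
        waits.append(w)
    return waits

for rho in (0.90, 0.95):
    means = [sum(ws) / len(ws) for ws in
             (mm1_waits(rho, seed=s) for s in range(5))]
    print(f"rho={rho:.2f}: sample mean waits over 5 seeds:",
          " ".join(f"{m:6.1f}" for m in means),
          f"(theory: {rho/(1-rho):.0f})")
```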

(And in a pyramid of queues like many layers of reviews, each layer will wind up being about equally loaded, because otherwise you would get a big benefit from adding/removing capacity, so each layer will slowly be optimized towards its breaking point, yielding the 10x everywhere.)
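
And a toy version of that equalization argument, under the same M/M/1 approximation (the capacity budget C = 2.25 below is just a made-up number): a unit arrival stream passes through two layers in series, and the total mean delay 1/(μ1−1) + 1/(μ2−1) is convex in how the fixed capacity budget is split between them, so the optimum is the even split. Any lopsided allocation leaves an easy win from shifting capacity toward the more-loaded layer, which is the pressure that pushes every layer toward the same utilization.

```python
# Two layers in series, each an M/M/1 with arrival rate 1; capacity budget C
# (assumed value) split as mu1 + mu2 = C. Total mean delay is convex in the
# split, so a grid search lands on the even split: equal loading everywhere.
C = 2.25
best_mu1, best_delay = None, float("inf")
for i in range(1, 1000):
    mu1 = 1.0 + i * (C - 2.0) / 1000   # keep both layers stable (mu > lambda = 1)
    mu2 = C - mu1
    delay = 1 / (mu1 - 1) + 1 / (mu2 - 1)
    if delay < best_delay:
        best_mu1, best_delay = mu1, delay
print(f"best split: mu1 = {best_mu1:.3f} (even split = {C/2:.3f}), "
      f"total mean delay = {best_delay:.1f} service times, "
      f"utilization per layer = {1/best_mu1:.2f}")
```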