Why would anyone think that it would get linearly worse? What's the (wrong) assumption there?

I think people are reading it as a request every 0.8s that takes 1s to process, instead of 0.8 requests per second.
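To put numbers on the two readings, here is a quick sketch of the per-server utilization under each one. The function name is mine, and I'm assuming a single server with a fixed 1 s service time:

```python
def utilization(arrival_rate_per_s: float, service_time_s: float) -> float:
    """Fraction of time one server must be busy to keep up with the load."""
    return arrival_rate_per_s * service_time_s


# Misreading: one request every 0.8 s, i.e. 1.25 requests per second.
misread = utilization(1 / 0.8, 1.0)

# Intended reading: 0.8 requests per second.
intended = utilization(0.8, 1.0)

print(f"one request every 0.8 s: utilization = {misread:.2f}")
print(f"0.8 requests per second: utilization = {intended:.2f}")
```

Anything above 1.0 means the server can never catch up, so under the misreading the queue grows without bound, which may be where the "it keeps getting worse" intuition comes from.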

I'm mostly just surprised the graph starts at 5 seconds for a mean value for all datapoints. I would have assumed it starts much closer to 1s. Which just makes the poll responses even crazier. Who is picking B when you have 25% more capacity than you need?
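One possible explanation for the 5 s starting point: if you assume Poisson arrivals and exponentially distributed service times (an M/M/1 queue, which is my assumption, the poll doesn't say), the textbook mean response time is 1 / (service rate - arrival rate). A minimal sketch, with a function name I made up:

```python
def mm1_mean_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean time in system (queueing + service) for an M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


# 0.8 requests per second against a server that completes 1 request per second.
w = mm1_mean_response_time(arrival_rate=0.8, service_rate=1.0)
print(f"mean response time: {w:.1f} s")  # prints "mean response time: 5.0 s"
```

So under that model, 25% spare capacity still leaves the average request spending about 4 s in the queue on top of its 1 s of service.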

But I suppose the question is underspecified. How does the load balancer know which systems are busy? What happens to a request if the load balancer routes it to a busy server?

I thought the same thing. But should we be surprised by what people believe these days?

I think the issue is due in part to the choice of variables. Plotting the mean request time is less intuitive than plotting throughput.

If you plot throughput vs number of servers, it'll be a straight line. And asking people that, I think most would agree on a straight line. But who knows!

One explanation would be that more load could mean higher (absolute) variance in queue length, and therefore higher latency especially at higher percentiles. It doesn't work out that way (for reasons that Erlang actually writes about in one of his original works), but it's not an entirely unreasonable intuition.
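To see how it actually works out, here is a rough sketch using the Erlang C formula for an M/M/c queue. The assumptions are all mine: Poisson arrivals, exponential service times, and an ideal load balancer that behaves like a single shared queue in front of all the servers. The function names and defaults are only for illustration:

```python
def erlang_c(servers: int, offered_load: float) -> float:
    """Probability that an arriving request has to wait in an M/M/c queue."""
    if offered_load >= servers:
        raise ValueError("unstable queue: offered load must be below server count")
    # Erlang B via the standard recursion, which avoids large factorials.
    b = 1.0
    for k in range(1, servers + 1):
        b = offered_load * b / (k + offered_load * b)
    rho = offered_load / servers
    # Convert Erlang B (blocking) to Erlang C (queueing).
    return b / (1.0 - rho * (1.0 - b))


def mmc_mean_response_time(
    servers: int, per_server_rate: float = 0.8, service_time: float = 1.0
) -> float:
    """Mean response time when load scales in proportion to server count."""
    arrival_rate = servers * per_server_rate
    service_rate = 1.0 / service_time
    offered_load = arrival_rate / service_rate
    mean_wait = erlang_c(servers, offered_load) / (servers * service_rate - arrival_rate)
    return mean_wait + service_time


for c in (1, 2, 5, 10, 50):
    print(f"{c:3d} servers: mean response time = {mmc_mean_response_time(c):.3f} s")
```

Under these assumptions the mean response time falls toward the bare 1 s service time as servers and load grow together, even though per-server utilization stays at 0.8.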

I think the author made it up just to have something more to show on the graph.

It was a poll on Twitter, do you really expect good responses?