That might be the case, but inference times have only gone up since GPT-3 (GPT-5 is regularly 20+ seconds for me).

And by GPT-5 you mean through their API? Directly through Azure OpenAI services? Or are you talking about ChatGPT set to use GPT-5?

Each of these alternatives means something different when you say it takes 20+ seconds for a full response.
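If it were the API, one way to pin down the number would be to wall-clock the call itself, separating raw model latency from any UI overhead. A minimal sketch of such a timer (the wrapped call here is a stand-in, not an actual OpenAI client invocation):

```python
import time

def time_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Stand-in workload; replace with your API client call to measure it.
result, elapsed = time_call(lambda: sum(range(1000)))
print(f"took {elapsed:.3f}s")
```

For the ChatGPT UI there is no equivalent hook, which is why the distinction matters.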

Sure, apologies. I mean the ChatGPT UI.