As others have mentioned, you're ignoring the long tail of open-weights models which can be self-hosted. As long as that quasi-open-source competition keeps up the pace, it will put a cap on how expensive the frontier models can get before people have to switch to self-hosting.
That's a big if, though. I wish Meta were still releasing top-of-the-line, expensively produced open-weights models, or that Anthropic, Google, or X would release open mini versions.
Well, Google does release mini open versions of their models. https://deepmind.google/models/gemma/gemma-4/
And they're incredibly good for their size.
Which, unfortunately, is still slow, unusable garbage compared to frontier models.
Not at all; it's more than enough for a wide range of tasks. As for slowness, that's just a function of how much compute you throw at it, which you actually control, unlike with closed-weights models.
Depends on your hardware.