The demand is being driven by inference, though. I really don't think there will be much motivation.

The large models are incredibly inefficient. We'll be squeezing them down for generations.

Right, that's where the major push is right now, not in shrinking down some code libraries.