I agree.

Also, some use cases come with hard constraints, such as embedded hardware systems with no internet access. LLMs for those settings have to be trained to specialize in clearly defined tasks within tight hardware limits.

Frontier LLMs also rarely function in isolation; instead they orchestrate a system of specialized units, i.e. subsystems and agents.

Cost and effort are one thing, but simply being able to downsize these monster LLMs through fine-tuning in the first place is extremely valuable.