Models aren't just the big bags of floats you imagine them to be. Those bags are there, but there's a whole layer of runtimes, caches, timers, load balancers, classifiers/sanitizers, etc. around them, all of which have tunable parameters that affect the user-perceptible output.
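To make that concrete, here's a toy sketch in Python. Everything in it is made up for illustration (`ServingLayer`, `toy_model`, `max_chars`, `blocked_words`); it isn't how any real provider wires things up. It just shows the same "model" giving different user-visible output depending on the knobs around it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class ServingLayer:
    """Hypothetical wrapper: the model is one part, the knobs are the rest."""
    model: Callable[[str], str]                   # stand-in for the bag of floats
    max_chars: int = 40                           # assumed output cap
    blocked_words: Tuple[str, ...] = ("secret",)  # assumed sanitizer setting
    cache_enabled: bool = True
    _cache: Dict[str, str] = field(default_factory=dict)

    def respond(self, prompt: str) -> str:
        # Cache key is normalized, so "hi" and " HI " share one entry.
        key = prompt.strip().lower()
        if self.cache_enabled and key in self._cache:
            return self._cache[key]

        text = self.model(prompt)

        # Sanitizer pass: redact configured words before anything else.
        for word in self.blocked_words:
            text = text.replace(word, "[redacted]")

        # Truncate to the configured length cap.
        text = text[: self.max_chars]

        if self.cache_enabled:
            self._cache[key] = text
        return text


def toy_model(prompt: str) -> str:
    # Deterministic stub so the example runs without any real model.
    return f"echo: {prompt} (secret sauce included)"


strict = ServingLayer(toy_model)
terse = ServingLayer(toy_model, max_chars=10, blocked_words=())

print(strict.respond("hi"))
print(terse.respond("hi"))
print(strict.respond(" HI "))  # served from cache, not from the model
```

Same `toy_model` in both cases; the only differences are the settings in the layer around it, and the cached reply for " HI " never touches the model at all.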
There really always is a man behind the curtain, eh?
Often it's literally just that:
https://www.msn.com/en-us/money/other/ai-startup-backed-by-m...
It's still engineering. Even magic alien tech from outer space would end up with an interface layer to manage it :).
ETA: reminds me of biology, too. In life, it turns out that the simpler some functional component looks, the more stupidly overcomplicated it is when you look at it under a microscope.