> The damage is done. You cannot build a business-critical function on top of an American SOTA frontier model. Especially not with the current crew in charge.

The switching cost of changing LLM providers is as low as it gets. All the individuals and startups I know try different models all the time, even down to the level of choosing which provider to use based on the task. Bigger companies move slower, but only because they have lawyers and teams negotiating contracts, not because there is a technical reason that it's hard to switch.

Companies have dealt with supply chain unpredictability by having multiple providers and switching between them since forever. It's infinitely easier to switch LLM providers than it is to deal with physical supply chain uncertainty.

For real production I find the switching cost is not as trivial as you portray. Even going to a new model version in the same model family, say GPT-4o to GPT-5.2, a transition I just finished on a not too complicated application, requires extensive retesting and tweaking of prompts, guardrails and parameters.

I second this; even when switching between minor versions of a model, you need to adjust prompts: the new model is better at inferring a bunch of things on its own, so when those things are still spelled out in the prompt, it overdoes them.

Assessing quality of output is often not trivial, either. Typically, problems that are solved by offloading something to an LLM are super subjective, and you are vulnerable the moment customers “feel” something is different.

We try to quantify output differences by many different similarity metrics. But a lot of energy goes into subjectively evaluating if something still works.
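To make "similarity metrics" a bit more concrete, here is a minimal sketch of the kind of thing I mean, not our actual tooling. It only uses the Python standard library, and the function names (`char_similarity`, `token_jaccard`, `compare_outputs`) are made up for illustration. Real setups usually add embedding-based or rubric-based scores on top.

```python
from difflib import SequenceMatcher


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] based on difflib's matching blocks."""
    return SequenceMatcher(None, a, b).ratio()


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of lowercased, whitespace-separated tokens."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty outputs count as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def compare_outputs(old: str, new: str) -> dict[str, float]:
    """Score the new model's output against the old model's output."""
    return {
        "char_similarity": char_similarity(old, new),
        "token_jaccard": token_jaccard(old, new),
    }
```

Low scores only tell you that the output changed, not whether it got worse, which is why the subjective review still eats so much time.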

We’re talking about SOTA models like Fable, though.

If you’ve got a product where the budget allows for Fable-level token costs, I doubt you’d lack the budget to run your evals again on a cheaper model if Fable were unavailable. I mean, it wouldn’t even take that much token volume to make the engineering work of switching to a cheaper model a money-saving proposition.

Fable is primarily used for human in the loop tasks like coding or office work, not in some backend app unless the company has money to burn and doesn’t care about anything other than using the best model available at the time.

Maybe OP meant switching in a coding harness, not in an application using AI? I had issues similar to yours in the latter case, but in the former it's trivial.

If you’re building on LLMs you gotta have an eval and prompt iteration pipeline, and you ought to be evaling every model release: your competitors will do this, and your users will want the latest and greatest (for frontier tasks) and the cheapest/fastest. So you should already be paying this cost anyway. I guess it depends on your team size and scale, but not building this muscle seems like not having continuous delivery for regular code, or even like not having tests and CI to merge to main.
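For anyone who hasn't built one, the core of such a pipeline can be tiny. This is a rough sketch, assuming each model is wrapped as a plain `prompt -> completion` function. `EvalCase`, `run_evals`, and the `Model` alias are hypothetical names, not any vendor's API.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: every provider is wrapped behind this one signature.
Model = Callable[[str], str]


@dataclass
class EvalCase:
    prompt: str
    check: Callable[[str], bool]  # returns True if the output is acceptable


def run_evals(models: dict[str, Model], cases: list[EvalCase]) -> dict[str, float]:
    """Return the pass rate per model over the same set of cases."""
    results: dict[str, float] = {}
    for name, model in models.items():
        passed = sum(1 for case in cases if case.check(model(case.prompt)))
        results[name] = passed / len(cases) if cases else 0.0
    return results
```

You run this on every model release with the same cases and compare pass rates. The hard part is writing good `check` functions, not the loop.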

SOTA models are typically used for interactive coding and other human-in-the-loop work.

> say GPT-4o to GPT-5.2, a transition I just finished on a not too complicated application

Neither of which is close to SOTA, because tasks like these are typically built in a cost-conscious manner that tries to keep token costs in check.

I’m primarily responding to all of the commenters who are acting like nobody is going to use American SOTA models for anything because the government interfered with them for a couple weeks. It’s obviously not true, and I expect these models to be oversubscribed instead of avoided like some are claiming.

Vendor diversity is a longstanding risk management principle. For it to work you need to invest in it as you build, not when the rug is pulled.

Exactly!

Even if you won't be able to use some model tomorrow, you can still make money by using it today!

And in the age of limited compute, spiky workloads and constant outages, building a mechanism to fall back to a weaker model when your primary choice isn't available is smart anyway.
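A fallback like that doesn't have to be fancy. Here is a minimal sketch, assuming each provider is wrapped as a `prompt -> completion` callable that raises on failure. `complete_with_fallback` and `AllProvidersFailed` are names I made up, and a real version would want timeouts, retries, and narrower exception handling.

```python
from typing import Callable, Sequence

# Assumption: each provider is a callable that raises when it is unavailable.
Model = Callable[[str], str]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def complete_with_fallback(
    prompt: str, providers: Sequence[tuple[str, Model]]
) -> tuple[str, str]:
    """Try providers in priority order; return (provider_name, completion)."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # deliberately broad for this sketch
            errors.append(f"{name}: {exc!r}")
    raise AllProvidersFailed("; ".join(errors))
```

Returning the provider name alongside the completion lets you log how often you are actually running on the weaker model.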

For many, that fallback mechanism is simply called Cursor - soon to be owned by Elon Musk. Which opens up a similar but slightly different can of worms...

Well, there are many alternatives to Cursor as well.

> The switching cost of changing LLM providers is as low as it gets

Not trivial, you would need to do lots of evals and prompt tuning when you switch models.

Imagine what happens when you optimize your agent skills for the current model and a new model starts breaking them. You would need versioning for your skills, serving different skills based on the model while you do A/B testing.
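Roughly what I have in mind, as a sketch only: skills keyed by (model, version), plus a deterministic hash bucket so the same user always lands in the same A/B arm. The model names, skill text, and the `SKILLS` / `AB_SPLIT` tables are all placeholders I invented for the example.

```python
import hashlib

# Hypothetical skill store: (model, version) -> skill prompt text.
SKILLS: dict[tuple[str, str], str] = {
    ("model-a", "v1"): "Summarize the ticket in three bullet points.",
    ("model-b", "v1"): "Summarize the ticket in three bullet points.",
    ("model-b", "v2"): "Summarize the ticket. Be brief.",
}

# Hypothetical experiments: model -> (control, candidate, share on candidate).
AB_SPLIT: dict[str, tuple[str, str, float]] = {"model-b": ("v1", "v2", 0.5)}
DEFAULT_VERSION = "v1"


def bucket(user_id: str) -> float:
    """Map a user id to a stable number in [0, 1) using SHA-256."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def pick_skill(model: str, user_id: str) -> tuple[str, str]:
    """Return (version, skill_text) for this model and user."""
    version = DEFAULT_VERSION
    if model in AB_SPLIT:
        control, candidate, share = AB_SPLIT[model]
        version = candidate if bucket(user_id) < share else control
    return version, SKILLS[(model, version)]
```

Logging the returned version next to each eval result is what lets you compare the two arms later.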

> Not trivial, you would need to do lots of evals and prompt tuning when you switch models.

Couldn’t we just train smaller models to “translate” what the harness user wants to what the worker model expects? I mean, if models understand caveman, it seems like just a small stretch

It's not switching costs, but trust.

There's no Congress. There's no policy (they've been making noises about not allowing AI regulation and now they're not-regulating it like a child playing with an on/off switch). The law is whatever Dear Leader's mood is today. It overrides any contract you sign with private companies, and they roll over and take it, because that's how oligarchies work.