Yeah. It’s not the end of the world.
But, it is a big own goal, because once you invest in building evals for your internal use-case, 1) it’s easier to switch your model to whatever is cheapest, and 2) it’s way easier to fine-tune an oss model.
Evals are annoying to build and most companies were fine to rest on vibes. Now many companies have to do the work for insurance.