No you don't; it's often overkill to use the SOTA models. People want SOTA because it's shiny, but there are a lot of tasks where it's cheaper and more efficient to use other models.

> but there are a lot of tasks where it's cheaper and more efficient to use other models.

Sure… but which ones? How can you know ahead of time?

I just did a “simple” upgrade project where both the AI and I kept tripping over dead code, subtle typos, and code where it was hard to trace what was live and what was dead.

Many of the times I used “Medium” thinking, I got bitten, but not every time, and I couldn’t predict when.

So “Extra high” it was, for the entire project.

Far fewer nasty surprises!

Right. You hire the developer when you want a developer. But if I am building simple agentic workflows (glorified automations with a small bit of structured "thinking"), I will surely use the cheapest API that can deliver that task at the speed I want.
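To make "the cheapest API that can deliver that task at the speed I want" concrete, here is a rough sketch of how that choice could be automated. Everything in it is an assumption for illustration: the model names, prices, latencies, and pass rates are made up, and the pass rate is meant to be something you measure yourself on your own task, not a published benchmark.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelOption:
    name: str                      # hypothetical model name
    usd_per_million_tokens: float  # made-up price
    p95_latency_s: float           # made-up latency
    task_pass_rate: float          # measured on your own eval set for this task


def cheapest_adequate(
    options: List[ModelOption],
    min_pass_rate: float,
    max_latency_s: float,
) -> Optional[ModelOption]:
    """Return the cheapest option that is good enough and fast enough, or None."""
    adequate = [
        m for m in options
        if m.task_pass_rate >= min_pass_rate and m.p95_latency_s <= max_latency_s
    ]
    return min(adequate, key=lambda m: m.usd_per_million_tokens, default=None)


# Illustrative numbers only, not real pricing or benchmarks.
CATALOG = [
    ModelOption("small-model", 0.20, 0.8, 0.93),
    ModelOption("mid-model", 1.50, 1.5, 0.97),
    ModelOption("frontier-model", 15.00, 4.0, 0.99),
]

choice = cheapest_adequate(CATALOG, min_pass_rate=0.95, max_latency_s=2.0)
print(choice.name if choice else "nothing qualifies")
```

The hard part is not this function but getting a trustworthy `task_pass_rate`, which is the "how can you know ahead of time?" problem raised above.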

I wonder where the market sizes will shake out for these different types of use cases? I am guessing that right now the first (developer-style work) is bigger than the second (simple automations) by token volume, but not for long?

For programmatic usage, SOTA often isn't useful.

For example, I have software that summarizes articles and classifies links on webpages to build a synthetic RSS feed. Both use LLMs, and neither needs a SOTA model.

I'll probably use LLMs to bootstrap a dataset of native ads in articles, and there again, I don't really need a SOTA model.

If it's for more open-ended tasks like writing code, though, I agree that at this point SOTA models make more sense to use.

In my experience: anything of open-ended complexity (software development, research, product design, ...) benefits from whatever the frontier can offer. 95% of line-of-business automation and workflows can be handled by even a reasonably small open-weights generalist model flanked by a few even smaller specialized models. Yes, designing such a setup takes more knowledge and work than just chucking it all at the API with prompts. But that is how I can run a system here for <$30/month vs. >$1,000/month. As an added bonus, no model server can shut me down at the drop of a hat.
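For anyone wondering what "a generalist flanked by specialists" might look like structurally, here is a minimal sketch of the dispatch layer. It is an assumption about the shape of such a setup, not a description of the system above: the task names are hypothetical, and the handlers are stubs standing in for calls to locally hosted models.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def make_router(specialists: Dict[str, Handler], generalist: Handler) -> Callable[[str, str], str]:
    """Send known task types to a small specialist; everything else goes to the generalist."""
    def route(task_type: str, payload: str) -> str:
        handler = specialists.get(task_type, generalist)
        return handler(payload)
    return route


# Stub handlers (hypothetical). In a real setup each would call a local model server.
def classify_invoice(text: str) -> str:
    return f"[tiny-classifier] {text}"


def extract_dates(text: str) -> str:
    return f"[tiny-extractor] {text}"


def general_model(text: str) -> str:
    return f"[small-generalist] {text}"


route = make_router(
    specialists={"classify_invoice": classify_invoice, "extract_dates": extract_dates},
    generalist=general_model,
)

print(route("classify_invoice", "Invoice #123 from ACME"))
print(route("draft_reply", "Customer asks about a refund"))
```

The extra design work mentioned above mostly lives outside this snippet: deciding which task types deserve a specialist and checking that each one is actually good enough at its job.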

Exactly. I simply don't have the time to deal with non-SOTA model output.