It's interesting. 95% of the time you don't need the extra 5% of rigor that frontier models give you over the 10-100x cheaper Chinese equivalents.

The remaining 5% of the time you get a big boost on problems that need heavy reasoning, and you avoid a lot of pain. Now I just need to be able to predict accurately when I need that extra 5% and when I don't :)
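For a rough sense of what that split is worth, here is a back-of-the-envelope sketch. It assumes every task uses about the same number of tokens on either model and that you route perfectly, neither of which is guaranteed; `blended_cost` is just an illustrative helper name.

```python
def blended_cost(cheap_share: float, price_ratio: float) -> float:
    """Cost relative to sending everything to the frontier model (= 1.0).

    cheap_share: fraction of tasks sent to the cheaper model (0.95 here).
    price_ratio: how many times cheaper that model is (10 to 100 here).
    Assumption: equal token usage per task on both models.
    """
    if not 0.0 <= cheap_share <= 1.0:
        raise ValueError("cheap_share must be between 0 and 1")
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    # Cheap tasks cost 1/price_ratio each; the rest cost full price.
    return cheap_share / price_ratio + (1.0 - cheap_share)


for ratio in (10, 100):
    share = blended_cost(0.95, ratio)
    print(f"{ratio}x cheaper: {share:.4f} of the frontier-only cost")
```

Under those assumptions the bill drops to roughly 0.145 of the frontier-only cost at 10x and roughly 0.06 at 100x, and in both cases the 5% of frontier calls becomes a large part of what is left.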

In that extra 5% of cases you still have to help the AI over multiple turns and give it the information it needs. Even there, the model's reasoning alone is rarely enough to finish the task, i.e. 5% of the time the AI just can't complete the task without a lot of help.

The trick I use is to get the model to come up with a phased plan and then review it. If I spot anything that seems dumb, I give direction on how it should be done. Once the plan is finalized, the model can run through the steps fairly reliably. As long as you're intentionally making all the big decisions, things tend to work out well.
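A minimal sketch of that plan-then-review loop, in case it helps. Everything here is hypothetical: `Phase`, `review_plan` and `run_plan` are names made up for illustration, and the `reviewer` and `execute` callables stand in for you and for whatever model call you actually use.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Phase:
    title: str
    instructions: str
    approved: bool = False


def review_plan(
    phases: List[Phase],
    reviewer: Callable[[Phase], Optional[str]],
) -> List[Phase]:
    """Human pass over the plan.

    The reviewer returns replacement instructions for a phase that looks
    wrong, or None to accept the phase as written.
    """
    for phase in phases:
        correction = reviewer(phase)
        if correction is not None:
            phase.instructions = correction
        phase.approved = True
    return phases


def run_plan(phases: List[Phase], execute: Callable[[Phase], str]) -> List[str]:
    """Run phases in order, refusing any phase that was never reviewed."""
    results = []
    for phase in phases:
        if not phase.approved:
            raise ValueError(f"phase not reviewed: {phase.title}")
        results.append(execute(phase))
    return results


if __name__ == "__main__":
    # Stand-in for a plan the model proposed.
    plan = [
        Phase("schema", "Add a new table for sessions"),
        Phase("migration", "Drop the old table first"),
    ]
    # Stand-in for the human: redirect the one step that seems dumb.
    fix = lambda p: "Copy data, then drop the old table" if p.title == "migration" else None
    for line in run_plan(review_plan(plan, fix), lambda p: f"{p.title}: {p.instructions}"):
        print(line)
```

The point of the `approved` flag is only to make the big decisions explicit: nothing runs until a human has looked at it.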