In my experience, there's little difference between frontier models and SotA ~30B param models when it comes to implementing individual functions.

Once you have a coherent design (the hard part), you can feed it to a pretty small model and get basically the same quality.

They won't one-shot it, but they're faster and cheaper, so it still works out in your favor.

Plus you can do it locally...

I've had a similar experience. However, once you include code review, I think the GPT model is the most impressive.