I've been having really good results with DeepSeek-v4-flash, qwen-3.6-moe, and the older gemini-3-flash-preview (recent Geminis suck hard).

Small models are more than enough for the majority of tasks these days. Plan and review with the bigger ones, let the little ones explore and implement.

OpenCode Go is $10/month for the open-weight models, with nice quotas: https://opencode.ai/go

You don’t have to limit yourself to the tiny models with the OpenCode Go plan; you can get a lot of usage from the bigger models if you keep the cache hot.

I am about 85% through my quota with 9 days left before refresh and have just used over 1B tokens: mostly DeepSeek V4 Pro, but also a little mimo 2.5 pro and kimi k2.6.

For sure. I've been flipping between flash/pro (or the equivalent for other families) and trying to stick to one family per project, as a way to test them out independently over longer periods and on more realistic, diverse tasks. I've definitely spent more quota on pro and pushed more tokens through flash.