Your experience with DeepSeek v4 Flash differs from mine: while I usually use DeepSeek v4 Pro (which is also inexpensive), I find DeepSeek v4 Flash with the Fireworks.ai API and a properly configured OpenCode to be very good for routine work, and it is pleasantly fast. Admittedly, I use DeepSeek v4 Pro for difficult problems.
I encourage people to do a quick evaluation at least once a month with their own problems and workflows. Estimate cost as both what the inference tokens cost for a task and how much human effort it takes to get the required results.
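A rough sketch of that cost estimate, in case it helps anyone set up their own monthly check. Everything here is an assumption for illustration: the `TaskRun` fields, the `task_cost` helper, and the per-million-token prices and hourly rate in the usage note are placeholders, not real pricing from any provider.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    """One attempt at a task with a given model (hypothetical record)."""
    model: str
    input_tokens: int
    output_tokens: int
    human_minutes: float  # time spent prompting, reviewing, and fixing


def task_cost(run: TaskRun,
              usd_per_m_input: float,
              usd_per_m_output: float,
              usd_per_human_hour: float) -> float:
    """Total cost in USD: inference tokens plus human effort."""
    token_cost = (run.input_tokens * usd_per_m_input
                  + run.output_tokens * usd_per_m_output) / 1_000_000
    human_cost = run.human_minutes / 60 * usd_per_human_hour
    return token_cost + human_cost
```

For example, `task_cost(TaskRun("some-model", 1_000_000, 500_000, 30), 1.0, 2.0, 60.0)` adds 2 USD of tokens to 30 USD of human time. With numbers like these the human effort dominates, which is why a cheaper model that needs more babysitting can end up costing more overall.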
I disregard benchmarks.
We are also using Fireworks as our model provider. Our harness is OpenClaw, so the tasks are not only coding but all kinds of things. For instance, I asked it to fetch some info from the web via the Chrome browser and to collect the info in an MD file. The MD file never appeared, even though it claimed to have written it. I asked it three times to write the file and it was always: “oh yes, I'll do it now..” then nothing. The search itself was also very bad, because it just gave up after one page, hallucinated an answer and - even worse :-) - told me it had been very thorough…
Pro aced the task :-)
But maybe it's a config issue.
Have to ask: did you try 'xhigh' thinking effort with Flash? I also found it nearly unusable on just 'high', but on 'xhigh' it's nearly equivalent to Pro's 'high'.
That sounds correct: Pro for longer agentic tasks, Flash for writing short programs, finding things for me in a large code base, etc.