Have you tried their flash model? Pro was too slow for me too, but I've found flash to be more than capable, and it's faster than GPT-5.5 at medium.
It's actually on my list this week to put together an MVP of an intelligence escalation flow. My initial assumption is that flash is good for 60-80% of my users' workflows, with only the tricky questions needing a more capable model. Whether I can put together a proper detection system remains to be seen.
The biggest issue I've had with flash is that it seems to hit a sort of "dumb o'clock" wall: right around the time Beijing would be going to work, response quality drops noticeably on instruction-heavy tasks once context grows beyond ~120k tokens.
Responses are still usable, with no hallucinations or anything, but it's worth keeping in mind if you rely on detailed instructions or large context windows.
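
For what it's worth, here's roughly how I picture the escalation flow, as a minimal sketch rather than anything tested in production. Everything in it is an assumption on my part: the `"flash"` and `"pro"` strings are placeholder tier names (not real model IDs), `Request`, `looks_tricky`, and `pick_model` are hypothetical names, and the keyword list is a stand-in for the detection system I haven't built yet. The ~120k cutoff just mirrors the wall described above.

```python
from dataclasses import dataclass

# Placeholder tier names, not real model identifiers.
FLASH = "flash"
PRO = "pro"

# Rough threshold taken from the observation above; not a documented limit.
CONTEXT_LIMIT = 120_000

# Hypothetical keyword heuristic standing in for a real "tricky question" detector.
TRICKY_MARKERS = ("prove", "step by step", "trade-off", "debug")


@dataclass
class Request:
    prompt: str
    context_tokens: int
    instruction_heavy: bool = False


def looks_tricky(prompt: str) -> bool:
    """Return True if the prompt contains any of the assumed 'hard question' markers."""
    text = prompt.lower()
    return any(marker in text for marker in TRICKY_MARKERS)


def pick_model(req: Request) -> str:
    """Default to the fast tier; escalate only when a rule says it is risky."""
    # Instruction-heavy work on very large contexts is where quality dipped.
    if req.instruction_heavy and req.context_tokens > CONTEXT_LIMIT:
        return PRO
    # Escalate prompts the heuristic flags as tricky.
    if looks_tricky(req.prompt):
        return PRO
    return FLASH
```

Calling `pick_model(Request("summarize this email", 2_000))` stays on the fast tier, while an instruction-heavy request at 150k tokens gets escalated. The keyword check is the weak part; swapping it for a small classifier wouldn't change the routing shape.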