Have you tried their flash model? Pro was too slow for me too, but I've found flash to be more than capable, and it's faster than GPT-5.5 at medium.

It's actually on my list this week to put together an MVP of an intelligence escalation flow. My initial assumption is that flash is good enough for 60-80% of my users' workflows, with only the tricky questions needing a more capable model. Whether I can build a proper detection system remains to be seen.
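
For anyone curious what that escalation flow could look like, here's a rough sketch, not a finished design. Everything in it is a placeholder: the tier names `flash` and `pro` are just labels rather than real API model IDs, `call_model` is a hypothetical function you'd wire up to your own client, and `looks_tricky` is a deliberately crude stand-in for the detection system, which is the actual hard part.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical tier labels, not real API model identifiers.
FAST_MODEL = "flash"
STRONG_MODEL = "pro"


@dataclass
class Answer:
    text: str
    model: str
    escalated: bool


def looks_tricky(prompt: str, max_words: int = 400) -> bool:
    """Crude placeholder detector: flags long prompts or certain keywords.

    The keyword list and word limit are made-up assumptions; a real
    detector would need tuning against actual user traffic.
    """
    keywords = ("prove", "step by step", "refactor", "legal")
    lowered = prompt.lower()
    too_long = len(prompt.split()) > max_words
    return too_long or any(k in lowered for k in keywords)


def answer_with_escalation(
    prompt: str,
    call_model: Callable[[str, str], str],
    detector: Callable[[str], bool] = looks_tricky,
) -> Answer:
    """Send the prompt to the fast tier unless the detector flags it."""
    if detector(prompt):
        return Answer(call_model(STRONG_MODEL, prompt), STRONG_MODEL, True)
    return Answer(call_model(FAST_MODEL, prompt), FAST_MODEL, False)
```

Passing `call_model` in as a parameter keeps the routing logic testable with a fake client, and swapping `detector` for something smarter later doesn't touch the rest.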

The biggest issue I've had with flash is that it seems to hit a sort of "dumb o'clock" wall: right around the time Beijing would be going to work, response quality drops noticeably on instruction-heavy tasks once context grows beyond ~120k tokens.

Responses are still usable, no hallucinations or anything, but it's worth keeping in mind if you rely on detailed instructions or large context windows.
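
A rough sketch of one way to flag requests that fall into that window, so they can be routed elsewhere or trimmed. The 120k threshold comes from the figure above; the 08:00-10:59 Beijing-time range is purely a guess at "going to work" hours, and `in_risky_window` is a made-up helper name.

```python
from datetime import datetime, timedelta, timezone

# China Standard Time is UTC+8 year-round (no daylight saving).
BEIJING = timezone(timedelta(hours=8))

CONTEXT_LIMIT_TOKENS = 120_000  # the ~120k figure mentioned above
RISKY_HOURS = range(8, 11)      # assumption: 08:00-10:59 Beijing time


def in_risky_window(now: datetime, context_tokens: int) -> bool:
    """Return True when both conditions hold: Beijing morning and big context.

    `now` must be a timezone-aware datetime; it is converted to Beijing time.
    """
    local = now.astimezone(BEIJING)
    return local.hour in RISKY_HOURS and context_tokens > CONTEXT_LIMIT_TOKENS
```

Call it with something like `in_risky_window(datetime.now(timezone.utc), token_count)` and adjust the hour range once you have your own observations.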