I think Qwen 3.7-Plus is better at reasoning than Mythos, and I've used both for quite a while.

Would love to see samples of the kinds of prompts you use with both. I sometimes wonder if the specific wording is the secret sauce. I have very few issues with Opus / Claude, but when I try premier GPT models, I get output that's weird compared to what I've grown to expect with Claude.