> gpt-oss-120b is amazing

Yup, I agree. It's easily the best model you can run on local hardware today, especially with reasoning_effort set to "high", though "medium" does very well too.
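
For anyone who hasn't played with that knob: here's a minimal sketch of setting reasoning_effort against a local OpenAI-compatible server. The base URL and model name are placeholders for whatever your runner exposes, and note that some runners expect the effort level in the system prompt (e.g. a "Reasoning: high" line) rather than as a request parameter.

```python
from openai import OpenAI

# Point the client at a local OpenAI-compatible server.
# URL and model name are assumptions; adjust for your runner.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="gpt-oss-120b",
    reasoning_effort="high",  # "low" | "medium" | "high"
    messages=[
        {"role": "user", "content": "Explain the birthday paradox in two sentences."}
    ],
)
print(response.choices[0].message.content)
```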

I think a lot of people missed how good it is because several of the runners botched their implementations at launch; it wasn't until 2-3 weeks later that you could evaluate it properly. Once I could run the evaluations myself on my own tasks, it became evident how much better it is.

If you haven't tried it yet, or only tried it right after release, do yourself a favor and try it again with updated runners.