I completely agree. My first impression is that 3.7 Sonnet is still the superior coding model. I'm now suspicious of how they decide these benchmarks...