> Which ones are you looking at? Since the benchmark comparison in the blogpost itself doesn't include Opus at all.
I manually compared it with the values from the benchmarks they published when they originally announced the Claude 3 model family[0].
Not every row has a 1:1 counterpart in the current benchmarks, but I think it paints a good enough picture.