They seem… much better than all the models they compared against? What’s the catch?
They only showed the benchmarks where they outperformed?
It's twice the size?