There are lots of benchmarks that score different models on the same absolute scale, as opposed to judging them by vibes (my apologies for the shorthand) and the like.