Tracking model performance on Artificial Analysis makes me think these models are constantly optimized/tuned in some way or another. GPT 5.5 was scoring in the mid 60's when it was first released, now it's almost 10 points higher.
Tracking model performance on Artificial Analysis makes me think these models are constantly optimized/tuned in some way or another. GPT 5.5 was scoring in the mid 60's when it was first released, now it's almost 10 points higher.