> What is notable is that a large amount of the total improvement of models has been in the coding benchmark. The coding index has gone from 15 months behind to only a month or two behind

This makes sense, right? Coding is one of the most obvious short-term uses of models. It also has a ready-made market willing to pay a lot for tokens, a huge corpus to work with, and a significant degree of validation built into the problem domain...