I would imagine it will be a fundamental breakthrough, not weights alone, that is going to usher in the next generation of AI. Perhaps China will in fact make that breakthrough. They certainly seem to have a lot of eyeballs in the field right now.
I think they are already massively winning on efficiency... which is about to matter a lot as the frontier models jack up their prices in order to someday see a profit (and no, Anthropic getting massively subsidized by Elon out of spite doesn't count for long-term profits).
There has really been one breakthrough: the actual construction of giant LLMs from the available titanic corpus of text. Even that barely involved much conceptual breakthrough, a few things maybe, e.g. the transformer. Basically it was a question of the accessibility of a) a giant internet corpus of actual people actually saying stuff and b) adequate computing power. The witty surface training, the scaffolding for a chatbot, is what made a universal stir. With this, though, we are done with revolutionary breakthroughs. Training for coding involves actual alteration of weights - and as it improves, the general utility of the corresponding models will fail. In the end it will be a domain of specialized models. The improvement of this aspect via RLVR etc. is what caused a general mania in the programmer milieu.
There is a lot of money in pretending that we are seeing unending revolutions.