Open source models don't need to be anywhere near as good as Claude Mythos or even Claude Sonnet to 'win'.
Open source 'winning' just means that there exists at least one open source alternative to closed models which is as good as, say, GPT 4... I mean, we're essentially there already with Google Gemma models.
As a software engineer, I haven't noticed any difference in my productivity since Sonnet. Of course Opus is better and I'm sure Fable is better yet, but we're already hitting diminishing returns in terms of economic value.
I went from Cursor running one of the earlier GPT models to Claude Code on Sonnet and that was essentially a 5x productivity boost for me. Before Claude Code, I only used AI for small snippets. With Claude Code + Sonnet, I could trust it for entire sub-tasks... But I still don't trust Opus with full end-to-end features. I'm not sure it will ever get there. It probably doesn't need to.
Companies need software engineers to have a certain moderately high level of talent but above that level, they really don't care AT ALL. They don't even notice the difference, even if the gap is significant.
> Open source 'winning' just means that there exists at least one open source alternative to closed models which is as good as, say, GPT 4... I mean, we're essentially there already with Google Gemma models.
Is this really true? We just don't know what the maximum capability of AI is. If it turns out AI can be as intelligent and capable as something like Data from Star Trek, no one is going to be thinking GPT 4 is good enough.
> We just don't know what the maximum capability of AI is
In theory, there is no limit. That's what the latest loop engineering trend is about: you ask the AI to solve a problem by listing steps and, if no solution is found within those steps, to treat each step as a separate problem and repeat the process until the master solution to the master problem is found.
Once a solution is found, or new data/insights are generated through this process, the LLM can be trained on this. So in theory you can just keep going like this forever.
Secondly, this is as close to agency as you can build inside a machine.
Practically speaking, hardware is a limit. But that can scale up with time.
So we are already looking at some kind of runaway intelligence even if not sentient.
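To make the loop described above concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions: `attempt` and `decompose` are hypothetical stand-ins for model calls, the depth cap stands in for the practical hardware limit, and the "training data" is just a list of (problem, solution) pairs that a later fine-tuning step could consume. It tries a direct answer first and only then breaks the problem into steps, which is one possible reading of the process.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical stand-ins for model calls; a real system would prompt an LLM here.
AttemptFn = Callable[[str], Optional[str]]  # returns a solution, or None if stuck
DecomposeFn = Callable[[str], list[str]]    # lists the steps of a problem


@dataclass
class LoopResult:
    solution: Optional[str]
    # (problem, solution) pairs gathered along the way, usable as training data
    training_examples: list[tuple[str, str]] = field(default_factory=list)


def solve(problem: str, attempt: AttemptFn, decompose: DecomposeFn,
          max_depth: int = 3, _log: Optional[list] = None) -> LoopResult:
    log = [] if _log is None else _log

    # Try to solve the problem directly.
    answer = attempt(problem)
    if answer is not None:
        log.append((problem, answer))
        return LoopResult(answer, log)

    # Depth cap: in practice compute is finite, so the recursion must stop.
    if max_depth == 0:
        return LoopResult(None, log)

    # Otherwise treat each step as its own problem and repeat the process.
    steps = decompose(problem)
    if not steps:
        return LoopResult(None, log)

    partials = []
    for step in steps:
        sub = solve(step, attempt, decompose, max_depth - 1, log)
        if sub.solution is None:
            return LoopResult(None, log)
        partials.append(sub.solution)

    # Combine the step solutions into the "master" solution.
    combined = " + ".join(partials)
    log.append((problem, combined))
    return LoopResult(combined, log)


# Toy stand-ins so the sketch runs without any model.
KNOWN = {"a": "A", "b": "B", "c": "C"}
SPLITS = {"abc": ["a", "bc"], "bc": ["b", "c"]}


def toy_attempt(problem: str) -> Optional[str]:
    return KNOWN.get(problem)


def toy_decompose(problem: str) -> list[str]:
    return SPLITS.get(problem, [])


result = solve("abc", toy_attempt, toy_decompose)
```

With the toy lookups, "abc" can't be solved directly, so it gets split into "a" and "bc", and "bc" gets split again. Every solved problem along the way ends up in `result.training_examples`, which is the part that would feed back into training. With `max_depth=1` the same call gives up, which is the hardware-limit point in miniature.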
Yeah, the latest models are really good. For implementing leetcode-type solutions, Claude Opus is smarter than essentially all engineers I've ever worked with and smarter than me as well. The one area where I beat it hands-down is technical decision-making; it sucks at architecture, maintainability, performance and scalability.
Agency seems to correlate with the ability to make good decisions. It's kind of surprising how much agency is required to make good technical decisions. It's not even about business domain knowledge; a lot of agency is needed even in a pure tech context.
It could get really smart but I'm confident in my thesis that surplus intelligence beyond a certain level doesn't yield any real economic benefits.
At scale, I can see a benefit in being able to process large amounts of data intelligently to gain a competitive advantage and accrue nominal gains, but I think that as long as AI is pursuing dollars, those gains won't translate to real value for the people who control the AI. At best, they will translate to more political control, but with added risks and threats too. I suspect it will look more like controlled decline, with a small number of entities getting an increasingly large slice of a rapidly shrinking pie.
I think AI may just figure out really complex ways to legally steal people's money. It will probably look all legit on the surface; it will look like the majority of people are just freakishly unlucky and a tiny number of elites are just extremely lucky... But it will be AI behind the scenes orchestrating seemingly random events, choosing who gets lucky and who doesn't.
It might end up literally like a game of Monopoly. One player could dominate the game and start receiving all the money but, if you look at the big picture, none of the players are doing anything economically useful; they are just sitting around a board and moving pieces of paper amongst each other.
It's like the industrial revolution. Many kings and emperors did not like the idea of industrialization because they were already living a luxurious life and understood that it would not benefit them and would only create risks and problems for them personally. They could already afford as many human servants as they needed, so what was the point of replacing them with machines to provide the same service they already received? It would give their servants more free time? To an emperor, that would have sounded more like a problem than a solution. It's a bit like that with AI. The people who control AI won't benefit from it beyond what they already have. If it doesn't serve a social cause then it serves nobody.
The Gemma models are tiny, not really comparable to DeepSeek Pro, Kimi or GLM. But the broader point stands.