AI improving itself (or at least the architecture it runs on): the singularity is near, as they say.
Do we have other examples of AI being used to improve LLMs, apart from the creation of synthetic data and the testing of models?
There is an apples and oranges difference between AI improving itself (becoming more capable) and AI optimizing software that happens to be used for AI training or inference.
A more efficient transformer just costs less to run.
"AI improving AI" would be if one generation of AI designed a next-gen AI that was fundamentally more capable (not just faster/cheaper) than itself. A reptilian brain that could autonomously design a mammalian brain.
Even when hooked into a smart harness like AlphaEvolve, I don't think LLMs have the creativity to do this, unless the next-gen architecture is hiding in plain sight as an assemblage of parts that an LLM can be coaxed into predicting.
More likely it'll take a few more steps of human innovation, steps towards AGI, before we have an AI capable of autonomous innovation rather than just prompted mashup generation.
I don't think there is a fundamental divide between implementation speedups/optimizations and algorithmic/architectural optimizations.
A speedup that changes nothing else is just that: a speedup that changes nothing else.
> Do we have other examples of AI being used to improve the LLMs
Yes: last year, when they revealed AlphaEvolve, they used a previous Gemini model to improve kernels used in training this generation's models, netting them a ~1% faster training run. Not much, but still.
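For the curious, the AlphaEvolve-style setup is essentially an evolutionary search over code: an LLM proposes variants, a benchmark scores them, and the best survives. A toy sketch of that loop (the fitness function here is a made-up stand-in for real kernel timing, and `mutate` is where the LLM rewrite would go, not random noise):

```python
import random

def evaluate(program):
    """Score a candidate; a toy stand-in for benchmarking a kernel.
    Here a 'program' is just a list of numbers and the score is how
    close their sum is to a target (higher is better)."""
    target = 10.0
    return -abs(sum(program) - target)

def mutate(program):
    """Produce a variant. In AlphaEvolve this step is an LLM
    rewriting code; here it's a random perturbation."""
    child = list(program)
    i = random.randrange(len(child))
    child[i] += random.uniform(-1.0, 1.0)
    return child

def evolve(seed, generations=200):
    """(1+1)-style evolutionary loop: keep the best candidate seen."""
    best, best_score = seed, evaluate(seed)
    for _ in range(generations):
        cand = mutate(best)
        score = evaluate(cand)
        if score > best_score:
            best, best_score = cand, score
    return best, best_score

best, score = evolve([1.0, 2.0, 3.0])
print(f"best score: {score:.3f}")
```

The point of the harness is that only measured improvements survive, so the LLM's hit rate per proposal matters less than the cost of each evaluation.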
Self-improving doesn't necessarily imply a singularity, right?
There could still be hard constraints that make a singularity intractable, or just such a long time horizon that it's not practical, right?
I feel like the most viral one lately is https://github.com/karpathy/autoresearch
> AI improving itself
This is the thing to look for in 2027, imho. All the big AI labs have big projects working on research agents, also specifically into improving AI (duh) and I expect a lot of that to get out of the experimental phases this year.
Next year they actually get to do a lot of work and I think we will see the first big effective architectural change co-invented by AI.
And then in 2028 we will be selling ice cream at the beach.
Shameless plug: https://huggingface.co/spaces/smolagents/ml-intern
It’s a simple harness around Opus, but with tight integration with Hugging Face infra, so the agent can read papers, test code, and launch experiments.
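The core of a harness like that is a dispatch loop: the model picks a tool and an argument, the harness executes it, and the observation goes back into the model's context. A minimal sketch of just the dispatch part (tool names and behaviors here are invented for illustration, not the actual ml-intern API):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    run: Callable[[str], str]

# Toy tools standing in for "read papers, test code, launch experiments".
tools = {
    "read_paper": Tool("read_paper", lambda arg: f"summary of {arg}"),
    "run_code": Tool("run_code", lambda arg: str(eval(arg))),  # demo only
}

def agent_step(action: str, arg: str) -> str:
    """One step of the harness: dispatch a model-chosen action to a tool.
    A real harness would parse (action, arg) out of the LLM's response
    and append the returned observation to the conversation."""
    if action not in tools:
        return f"unknown tool: {action}"
    return tools[action].run(arg)

print(agent_step("run_code", "2 + 2"))  # prints 4
```

The interesting engineering is all in what's elided here: sandboxing the code execution, truncating observations to fit the context, and deciding when the agent is done.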
What are the benchmarks for this, in terms of computation cost, error rate, and cost to converge?
Re: hyperparameter tuning and autoresearch: https://news.ycombinator.com/item?id=47444581
Parameter-free LLMs would be cool
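For context on the hyperparameter-tuning angle: the simplest automated tuner is random search over the config space, keeping whichever config scores best. A toy sketch (the objective is a made-up stand-in for a real training/eval run, and the "optimal" values are invented):

```python
import random

def random_search(objective, space, trials=50, seed=0):
    """Toy hyperparameter search: sample configs uniformly, keep the best.
    'space' maps each hyperparameter name to a (low, high) range;
    lower objective is better."""
    rng = random.Random(seed)
    best_cfg, best_val = None, float("inf")
    for _ in range(trials):
        cfg = {k: rng.uniform(lo, hi) for k, (lo, hi) in space.items()}
        val = objective(cfg)
        if val < best_val:
            best_cfg, best_val = cfg, val
    return best_cfg, best_val

# Stand-in objective: pretend loss is minimized at lr=0.1, wd=0.01.
loss = lambda c: (c["lr"] - 0.1) ** 2 + (c["wd"] - 0.01) ** 2
cfg, val = random_search(loss, {"lr": (0.0, 1.0), "wd": (0.0, 0.1)})
print(f"best loss: {val:.4f}")
```

An LLM-driven tuner replaces the uniform sampling with proposals conditioned on the results so far, which is where the "autoresearch" framing comes in.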
Singularities are a sign that you have a broken model.
The hard part about this is that for every few 'WOW's there's a lineage of 'you dumbass'.
I mean, if you can create a harness to filter those two apart, sure, singularity away; it's really hard to see how someone's going to do that.