There is an apples-and-oranges difference between AI improving itself (becoming more capable) and AI optimizing software that happens to be used for AI training or inference.
A more efficient transformer just costs less to run.
"AI improving AI" would be if one generation of AI designed a next-gen AI that was fundamentally more capable (not just faster/cheaper) than itself. A reptilian brain that could autonomously design a mammalian brain.
Even when hooked into a smart harness like AlphaEvolve, I don't think LLMs have the creativity to do this, unless the next-gen architecture is hiding in plain sight as an assemblage of parts that an LLM can be coaxed into predicting.
More likely it'll take a few more steps of human innovation, steps toward AGI, before we have an AI capable of autonomous innovation rather than just prompted mashup generation.
I don't think there is a fundamental divide between implementation speedups/optimizations and algorithmic/architectural optimizations.
A speedup that changes nothing else is just that: a speedup that changes nothing else.