In case you are unfamiliar with Karpathy's Loop[1], it is a genetic algorithm[2] where the genetic "mutations" are clever-but-random ideas generated by an LLM agent, aimed at improving a system.

  (1) Let the LLM randomly perturb the system.
  (2) Measure the system's performance.
  (3a) If the perturbation improved performance, keep the change.
  (3b) Otherwise, don't.
  (4) Repeat
[1] https://github.com/karpathy/autoresearch

[2] https://en.wikipedia.org/wiki/Genetic_algorithm
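The loop above is easy to sketch in Python. Here `llm_propose_change` and `measure` are hypothetical stand-ins for the LLM agent and whatever benchmark you run; the toy demo at the bottom just tweaks numbers:

```python
import random

def karpathy_loop(system, measure, llm_propose_change, iterations=100):
    """Keep an LLM-proposed change only if it improves the measured score."""
    best_score = measure(system)
    for _ in range(iterations):
        candidate = llm_propose_change(system)  # (1) perturb the system
        score = measure(candidate)              # (2) measure performance
        if score > best_score:                  # (3a) keep improvements
            system, best_score = candidate, score
        # (3b) otherwise discard the change     # (4) repeat
    return system, best_score

# Toy demo: the "system" is a list of numbers, "performance" is their sum,
# and the "LLM" just nudges one entry at random.
def toy_llm(sys_):
    out = list(sys_)
    out[random.randrange(len(out))] += random.choice([-1, 1])
    return out

best, score = karpathy_loop([0, 0, 0], sum, toy_llm, iterations=500)
```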

I was working on LLM-assisted optimization and algorithm discovery some time ago, and this does not look like a novel idea.

AlphaEvolve from Google is an evolutionary algorithm that uses LLMs for idea generation, following a very similar loop:

- https://deepmind.google/blog/alphaevolve-a-gemini-powered-co...

- Open source implementation of the algorithm: https://github.com/algorithmicsuperintelligence/openevolve

It is not novel - but with the new models it is just becoming practical.

I mean, this is such low hanging fruit, you have to be careful not to step on it.

Just because it is a nice meme, I want to throw in Schmidhuber's work (do not treat this comment as serious unless you are Schmidhuber himself):

* Gödel Machine (2006-2007) [1]

* Optimal Ordered Problem Solver (2002) [2]

* Meta-Learning and Artificial Curiosity (1990s onward) [3]

[1] https://arxiv.org/html/2505.22954v3

[2] https://arxiv.org/abs/cs/0207097

[3] https://evolution.ml/pdf/schmidhuber.pdf

Edit: markdown formatting

Nice references! Thanks!

A genetic algorithm keeps a population, and there is a "crossover" operation.

I don't see either ingredient in Karpathy's proposed scheme.

That's not a genetic algorithm, that's stochastic gradient descent.

To be a genetic algorithm it would need to have mutation (which you have here) and crossover (which you don't).
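For contrast, here is what a minimal genetic algorithm looks like with both ingredients, a population plus crossover, on a toy bit-string problem. All names and parameters are illustrative, not anyone's actual implementation:

```python
import random

def evolve(pop_size=20, genome_len=16, generations=50, mut_rate=0.1):
    """Toy GA maximizing the number of 1-bits in a bit-string genome."""
    fitness = sum  # count of 1-bits
    pop = [[random.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]            # selection: keep top half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, genome_len)
            child = a[:cut] + b[cut:]             # crossover (single point)
            child = [g ^ (random.random() < mut_rate) for g in child]  # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
```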

I agree it's not a genetic algorithm, but, it's also not stochastic gradient descent. There is no gradient. The "step direction" (code modification) is chosen by an LLM, which is "smart enough" to guess something that might be an improvement.

It is rather a variation of hill climbing [1]. As others pointed out evolutionary algorithms employ a richer set of search operators.

[1] https://en.wikipedia.org/wiki/Hill_climbing
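A bare-bones hill-climbing sketch for comparison, on a simple numeric objective rather than code edits (purely illustrative):

```python
import random

def hill_climb(f, x0, step=0.1, iterations=1000):
    """Accept a random neighbor only if it scores at least as well."""
    x, fx = x0, f(x0)
    for _ in range(iterations):
        neighbor = x + random.uniform(-step, step)
        fn = f(neighbor)
        if fn >= fx:
            x, fx = neighbor, fn
    return x

# Maximize f(x) = -(x - 3)^2, whose single peak is at x = 3.
x_best = hill_climb(lambda x: -(x - 3) ** 2, x0=0.0)
```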

This is like idiocracy for Software Devs at this point

Is it? Evolution also seems to be the result of semi-random crap over the span of millennia, and nobody is critiquing it like that.

Why should throwing ideas at the wall in regards to optimizing code be any different: as long as you can measure and verify it, are okay with added complexity, and are capable of making the code itself not be crap by the end of it?

If an approach is found that improves how well something works, you can even treat the AI slop as a draft and iterate upon it yourself further.

It's basically saying to randomly slop something and see if it gets better. Evolution has physical principles and guard rails backing it. Here there are no principles whatsoever, just slopping the slopper to see if it's somehow less sloppy than writing a gist with a slop machine.

I wouldn't call it Karpathy's loop; I'd call it slop descent. Or descent into slop. Or something like that.

Evolution very much involves random mutations that turn out useless or harmful and thus don't spread.

This is in fact less random than how genetic algorithms used to work traditionally, which encoded behaviors in some data structure that then got randomly mutated or crossed with other candidates in the pool.

I am aware of what biological evolution is. This isn't analogous. I love my software friends, I'm a software person now too, but the level at which people take algorithms that involve any level of biomimicry as a model for actual biology is frustrating.

Is slop verifiable? If so, we can throw it in the loop... The point is that this loop can be pointed at any verifiable work. Yeah, you are seeing it raw; the verifier is the principle you talked about. Yes, it was fully AI generated. It will be refined.

It does burn holes in one's brain, doesn't it... At least with the silly sorting algorithms we know they are supposed to be silly...

I actually do it differently.

> (1) Let the LLM randomly perturbate the system.

Instead of this, I ask the LLM what's least likely to improve performance and then measure it.

Sometimes big gains come from the places you thought were least likely.

For sure! The hypothesis generation has got to be improved. Your take on the "least likely" is interesting. In the beginning of the repo I was having problems with "hypothesis convergence"; your idea may be a nice way to introduce the much-needed variability.

You're missing a step. The perturbations are not fully random. The LLM also looks at the result and tries to do credit assignment to determine what changes to try in the next round.

Lol, I respect Karpathy a lot, but this is such an obvious, in-your-face idea that it is laughable to put someone's name on it.

What's next, "Karpathy investing", where AI in a loop builds a portfolio?

I believe Karpathy himself called it autoresearch, not the Karpathy Loop, but in a vacuum of names around AI it seems to be very easy to meme-drop a name, and then come influencer efforts to cool-name and normalize it. See vibecoding.

I'd go a step further and say that sort of loop is probably the first thing most people who play around with agent harnesses try; pretty much the first "Hmm, what should I do now?" thing that pops into people's heads.

It's less the idea and more the simplicity of it. It's a distillation of something that works and lets newer practitioners get their feet wet before moving on to more complex implementation.

Actually having a harness for it is nice though, vs. prompting it iteratively yourself.

Call it a K-loop, please. Where AI in a K-loop builds a portfolio.

Wtf, this has a name now? I thought of this exact idea literally months ago but never had the time to do any experiments on it.

At the time I dismissed it as potentially being incredibly expensive for the improvement you get, and as running into the typical pitfalls of evolutionary algorithms (in the same way evolution doesn't let an organism grow a wheel, your LLM evolution algorithm will never come up with something that requires a far bigger leap than what you allow the LLM to perturb in a single step. Also, the genetic algorithm will probably result in a vibecoded mess of short-sighted decisions, just like evolution creates a spaghetti genome in real life.)

I'll definitely need to look into how people have improved the idea and whether it is practical now.

This is not a new idea at all, many many have had it, no one really can claim it

Stigler's law of eponymy https://en.wikipedia.org/wiki/Stigler%27s_law_of_eponymy

Wikipedia has humor:

> The same observation had previously also been made by many others.

I genuinely laughed reading the first words. Yeah, it's hard to be novel.

Don’t worry, Twitter bros already coined it.

Genetic algorithms have existed since the 60s/70s, e.g. computers learning to play a game. LLMs aren't particularly good at it.

I think hyperparameter tuning may actually be a kind of genetic algorithm.

Hyperparameter tuning could be done by genetic algorithm. I think it’s a bit of a category error to say that it is a genetic algorithm though.

Hyperparam tuning is usually done by Bayesian Optimization though.

Yeah, that's correct, it could use one, but there are better alternatives for this particular problem.
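For what it's worth, the simplest baseline here is random search; a GA or Bayesian optimizer only changes how the next candidate config is proposed. A hypothetical sketch, with a toy objective standing in for validation accuracy:

```python
import math
import random

def random_search(score, sample_config, trials=50):
    """Try random hyperparameter configs and keep the best-scoring one."""
    best_cfg, best_score = None, float("-inf")
    for _ in range(trials):
        cfg = sample_config()
        s = score(cfg)
        if s > best_score:
            best_cfg, best_score = cfg, s
    return best_cfg, best_score

# Toy objective: pretends validation accuracy peaks at lr = 0.01.
def toy_score(cfg):
    return -abs(math.log10(cfg["lr"]) + 2)

best_cfg, s = random_search(
    toy_score,
    lambda: {"lr": 10 ** random.uniform(-5, 0)},  # log-uniform learning rate
    trials=200,
)
```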

You know this doesn’t work most of the time…

Thanks, I thought that as a researcher Karpathy would include and cite relevant papers. I quickly became disappointed. I already knew openevolve and the ACE framework paper. This is the first time I've learned about genetic algorithms, and I now have a clear roadmap for studying.