I think this is precisely why I don't mind it that much. I can't audit a huge codebase like a JavaScript runtime, whether the code is by a human from scratch or not. I just have to trust it as a black box.
I've seen LLMs produce terrible code indeed, but I have also seen humans produce terrible code. I haven't dug into JS runtimes specifically but have read plenty of code in OpenJDK and CPython - there are many places that could be done better, but there's also no point in changing them since it's working, and keeping working code unchanged tends to be a smart decision in software engineering.
So of course the last point brings up whether it was a good idea to rewrite Bun if it was working. Apparently the Bun team thought the difficulty of getting changes accepted upstream in Zig meant it was. I don't intend to hold LLM code to a higher bar than human code - notably, if the runtime continues to work, that is as good as I can expect from what is otherwise a huge black box of extreme programming (not that agile kind).
The difference is you can evaluate a small bit of the output of a human or a team of humans and expect all their other code to be roughly in the same ballpark of quality.
An LLM can’t be trusted to produce code and make higher level project structure choices of the same quality at all times, because it can’t be trusted at all - trust is for deterministic systems. But still it begs us to trust it. Every prompt that yields good results sets us up to expect good results, so we get lazy - and then on the next prompt it spews out garbage.
As long as the odds are good enough (and/or you know the distribution), there is nothing wrong with relying on and profiting from stochastic systems, even though not every outcome is positive. What matters is the sum of outcomes, not the individual ones.
It means you need to be able to handle failure, but you should always have a good grip on how to correct course anyway if you intend to put things out into the real world, which always messes everything up.
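To put rough numbers on the "sum of outcomes" point, here is a minimal sketch. Every figure in it is made up for illustration (the 90% success rate, the 30 minutes saved, the 90 minutes lost), and the function names are hypothetical; it only shows how the expected value and the break-even success rate fall out once you assume a distribution.

```python
# All numbers are hypothetical, chosen only to illustrate the argument.

def expected_net_gain(p_good, gain_good, cost_bad):
    """Average net gain per prompt.

    p_good    -- assumed probability that the output is usable
    gain_good -- time saved (minutes) when the output is usable
    cost_bad  -- time lost (minutes) catching and fixing a bad output
    """
    return p_good * gain_good - (1 - p_good) * cost_bad


def break_even_probability(gain_good, cost_bad):
    """Success rate below which relying on the system loses time on average."""
    return cost_bad / (gain_good + cost_bad)


# Assumed scenario: a good result saves 30 minutes, a bad one costs 90.
per_prompt = expected_net_gain(0.9, 30, 90)   # about 18 minutes gained on average
threshold = break_even_probability(30, 90)    # 0.75

print(f"expected gain per prompt: {per_prompt:.1f} min")
print(f"break-even success rate:  {threshold:.0%}")
```

Under these assumed numbers the system pays off even though one prompt in ten produces garbage, but only as long as the real success rate stays above the break-even point, which is why knowing the distribution matters.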
Sure, but that’s not how most LLM coding is done, because if a human has to carefully supervise the LLM then what’s the point - might as well write it yourself.
Add to that, we’re very good at anthropomorphizing, and very bad at supervising systems that are usually right. Makes for a mess.
Oh, and this all relies on the AI providers not changing things up behind the scenes and feeding you a dumber model sometimes.