> I’ve often heard, with decent reason, an LLM compared to a junior colleague. But I find LLMs are quite happy to say “all tests green”, yet when I run them, there are failures. If that was a junior engineer’s behavior, how long would it be before H.R. was involved?

Reminds me of a recent experience when I asked CC to implement a feature. It wrote some code that struck me as potentially problematic. When I said, "why did you do X? couldn't that be problematic?" it responded with "correct; that approach is not recommended because of Y; I'll fix it". So then why did it do it in the first place? A human dev might have made the same mistake, but they wouldn't have made it while knowing it was a mistake.

It did so because its training, weights, and context, plus billions of matrix calculations, led it there.

I think this gets lost in far too many discussions. There is no “why”; it’s statistics, and the closest we have to a “why” is that a lot of code out there sucks.