With a sufficiently complex prompt and a sufficiently complex codebase, LLMs consistently fail and make mistakes, "forget" parts of the prompt, etc.

There's no comparison to be made between this and, for example, a compiler. It's an incompetent comparison.

> I don’t think you’re thinking about this clearly.

My literal job is dealing with layers of abstraction. I'm thinking pretty clearly when I tell you that, not only are LLMs a super leaky, terrible abstraction, they are also not comparable to any other layers of abstraction. All other layers of abstraction we use are well understood, predictable (as you put it), and DEBUGGABLE.

When claude deletes a fix it did two weeks ago, while trying to fix some unrelated error, do you never stop and think "this is not quite the same as what GCC does"?