> This bug is categorically distinct from hallucinations.
Is it?
> after using it for months you get a ‘feel’ for what kind of mistakes it makes, when to watch it more closely, when to give it more permissions or a longer leash.
Do you really?
> This class of bug seems to be in the harness, not in the model itself.
I think people are using the term "harness" too indiscriminately. What do you mean by harness in this case? Just Claude Code, or...?
> It’s somehow labelling internal reasoning messages as coming from the user, which is why the model is so confident that “No, you said that.”
How do you know? Because it looks to me like it could be a straightforward hallucination, compounded by the agent deciding it was OK to take a shortcut that you really wish it hadn't.
For me, this category of error is expected, and I question whether your months of experience have really given you the knowledge about LLM behavior that you think it has. You have to remember at all times that you are dealing with an unpredictable system, and a context that, at least from my black-box perspective, is essentially flat.