As the article says: this doesn’t necessarily appear to be a problem in the LLM, it’s a problem in Claude code. Claude code seems to leave it up to the LLM to determine what messages came from who, but it doesn’t have to do that.
There is a deterministic architectural boundary between data and control in Claude code, even if there isn’t in Claude.
That's a guess by the article author and frankly I see no supporting evidence for it. Wrapping "<NO THIS IS REALLY INPUT FROM THE USER OK>" tags around it or whatever is what I'm describing: you can do as much signalling as you want, but at the end of the day the LLM can ignore it.
Can you elaborate? As far as I understand, for each message, the LLM is fed the entire previous conversation with special tokens separating the user and LLM responses. The LLM is then entrusted with interpreting the tokens correctly. I can't imagine any architecture where the LLM is not ultimately responsible for determining what messages came from who.