> There was only one small issue: it was written in the programming language and with the library it had been told not to use. This was not hidden from it. It had been documented clearly, repeatedly, and in detail. What a human thing to do.

"Ignoring" instructions is not human thing. It's a bad LLM thing. Or just LLM thing.

The work I've done best in my life (smashing deadlines, rescuing projects) has so often come about because I've been willing to push back on requirements, even explicitly stated ones. When clients have tried to replace me with a cheaper alternative (and failed), the main difference I notice is that the cheaper person is used to being told exactly what to do.

Maybe this is more anthropomorphising, but I think this pushing back is exactly the result the LLMs are giving; we're just expecting a bit too much of them in terms of follow-up, like: "ok, I double checked and I really am being paid to do things the hard way".

To be fair, there is likely not much training data on the difficult conversations you need to handle in a senior position, pushback being one of them. The trouble with the agents is that their pushback comes post hoc, as rationalising to explain themselves, rather than as a "help me understand" beforehand.

It's not necessarily "ignoring" instructions; it's the ironic effect where mentioning something not to focus on produces focus on that very thing. The classic version is: "For the next minute, try not to think about a pink elephant. You can think about anything else you like, just not a pink elephant."

https://en.wikipedia.org/wiki/Ironic_process_theory

Yes, exactly. But for LLMs it's more that the model isn't really "thinking" about what it's saying per se; it's predicting the next token. Sure, in a super fancy way, but still predicting the next token. Context poisoning is real.