> LLMs ... lack the ability to hold a series of precepts "in mind" and stick to those precepts.
That is perhaps the biggest weakness I've noticed lately, too. When I let Claude Code carry out long, complex tasks in YOLO mode, it often fails because it has stopped paying attention to some key requirement or condition. And this happens long before it has reached its context limit.
It seems like this kind of drift should be avoidable through better agent design, but I don't know how to do it.
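One naive idea (purely a sketch on my part, not anything Claude Code actually does) would be to re-inject the precepts into every turn and force a self-audit against them before the agent is allowed to declare itself done. In the sketch below, `call_model` is a hypothetical stand-in for whatever LLM call the agent wraps:

```python
# A minimal sketch: restate the task's key requirements on every turn and
# require a checklist audit before accepting "DONE". `call_model` is a
# hypothetical function, not a real Claude Code API.

from typing import Callable, List

def run_agent(
    task: str,
    precepts: List[str],
    call_model: Callable[[str], str],
    max_turns: int = 20,
) -> str:
    """Drive a simple agent loop that restates the precepts on every turn."""
    history = f"Task: {task}\n"
    checklist = "\n".join(f"- {p}" for p in precepts)

    for _ in range(max_turns):
        prompt = (
            f"{history}\n"
            f"Requirements you must still satisfy:\n{checklist}\n\n"
            "Continue the task. If you believe you are done, reply with DONE."
        )
        reply = call_model(prompt)
        history += f"\nAssistant: {reply}\n"

        if reply.strip().startswith("DONE"):
            # Second pass: ask the model to audit its own work against the
            # checklist instead of trusting the first DONE.
            audit = call_model(
                f"{history}\nRe-check every requirement below against the work "
                f"above. Answer PASS or FAIL with a reason for each:\n{checklist}"
            )
            if "FAIL" not in audit:
                return reply
            history += f"\nAudit: {audit}\n"  # feed the failures back into the loop

    return history
```

Whether repeating the checklist like this actually keeps the model's attention on it, or just adds noise, is exactly the part I don't know.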