One issue with that is that human helpers last longer. LLMs cycle in and out in months, and what held for Your Favorite LLM 6.7 may not hold for Your Favorite LLM 6.9.
Right, this is why I would slam the brakes on investing all of your time and effort into your workflow, because two months from now it may be out the window. Frontier models are also constantly being tweaked, so what worked yesterday may be off today.
ChatGPT used to be obedient with the grill-me technique and just wrote a plan. Yesterday it started jumping straight to implementation. Why?
I find that when an LLM jumps into tasks it was not told to do (or even worse, does things it was explicitly told not to do), it is a good sign the context is too full, and you should do a controlled hand-off to a new instance.
I wipe my context relentlessly. I never have long-running conversations. In and out like SEAL Team Six.