It was when I started using GitHub Copilot in "Agent Mode" that my LLM productivity gains went from about 5x to 30x. People who only use a chatbot get maybe 5x gains. People who use "Agent Mode" to write up a description of a new feature that would take a human several days, and then have an Agent build it in one click, are getting 30x or more.

The amount of pushback I got on this thread tells me most devs simply haven't started using actual Agents yet.

I’ve tried using agents. LLMs just can’t reliably accomplish the tasks that I have to do. They just get shit wrong and hallucinate a ton. If I don’t break the task down into tiny chunks then they go off the rails.

This can definitely happen, because even a great Agent's context window can get flooded. For example, I often start with a prompt like "Add a row of buttons at the top right named 'copy', 'cut', and 'paste'", and let the Agent do that before I implement each button.

The rule of thumb I've learned is to give an Agent the smallest possible task at a time, so there's zero ambiguity in the prompt and the context window stays small.
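
To make that concrete, here's a rough sketch of how the button example could be split into one small prompt per step. This is only an illustration: `Step` and `plan_button_row` are made-up names, the wording of the per-button prompts is my assumption, and nothing here calls Copilot or any real agent API. It just builds the list of prompts you would paste in one at a time.

```python
from dataclasses import dataclass


@dataclass
class Step:
    prompt: str  # the exact text handed to the agent for this one step


def plan_button_row(names: list[str]) -> list[Step]:
    """Split 'add a working button row' into a scaffold step plus one step per button.

    Hypothetical helper for illustration; the prompt wording is an assumption.
    """
    listing = ", ".join(f"'{n}'" for n in names)
    # Step 1: only create the buttons, with no behavior yet.
    steps = [Step(f"Add a row of buttons at the top right named {listing}.")]
    # Steps 2..n: one narrowly scoped prompt per button.
    for n in names:
        steps.append(Step(f"Implement the click handler for the '{n}' button only."))
    return steps


if __name__ == "__main__":
    for i, step in enumerate(plan_button_row(["copy", "cut", "paste"]), start=1):
        print(f"{i}. {step.prompt}")
```

Each prompt names exactly one thing to change, so you can review the result of one step before sending the next.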