I’ve tried using agents. LLMs just can’t reliably accomplish the tasks that I have to do. They just get shit wrong and hallucinate a ton. If I don’t break the task down into tiny chunks then they go off the rails.

This can definitely happen, because even a great Agent's context window can get flooded. I'll often write a prompt like "Add a row of buttons at the top right named 'copy', 'cut', and 'paste'" and let the Agent do just that, then have it implement each button's behavior in a separate follow-up prompt.

The rule of thumb I've learned is to give an Agent the smallest possible task at a time, so that there's zero ambiguity in the prompt and the context window stays small.
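
Here's a rough sketch of what that workflow might look like if you scripted it, using the button example. It assumes your Agent can be called as a function that takes one prompt and returns a reply; `AgentFn`, `run_in_small_steps`, and `fake_agent` are names I made up for illustration, not part of any real Agent's API.

```python
from typing import Callable, List

# Hypothetical stand-in for whatever Agent you use:
# it takes one prompt string and returns the Agent's reply.
AgentFn = Callable[[str], str]


def run_in_small_steps(agent: AgentFn, steps: List[str]) -> List[str]:
    """Send each step as its own prompt, one at a time.

    Nothing from an earlier step is appended to a later prompt,
    so every call sees only the one small task it was given.
    """
    results = []
    for step in steps:
        results.append(agent(step))
    return results


# One small, unambiguous task per prompt.
steps = [
    "Add a row of buttons at the top right named 'copy', 'cut', and 'paste'.",
    "Implement the 'copy' button.",
    "Implement the 'cut' button.",
    "Implement the 'paste' button.",
]


def fake_agent(prompt: str) -> str:
    # Placeholder so the sketch runs without a real Agent behind it.
    return f"done: {prompt}"


results = run_in_small_steps(fake_agent, steps)
```

In practice you'd review the result of each step before sending the next one, rather than firing them all off in a loop.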