I can't emphasize this enough, it doesn't matter how good a model is or what CLI I'm using, use git and chroot (at the least, container is easier though).
Always make the agent write a plan first and save it to something like plan.md, and tell it to update the list of finished tasks in status.md as it finishes each task from plan.md and to let you review the change before proceeding to next task.