How much does /goal actually help? In auto mode, I've tried using and not using /goal and I haven't felt a difference.

https://code.claude.com/docs/en/goal#how-evaluation-works

> /goal is a wrapper around a session-scoped prompt-based Stop hook. Each time Claude finishes a turn, the condition and the conversation so far are sent to your configured small fast model, which defaults to Haiku. The model returns a yes-or-no decision and a short reason. A “no” tells Claude to keep working and includes the reason as guidance for the next turn. A “yes” clears the goal and records an achieved entry in the transcript.

> The evaluator runs on whichever provider your session is configured for. It does not call tools, so it can only judge what Claude has already surfaced in the conversation.

Apparently, it uses Haiku (by default) to evaluate every turn to determine if the goal has been achieved. However, it only relies on the transcript itself (including the reasoning of the main model). It can't independently verify if the goal has been achieved. So, if the main model thinks the goal is or isn't done, how often does Haiku disagree (in a productive way)? That's not clear to me.

I've mainly used the feature in codex, where I've been able to get it to work for 5 straight days (with breaks when rate limits are hit -- which was surprisingly only thrice) on a massive port.

I don't know how well it works in claude code, but I wouldn't be worried about Haiku getting it wrong and don't see a problem with it relying on the transcript. I always set these things up to maintain a checklist of subtasks to do in a file and check them off, and to always implement with red/green testing methodology, where it writes and commits failing tests, then writes the feature/fixes the bug and commits with passing tests and with an updated checklist file.

So the model should always know from the transcript whether the current task is done by whether it shows the tests passing, and it should always know if there's more tasks left from the checklist file being updated before the commit.