Probably not. Everyone will still need a lot of reasoning tokens and tool calls. Running the tests every round is tedious, but it has to be done.