GPT5.4 at any effort level is scary when you combine it with tricks like symbolic recursion. I actually had to reduce the effort level to get the model to stop trying to one-shot everything. I struggled to come up with BS test cases it couldn't dunk on in some clever way. Turning down the reasoning effort made it explore the space better.
Can you explain what you mean by symbolic recursion tricks in this context?
The model can call a copy of itself as a tool (i.e., we maintain actual stack frames in the hosting layer). Explicit tools are made available: Call(prompt) & Return(result).
The user's conversation happens at stack depth 0. Actual tool use is only permitted at stack depths > 0. When the model calls the Return tool at stack depth 0, we end that logical turn of the conversation and present the tool's argument to the user. The user can then continue the conversation if desired, with all prior top-level conversation in scope.
It's effectively the same experience as ChatGPT, but each time the user sends a message, an entire depth-first search process kicks off that can take several minutes to complete.
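For anyone who wants to see the shape of it, here is a rough Python sketch of the hosting loop described above. It is not the actual implementation, just a minimal illustration under assumptions: `Frame`, `run_turn`, `run_tool`, and `scripted_model` are hypothetical names, messages are plain "kind: text" strings, and the "model" is a hard-coded stub standing in for a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical action format: ("call", prompt), ("return", result), ("tool", input).
Action = Tuple[str, str]
Model = Callable[[int, List[str]], Action]


@dataclass
class Frame:
    """One stack frame kept by the hosting layer (assumed layout)."""
    depth: int
    messages: List[str] = field(default_factory=list)


def run_tool(arg: str) -> str:
    """Stand-in for a real tool; here it just upper-cases its input."""
    return arg.upper()


def run_turn(model: Model, history: List[str], user_message: str) -> str:
    """Run one logical turn: loop until Return is called at depth 0."""
    history.append(f"user: {user_message}")
    # Depth 0 shares the top-level history, so it stays in scope across turns.
    stack = [Frame(0, history)]
    while True:
        frame = stack[-1]
        kind, arg = model(frame.depth, frame.messages)
        if kind == "call":
            # Call(prompt): push a fresh frame for a copy of the model.
            frame.messages.append(f"call: {arg}")
            stack.append(Frame(frame.depth + 1, [f"prompt: {arg}"]))
        elif kind == "return":
            # Return(result): pop the frame and hand the result upward.
            stack.pop()
            if not stack:
                # Returning from depth 0 ends the turn; show it to the user.
                history.append(f"assistant: {arg}")
                return arg
            stack[-1].messages.append(f"result: {arg}")
        elif kind == "tool":
            if frame.depth == 0:
                # Real tools are refused at the conversation level.
                frame.messages.append(
                    "error: tools are only permitted at stack depth > 0")
            else:
                frame.messages.append(f"tool_output: {run_tool(arg)}")
        else:
            raise ValueError(f"unknown action: {kind}")


def scripted_model(depth: int, messages: List[str]) -> Action:
    """Deterministic stub in place of an LLM: delegate once, then return."""
    kind, text = messages[-1].split(": ", 1)
    if depth == 0:
        return ("call", text) if kind == "user" else ("return", text)
    return ("tool", text) if kind == "prompt" else ("return", text)


if __name__ == "__main__":
    history: List[str] = []
    print(run_turn(scripted_model, history, "hello"))
    print(history)
```

With a real model, each frame can issue many Call actions before it returns, which is where the depth-first search behavior comes from. A production version would presumably also want a depth or budget limit, which this sketch leaves out.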