This is exactly what I find too. I make plans in both models, then have each model compare its own plan against the other's. Claude usually agrees (65-80% of the time) that the Codex plan included things it didn't think of, or was better in some other way.

Note that this is better than it was with Opus, where the Codex plans were obviously better more like 90% of the time.