I wonder if it is just sort of detecting a weird distribution in the state and that it wouldn’t be able to do it if the idea were conceptually closer to what they were asked about.

That "just sort of detecting" IS the introspection, and that is amazing, at least to me. I'm a big fan of the state of the art of the models, but I didn't anticipate this generalized ability to introspect. I just figured the introspection talk was simulated, but not actual introspection, but it appears it is much more complicated. I'm impressed.