They ask a yes/no question and inject data into the state. It goes yes (20%). The prompt does not reveal the concept as of yet, of course. The injected activations, in addition to the prompt, steer the rest of the response. SOMETIMES it SOUNDED LIKE introspection. Other times it sounded like physical sensory experience, which is only more clearly errant since the thing has no senses.
I think this technique is going to be valuable for controlling the output distribution, but I don't find their "introspection" framing helpful to understanding.