Could be. By "meaning" I mean (heh) that transformers are able to distinguish tokens (and prompts) in a consequential ("causal") way, and that they do so at various levels of detail ("abstractions").
I think that's the usual understanding of how transformer architectures work at the level of the math.
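To make "consequential" a bit more concrete, here's a minimal sketch (a toy probe, nothing rigorous; GPT-2 via the Hugging Face `transformers` library is just a stand-in model, and the prompts are arbitrary): swap a single token between two prompts and check how far the hidden states drift apart at each layer of the stack.

```python
# Toy probe: one swapped token shifts the representation at every layer,
# not just at the embedding. Assumes `torch` and `transformers` installed.
import torch
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

def layer_states(prompt):
    ids = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids)
    # Tuple of (n_layers + 1) tensors: the embedding layer plus each block.
    # Keep the final token's vector from each.
    return [h[0, -1] for h in out.hidden_states]

a = layer_states("The bank raised its interest")
b = layer_states("The river raised its interest")

# Cosine similarity per layer between the two runs.
for i, (ha, hb) in enumerate(zip(a, b)):
    sim = torch.cosine_similarity(ha, hb, dim=0).item()
    print(f"layer {i:2d}: cosine similarity = {sim:.3f}")
```

In practice the similarity tends to fall as you go deeper, which is one loose way of seeing the "various levels of detail" point: early layers mostly register the lexical swap, later ones register what it does to the whole context.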