You're actually applying a grammar to the token. If you're outputting, for example, JSON, you know what characters are valid next (because of the grammar), so you just filter out the tokens that don't fit the grammar.