It requires the caller to allocate the token array manually, so it needs some kind of backing "stack vector" to write into.
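
To make the allocation point concrete, here is a rough sketch of what that calling convention tends to look like. Nothing here is the actual library's API: `token_t` and `tokenize` are made-up names, and splitting on spaces is just a stand-in for real tokenizing. The only point is that the caller owns the storage and the tokenizer reports when it runs out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token type: a half-open byte range [start, end) into the input. */
typedef struct {
    size_t start;
    size_t end;
} token_t;

/* Hypothetical tokenizer: splits `input` on spaces and writes the tokens into
 * the caller-provided array `toks` of capacity `cap`. It never allocates.
 * Returns the number of tokens written, or -1 if `toks` is too small. */
static int tokenize(const char *input, size_t len, token_t *toks, size_t cap)
{
    assert(input != NULL && toks != NULL);
    size_t count = 0;
    size_t i = 0;
    while (i < len) {
        while (i < len && input[i] == ' ') i++;   /* skip separators */
        if (i == len) break;
        if (count == cap) return -1;              /* backing array exhausted */
        toks[count].start = i;
        while (i < len && input[i] != ' ') i++;   /* consume one token */
        toks[count].end = i;
        count++;
    }
    return (int)count;
}
```

The caller would typically declare something like `token_t toks[64];` on the stack and pass it in, then retry with a bigger array if it gets -1 back.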

And what about escapes?

For escapes, you can rewrite the raw data buffer in place: a single escape always decodes to something shorter than the escape sequence itself, so the write position never overtakes the read position.
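
A minimal sketch of the in-place trick, assuming a small made-up escape set (`\n`, `\t`, `\\`, `\"`, with any other escaped byte kept literally). `unescape_in_place` is a hypothetical name, not something from the library being discussed. Each two-byte escape becomes one byte, so the write index can only fall behind the read index.

```c
#include <assert.h>
#include <stddef.h>

/* Decodes backslash escapes in buf[0..len) in place and returns the new length.
 * Assumed (hypothetical) escape set: \n and \t map to control characters;
 * any other escaped byte, including \\ and \", is kept literally. */
static size_t unescape_in_place(char *buf, size_t len)
{
    assert(buf != NULL);
    size_t r = 0;  /* read position */
    size_t w = 0;  /* write position, always <= r */
    while (r < len) {
        char c = buf[r++];
        if (c == '\\' && r < len) {
            char e = buf[r++];          /* consume the escaped byte too */
            switch (e) {
            case 'n': c = '\n'; break;
            case 't': c = '\t'; break;
            default:  c = e;    break;  /* covers \\ and \" */
            }
        }
        buf[w++] = c;                   /* safe: w never passes r */
    }
    return w;
}
```

Tokens that point into the buffer then just need their end offset shrunk to the returned length. No second buffer is required.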