Arithmetic coding looks like an extremely interesting approach, since at each step the model can supply a probability for every possible next token.
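To make the idea concrete, here is a minimal sketch of arithmetic coding driven by per-step model probabilities. It is illustrative only: `toy_model` is a hypothetical stand-in for a real language model, the `<eos>` end marker and the function names are my own assumptions, and exact `Fraction` arithmetic replaces the fixed-precision integer renormalization a practical coder would use.

```python
from fractions import Fraction


def toy_model(context):
    """Hypothetical stand-in for a language model: P(next token | context).

    Returns an ordered dict of token -> probability (summing to 1).
    Encoder and decoder must call the exact same model.
    """
    if context and context[-1] == "a":
        return {"a": Fraction(1, 4), "b": Fraction(1, 2), "<eos>": Fraction(1, 4)}
    return {"a": Fraction(1, 2), "b": Fraction(1, 4), "<eos>": Fraction(1, 4)}


def encode(tokens, model):
    """Narrow [low, low + width) once per token; return the final interval."""
    low, width = Fraction(0), Fraction(1)
    context = []
    for tok in tokens:
        cum = Fraction(0)  # cumulative probability of the tokens listed before tok
        for sym, p in model(context).items():
            if sym == tok:
                low += width * cum
                width *= p
                break
            cum += p
        else:
            raise ValueError(f"token {tok!r} has no probability under the model")
        context.append(tok)
    return low, width


def decode(point, model, eos="<eos>"):
    """Recover the tokens from any point inside the final interval."""
    low, width = Fraction(0), Fraction(1)
    context = []
    while not context or context[-1] != eos:
        cum = Fraction(0)
        for sym, p in model(context).items():
            start = low + width * cum
            if start <= point < start + width * p:
                low, width = start, width * p
                context.append(sym)
                break
            cum += p
        else:
            raise ValueError("point lies outside the unit interval [0, 1)")
    return context
```

With this toy model, encoding `["a", "b", "a", "<eos>"]` narrows the interval to start 7/32 with width 1/32, so about 5 bits are enough to name a point inside it. That matches the negative log2 of the product of the per-token probabilities, which is why a better predictive model translates directly into a shorter code.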