Hmmm, makes me wonder if you could train LLMs on gzipped text. It would save a lot of tokens that way.
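A quick back-of-the-envelope sketch of the idea, and the catch. This is just a hypothetical illustration using byte counts as a crude stand-in for token counts; gzip does shrink the input a lot, but the compressed stream is near-random bytes, and a tiny edit to the source text can scramble much of the output, which is exactly what next-token prediction hates:

```python
import gzip

# Crude comparison: bytes (a rough proxy for tokens) before and after gzip.
text = ("the quick brown fox jumps over the lazy dog. " * 50).encode("utf-8")
compressed = gzip.compress(text)

ratio = len(compressed) / len(text)
print(f"raw bytes: {len(text)}, gzipped: {len(compressed)}, ratio: {ratio:.2f}")

# The catch: change three characters in the input and see how many byte
# positions in the compressed stream differ. The deflate back-references
# and Huffman coding smear a local edit across the output.
edited = gzip.compress(text.replace(b"fox", b"cat"))
diff = sum(a != b for a, b in zip(compressed, edited))
print(f"byte positions differing after the edit: {diff} of {len(compressed)}")
```

So the savings are real, but the model would be predicting an almost incompressible, highly non-local byte stream instead of natural language.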