Which slice? The Large Text Compression Benchmark uses enwik8 (the first 10^8 bytes of enwik9) as a "smaller" input that is easily reproducible. The predictability of enwik9 can vary significantly depending on where in the file you are, as shown by Matt Mahoney: https://www.mattmahoney.net/dc/textdata.html
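
If you want to see the effect on your own copy, here is a rough sketch that compresses consecutive fixed-size slices and reports the compressed/original ratio for each one. It uses zlib only as a crude stand-in for predictability (it is not the method used on the linked page), and the function name `slice_ratios` and the 10^7-byte default slice size are my own assumptions.

```python
import zlib
from typing import BinaryIO, Iterator


def slice_ratios(stream: BinaryIO, slice_size: int = 10**7) -> Iterator[float]:
    """Yield compressed/original size ratios for consecutive slices of a binary stream.

    Each slice is compressed independently, so the ratio reflects only the
    redundancy inside that slice. Lower means more compressible.
    """
    while True:
        chunk = stream.read(slice_size)
        if not chunk:
            break
        # Level 9 is zlib's strongest setting; still far weaker than the
        # context-mixing compressors that top the benchmark.
        yield len(zlib.compress(chunk, 9)) / len(chunk)


def print_slice_ratios(path: str, slice_size: int = 10**7) -> None:
    """Print one ratio per slice of the file at `path` (hypothetical local copy)."""
    with open(path, "rb") as f:
        for i, ratio in enumerate(slice_ratios(f, slice_size)):
            print(f"slice {i} (offset {i * slice_size}): {ratio:.3f}")
```

Calling `print_slice_ratios("enwik9")` on a local copy would print one ratio per 10 MB slice. The absolute numbers depend on the compressor, so only the variation between slices is worth reading into.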