If you want to make it more human-explainable, ditch the tokenizer entirely and just feed the model raw characters. Then there is nothing left to explain.
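To make "raw characters" concrete, here is a minimal sketch of what the input pipeline could look like without a tokenizer. The function names (`encode_chars`, `encode_bytes`, and their decoders) are made up for illustration, and whether you would use Unicode code points or UTF-8 bytes as IDs is an assumption on my part, not something the comment above specifies.

```python
from typing import List


def encode_chars(text: str) -> List[int]:
    """Map each character to its Unicode code point (no learned vocabulary)."""
    return [ord(ch) for ch in text]


def decode_chars(ids: List[int]) -> str:
    """Invert encode_chars: turn code points back into a string."""
    return "".join(chr(i) for i in ids)


def encode_bytes(text: str) -> List[int]:
    """Alternative: use UTF-8 bytes, so every ID falls in the range 0..255."""
    return list(text.encode("utf-8"))


def decode_bytes(ids: List[int]) -> str:
    """Invert encode_bytes: reassemble the bytes and decode as UTF-8."""
    return bytes(ids).decode("utf-8")


if __name__ == "__main__":
    print(encode_chars("hi!"))   # [104, 105, 33]
    print(encode_bytes("é"))     # [195, 169]
```

Either way the mapping is fixed and reversible, so there is no learned vocabulary or merge table to inspect.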