With speculative decoding, however, you can use an additional model (typically a smaller, faster draft model) to speed up generation.
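To make the idea concrete, here is a rough sketch of the greedy variant, not how any particular library implements it. The names `speculative_decode`, `toy_target`, and `toy_draft` are made up for illustration, and the "models" are plain functions that map a token prefix to the next token. The draft model guesses a few tokens ahead, the target model checks them, and the first wrong guess is replaced by the target's own token, so the output matches what the target model would have produced alone.

```python
from typing import Callable, List

# Hypothetical interface: a "model" maps a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       max_new_tokens: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding sketch; output equals target-only decoding."""
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    while len(tokens) < goal:
        # 1. The cheap draft model proposes up to k tokens ahead.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))

        # 2. The target model verifies the proposal. A real implementation
        #    would score all positions in one batched forward pass; the loop
        #    here is only for readability.
        accepted: List[int] = []
        for tok in proposal:
            expected = target(tokens + accepted)
            if tok == expected:
                accepted.append(tok)       # draft guessed right, keep it
            else:
                accepted.append(expected)  # first miss: use target's token
                break                      # discard the rest of the proposal
        tokens.extend(accepted)
    return tokens


# Toy stand-ins (assumptions for the demo): the target counts upward mod 10,
# and the draft agrees except right after a 4, where it guesses wrong.
def toy_target(prefix: List[int]) -> int:
    return (prefix[-1] + 1) % 10


def toy_draft(prefix: List[int]) -> int:
    return 0 if prefix[-1] == 4 else (prefix[-1] + 1) % 10


result = speculative_decode(toy_target, toy_draft, prompt=[0], max_new_tokens=8)
print(result)
```

The speedup in practice comes from the verification step: checking several drafted tokens in one pass of the large model is much cheaper than generating them one at a time, provided the draft model agrees with it often enough.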