I think it also gets used in the /fast modes the providers sell at higher cost.

They probably use it on all models. Fast is probably just a resource pool with less congestion, so each user gets faster throughput, but the hardware is used less efficiently.
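
To put rough numbers on that trade-off, here's a toy sketch. It assumes a decode step costs a fixed overhead plus a small cost per sequence in the batch; both constants are made up for illustration, not measured from any provider. A less congested pool is modeled as a smaller batch.

```python
# Toy model of batch size vs. speed. All constants are hypothetical
# illustration values, not measurements from any real serving stack.
FIXED_MS = 20.0    # assumed fixed cost per decode step (e.g. reading weights)
PER_SEQ_MS = 0.5   # assumed extra cost per sequence in the batch


def step_time_ms(batch_size: int) -> float:
    """Time for one decode step that serves batch_size sequences."""
    return FIXED_MS + PER_SEQ_MS * batch_size


def per_user_tokens_per_s(batch_size: int) -> float:
    """Each user receives one token per decode step."""
    return 1000.0 / step_time_ms(batch_size)


def total_tokens_per_s(batch_size: int) -> float:
    """The whole batch produces batch_size tokens per decode step."""
    return batch_size * per_user_tokens_per_s(batch_size)


for batch in (4, 32, 128):
    print(
        f"batch={batch:4d}  "
        f"per-user tok/s={per_user_tokens_per_s(batch):7.1f}  "
        f"total tok/s={total_tokens_per_s(batch):8.1f}"
    )
```

Under these assumptions the small batch is several times faster per user but gets far fewer total tokens out of the same hardware, which would explain the higher price.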

If it speeds up prefill too, I guess so.
