Trying to scale text inference to 1 million tokens/s on cheap hardware.