Hacker News

pinstripes 4 days ago

Trying to scale text inference to 1 million tok/s on cheap hardware.