Hacker News

new | ask | show | jobs

erispoe 6 hours ago [ - ]

I have, mostly, long running autonomous tasks, so it doesn't matter how slow inference is. If I optimize for latency it means I'm turning into the limiting factor.