Hacker News

TensorToad 2 hours ago

Super-low-latency inference might be helpful in applications like quant trading. However, in an era where a frontier model becomes outdated after 6 months, I wonder how useful it can be.

TensorToad 2 hours ago

Also, quant trading probably cares more about embedding the content than about generating output tokens.