Yeah. The docs tell you that you should build it yourself, but…

but unlike CUDA, there are no custom kernels for inference in the vLLM repo...

I think