but unlike CUDA, there are no custom kernels for inference in the vLLM repo...

I think