Hacker News

ipsod 3 hours ago

How fast is it?

wolttam 3 hours ago

About 2000 t/s prompt processing and 40-50 t/s generation. We should see 60-70 t/s generation once DSpark support solidifies in vLLM, likely in a few days.

Recent discussion on DSpark: https://news.ycombinator.com/item?id=48696585
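
For a rough sense of what those figures mean for a single request, here is a back-of-envelope sketch. The prompt and reply sizes are hypothetical, the generation rates are just the midpoints of the ranges quoted above, and the model ignores batching, queueing, and any other overhead, so treat the result as an estimate only.

```python
def response_seconds(prompt_tokens, output_tokens, prefill_tps, gen_tps):
    """Rough latency: prefill time plus generation time, nothing else."""
    return prompt_tokens / prefill_tps + output_tokens / gen_tps

# Hypothetical workload (not from the thread): 8000-token prompt, 500-token reply.
PROMPT, OUTPUT = 8000, 500

# 45 t/s is the midpoint of the quoted 40-50 t/s range.
now = response_seconds(PROMPT, OUTPUT, prefill_tps=2000, gen_tps=45)
# 65 t/s is the midpoint of the anticipated 60-70 t/s range.
later = response_seconds(PROMPT, OUTPUT, prefill_tps=2000, gen_tps=65)

print(f"current:  {now:.1f} s")
print(f"expected: {later:.1f} s")
```

Under these assumptions prefill takes the same time in both cases, so any speedup comes entirely from the generation phase.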
