How fast is it?

2000 t/s prompt processing and 40-50 t/s generation. We should see 60-70 t/s generation with DSpark support solidifying in vLLM in a few days

Recent discussion on DSpark: https://news.ycombinator.com/item?id=48696585

[deleted]