Yes, speculative decoding will make both us and VLLM faster, but we believe it would be a relatively even bump on both sides, so we didn't include it in this comparison. Worth another test!
Yes, speculative decoding will make both us and VLLM faster, but we believe it would be a relatively even bump on both sides, so we didn't include it in this comparison. Worth another test!