Hacker News

Is this newer/better than the speculative decoding from 2022? https://arxiv.org/abs/2211.17192

That paper is cited in the Introduction and Background sections. This paper improves on it by removing some bottlenecks.

tiahura 3 hours ago

Seems like they focus on improving the drafter and the verification policy, so that speculation keeps producing net speedups at DeepSeek scale instead of wasted verification work.
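
To put rough numbers on "net speedup vs. wasted verification work", here is a small sketch using the expected-speedup analysis from the 2022 paper linked above (arXiv:2211.17192), not anything specific to the new paper. The function names are mine; `alpha` (average draft-token acceptance rate), `gamma` (draft tokens proposed per step), and `c` (cost of one drafter pass relative to one target pass) are assumed inputs, and the example values are made up for illustration.

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target-model verification pass.

    Assumes each draft token is accepted independently with probability
    alpha, as in the simplified analysis of arXiv:2211.17192.
    """
    if alpha >= 1.0:
        # Every draft token is accepted, plus one token from the target.
        return float(gamma + 1)
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


def expected_speedup(alpha: float, gamma: int, c: float) -> float:
    """Expected wall-clock speedup over plain autoregressive decoding.

    One step costs gamma drafter passes (each c target-pass units)
    plus one target pass, so the cost per step is gamma * c + 1.
    """
    return expected_tokens_per_step(alpha, gamma) / (gamma * c + 1.0)


if __name__ == "__main__":
    # A good, cheap drafter: speculation pays off (ratio above 1).
    print(expected_speedup(alpha=0.8, gamma=4, c=0.05))
    # A weak, expensive drafter: verification work is mostly wasted (ratio below 1).
    print(expected_speedup(alpha=0.2, gamma=4, c=0.3))
```

Under this model, a better drafter raises `alpha` and a smarter verification policy effectively picks `gamma` per step, which is why both levers matter for keeping the ratio above 1.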