The paper actually mentions testing their DSpark speculative decoding strategy with the Qwen 3 4B, 8B, and 14B models. So while I doubt they’ll release builds themselves, they’ve open-sourced their training pipeline for this (DeepSpec), so we’ll likely see folks adopting it for other models.