Partially true. You can predict several tokens ahead and then confirm them (speculative decoding), which typically gives a 2-3x speedup in practice.
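(Confirmation is faster than prediction: the model can check several proposed tokens in one parallel pass, whereas generating them takes one sequential pass per token.)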
Many model architectures are specifically designed to make this efficient.
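To make the predict-then-confirm idea concrete, here is a rough toy sketch of the greedy variant. Everything in it is made up for illustration: `target_model` and `draft_model` are hypothetical stand-ins for a large and a small model, and the verification loop runs sequentially here, whereas a real system would do it in one batched forward pass.

```python
from typing import Callable, List, Tuple

# A "model" here is just a function from a token context to the next token (greedy).
Model = Callable[[List[int]], int]


def target_model(ctx: List[int]) -> int:
    # Hypothetical "large" model: an arbitrary deterministic toy rule.
    return (sum(ctx) * 31 + len(ctx)) % 50


def draft_model(ctx: List[int]) -> int:
    # Hypothetical "small" model: agrees with the target except when the
    # context length is a multiple of 4, where it guesses wrong on purpose.
    guess = target_model(ctx)
    return guess if len(ctx) % 4 else (guess + 1) % 50


def greedy_decode(model: Model, prompt: List[int], n_new: int) -> List[int]:
    # Baseline: one model call per generated token.
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


def speculative_decode(
    target: Model, draft: Model, prompt: List[int], n_new: int, k: int = 4
) -> Tuple[List[int], int]:
    out = list(prompt)
    target_passes = 0  # counts verification rounds (one batched pass each in a real system)
    while len(out) - len(prompt) < n_new:
        # 1. The draft proposes k tokens, one after another.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))

        # 2. The target checks every proposed position.
        target_passes += 1
        accepted: List[int] = []
        for i in range(k):
            t = target(out + proposal[:i])
            accepted.append(t)  # always keep the target's own token
            if t != proposal[i]:
                break  # first mismatch: discard the rest of the proposal
        else:
            # All k proposals matched, so the target's next token comes for free.
            accepted.append(target(out + proposal))
        out.extend(accepted)
    return out[: len(prompt) + n_new], target_passes


tokens, passes = speculative_decode(target_model, draft_model, [1, 2, 3], 20)
print(tokens == greedy_decode(target_model, [1, 2, 3], 20), passes)
```

The output is identical to plain greedy decoding with the target alone, since every kept token is the target's own choice. The gain is that the number of target rounds is smaller than the number of generated tokens, and how much smaller depends on how often the draft agrees.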
---
Separately, your statement only holds when comparing the same hardware generation, interconnects, and quantization.