Latency is 3 cycles for float multiplication (and 1 for shift right):
3x faster
In throughput the difference is even smaller: 2 per cycle vs 3 per cycle.
50% faster
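
For anyone who wants to check the arithmetic, here is a rough sketch using only the figures quoted above. The constant and function names are made up for illustration, and I'm assuming the "2 per cycle" figure is the float multiply and the "3 per cycle" figure is the shift, since the text doesn't say which CPU or which order.

```python
# Figures quoted above; the CPU they apply to is not stated.
FMUL_LATENCY_CYCLES = 3   # float multiplication latency
SHIFT_LATENCY_CYCLES = 1  # shift right latency
FMUL_PER_CYCLE = 2        # assumption: the "2 per cycle" figure is the multiply
SHIFT_PER_CYCLE = 3       # assumption: the "3 per cycle" figure is the shift


def latency_speedup(slow_cycles, fast_cycles):
    """How many times faster the lower-latency operation is."""
    return slow_cycles / fast_cycles


def throughput_speedup(slow_per_cycle, fast_per_cycle):
    """How many times more operations per cycle the faster one sustains."""
    return fast_per_cycle / slow_per_cycle


latency_ratio = latency_speedup(FMUL_LATENCY_CYCLES, SHIFT_LATENCY_CYCLES)
throughput_ratio = throughput_speedup(FMUL_PER_CYCLE, SHIFT_PER_CYCLE)
throughput_percent = (throughput_ratio - 1) * 100

print(f"latency: {latency_ratio:g}x faster")
print(f"throughput: {throughput_percent:g}% faster")
```

So the "3x" only shows up when each operation has to wait for the previous result; with independent operations the gap shrinks to the throughput figure.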