Hardware isn’t improving exponentially anymore, especially not on the flops/watt metric.
That’s part of what motivated the transition to bfloat16 and even smaller minifloat formats, but you can only quantize so far before you’re just GEMMing noise.
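To put a rough number on "GEMMing noise", here is a minimal NumPy sketch that rounds the inputs of a matrix multiply to fewer and fewer mantissa bits and compares the result with the full-precision product. The helpers `round_mantissa` and `gemm_error` are hypothetical names made up for this illustration, and the setup is an assumption, not a model of any real accelerator: only the mantissa is rounded, the exponent range is left unlimited, the inputs are random Gaussian matrices, and accumulation stays in float64.

```python
import numpy as np

def round_mantissa(x, bits):
    """Round x to `bits` explicit mantissa bits (exponent range left unlimited)."""
    m, e = np.frexp(x)            # x = m * 2**e with 0.5 <= |m| < 1
    scale = 2.0 ** (bits + 1)     # +1 accounts for the implicit leading bit
    return np.ldexp(np.round(m * scale) / scale, e)

def gemm_error(bits, n=256, seed=0):
    """Relative Frobenius error of a GEMM whose inputs were rounded to `bits`."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n))
    b = rng.standard_normal((n, n))
    exact = a @ b                 # float64 reference product
    approx = round_mantissa(a, bits) @ round_mantissa(b, bits)
    return np.linalg.norm(approx - exact) / np.linalg.norm(exact)

# 23 ~ float32, 10 ~ float16, 7 ~ bfloat16; 3 and below ~ 8-bit and smaller minifloats
for bits in (23, 10, 7, 3, 2, 1):
    print(f"{bits:2d} mantissa bits: relative error {gemm_error(bits):.2e}")
```

Under these assumptions the relative error grows steadily as mantissa bits are removed. Real training and inference pipelines use scaling tricks and wider accumulators, so treat this as an illustration of the trend rather than a prediction for any particular format.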