Hardware isn’t improving exponentially anymore, especially not on the flops/watt metric.

That’s part of what motivated the transition to bfloat16 and even smaller minifloat formats, but you can only quantize so far before you’re just GEMMing noise.
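
To put a rough number on "GEMMing noise", here is a minimal sketch that rounds the inputs of a small matrix multiply to a given number of significand bits and measures how far the result drifts from the full-precision answer. The helper names (`round_to_bits`, `relative_error`) are made up for illustration. The model is deliberately crude: it only rounds the significand, ignores exponent range and subnormals, and accumulates in double precision, so it understates what real low-precision hardware does.

```python
import math
import random

def round_to_bits(x, bits):
    """Round x to `bits` significand bits (hypothetical helper; exponent range is not limited)."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)              # x = m * 2**e with 0.5 <= |m| < 1
    scale = 1 << bits
    return math.ldexp(round(m * scale) / scale, e)

def matmul(a, b):
    """Plain triple-loop matrix multiply, accumulating in Python floats (float64)."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[t] * b[t][j] for t in range(inner)) for j in range(cols)]
            for row in a]

def relative_error(bits, n=32, seed=0):
    """Relative Frobenius error of A @ B when only the inputs are rounded to `bits` bits."""
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    exact = matmul(a, b)
    a_q = [[round_to_bits(v, bits) for v in row] for row in a]
    b_q = [[round_to_bits(v, bits) for v in row] for row in b]
    approx = matmul(a_q, b_q)
    num = sum((x - y) ** 2 for r_e, r_a in zip(exact, approx) for x, y in zip(r_e, r_a))
    den = sum(x * x for row in exact for x in row)
    return math.sqrt(num / den)

if __name__ == "__main__":
    # Significand widths including the implicit leading bit:
    # float32 = 24, float16 = 11, bfloat16 = 8, FP8 E4M3 = 4, FP8 E5M2 = 3.
    for bits in (24, 11, 8, 4, 3, 2):
        print(f"{bits:2d} significand bits -> relative error {relative_error(bits):.3e}")
```

In this toy model the error roughly doubles for every significand bit you drop, which is the sense in which quantization eventually leaves you multiplying mostly rounding noise.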