That means their paper is actually worse than SOTA, which trains in fp4 natively rather than relying on full precision [0] for QAT.

[0] "full precision" in ML usually means 16-bit floats like bfloat16
I wouldn't say "worse". It's focusing on inference cost and leaving training at a default for now.