> Why not scale to fill the available bins, though? i.e. trunc(result * 255.999)?

That’s half of the mid-riser staircase quantizer discussed in the article. (The other half is coming up with the reverse.)

(I would implement it as min(floor(x * 256), 255).)
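A rough Python sketch of the two forward conversions being compared, assuming the input is a float in [0, 1]. The function names are mine, not from the article:

```python
import math

def quantize_scaled(x: float) -> int:
    """The quoted proposal: scale by just under 256, then truncate."""
    return math.trunc(x * 255.999)

def quantize_mid_riser(x: float) -> int:
    """Scale by 256, floor, and clamp so that x == 1.0 lands in bin 255."""
    return min(math.floor(x * 256), 255)

# Both map the endpoints of [0, 1] to 0 and 255.
print(quantize_scaled(0.0), quantize_mid_riser(0.0))
print(quantize_scaled(1.0), quantize_mid_riser(1.0))

# They disagree only in a thin sliver just above each bin boundary k/256.
x = 1 / 256
print(quantize_scaled(x), quantize_mid_riser(x))
```

So the 255.999 version is close to the clamped version, just with bin edges nudged very slightly upward.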

Kind of. According to the article, the mid-riser vs. mid-tread distinction comes down to where the 0.5 is applied in the transform. I’m proposing that no 0.5 is applied at all: instead of counteracting the compression of the truncation operation by adding a fixed offset, my version multiplies by a scale factor.

Possibly my proposal doesn’t hold up to repeated transforms and operations. It might skew toward 255 in real operations.

The 0.5 comes from debiasing the round-trip.

If your conversion from high precision to 8-bit is just multiplication by 256 followed by truncation, then you’ve got the mid-riser quantizer. The +0.5 comes from interpreting a value of 0 as the bucket from 0 to 1, just as the value 255 is the bucket from 255 to 256. It’s introduced in the conversion back from 8-bit to high precision.
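To make the debiasing concrete, here is a small Python sketch of the round trip, assuming inputs in [0, 1] and evenly spaced test samples. The function names and the `offset` parameter are my own illustration, not the article's code:

```python
import math

def quantize(x: float) -> int:
    """High precision -> 8-bit: multiply by 256, floor, clamp to 255."""
    return min(math.floor(x * 256), 255)

def dequantize(q: int, offset: float = 0.5) -> float:
    """8-bit -> high precision: return a point inside bucket [q, q + 1)."""
    return (q + offset) / 256

def mean_round_trip_error(offset: float, n: int = 4096) -> float:
    """Average of dequantize(quantize(x)) - x over n evenly spaced samples."""
    xs = [(i + 0.5) / n for i in range(n)]
    return sum(dequantize(quantize(x), offset) - x for x in xs) / n

# With the +0.5, every 8-bit code survives a round trip unchanged.
assert all(quantize(dequantize(q)) == q for q in range(256))

# Reconstructing at the bucket center leaves no average bias; reconstructing
# at the bucket's lower edge (offset 0) pulls values down by half a bucket.
print(mean_round_trip_error(0.5))
print(mean_round_trip_error(0.0))
```

With offset 0 the average error comes out to about -1/512, which is half a bucket width, and that is the bias the +0.5 removes.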