Hacker News

The 1.58-bit quantization uses three values: -1, 0, and 1. The bit count comes from log_2(3) ≈ 1.585.

At that level you can pack 4 weights into a byte using 2 bits per weight. However, one of the four possible 2-bit patterns in each slot goes unused.
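A minimal sketch of that 2-bit layout, in case it helps. The function names and the mapping (-1 -> 0b00, 0 -> 0b01, 1 -> 0b10, with 0b11 left unused) are my own assumptions for illustration; real implementations may order or map the bits differently.

```python
def pack_2bit(weights):
    """Pack ternary weights (-1, 0, 1) four to a byte, 2 bits per weight."""
    if len(weights) % 4 != 0:
        raise ValueError("number of weights must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(weights), 4):
        byte = 0
        for j, w in enumerate(weights[i:i + 4]):
            if w not in (-1, 0, 1):
                raise ValueError(f"not a ternary weight: {w}")
            # Assumed mapping: -1 -> 0b00, 0 -> 0b01, 1 -> 0b10 (0b11 is never used).
            byte |= (w + 1) << (2 * j)
        out.append(byte)
    return bytes(out)


def unpack_2bit(data):
    """Inverse of pack_2bit: recover four ternary weights from each byte."""
    return [((byte >> (2 * j)) & 0b11) - 1 for byte in data for j in range(4)]
```

This costs exactly 2 bits per weight, so about 0.415 bits per weight are wasted compared to the log_2(3) figure.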

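More complex packing schemes group weights together (e.g. a group of 3) and assign a bit pattern to each combination of values via a lookup table. This gets the compression closer to the 1.58-bit value. For example, 3 weights have 27 combinations, which fit in 5 bits (about 1.67 bits per weight), and 5 weights have 243 combinations, which fit in one byte (1.6 bits per weight).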
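Here is a rough sketch of the grouped lookup-table idea. The helper names and the ordering of combinations in the table are assumptions of mine, not taken from any particular implementation, and the final step of packing the codes into a bitstream is left out.

```python
from itertools import product


def build_tables(group_size):
    """Build encode/decode lookup tables for groups of ternary weights."""
    # Every combination of (-1, 0, 1) over the group gets its own code.
    combos = list(product((-1, 0, 1), repeat=group_size))
    encode = {combo: code for code, combo in enumerate(combos)}
    return encode, combos  # combos doubles as the decode table


def bits_per_weight(group_size):
    """Whole bits needed per group, divided by the weights in the group."""
    bits = (3 ** group_size - 1).bit_length()
    return bits / group_size


def encode_groups(weights, group_size):
    """Map each group of weights to its integer code."""
    if len(weights) % group_size != 0:
        raise ValueError("number of weights must be a multiple of group_size")
    encode, _ = build_tables(group_size)
    return [encode[tuple(weights[i:i + group_size])]
            for i in range(0, len(weights), group_size)]


def decode_groups(codes, group_size):
    """Inverse of encode_groups."""
    _, decode = build_tables(group_size)
    return [w for code in codes for w in decode[code]]
```

With group_size=3 each code fits in 5 bits, and with group_size=5 each code fits in a single byte, so the latter can be stored directly with bytes(codes).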