4-bit quantization is not applied to every layer; some layers are kept at 8-bit or 16-bit precision.
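
One consequence of mixing precisions is that the average number of bits per weight ends up above 4. The sketch below illustrates this under stated assumptions: the layer names, parameter counts, and the rule for which layers stay at higher precision are all hypothetical and not taken from any particular model or quantization tool.

```python
# Hypothetical layer names and parameter counts, for illustration only.
LAYERS = {
    "embed_tokens": 1000,
    "attn.q_proj": 4000,
    "mlp.up_proj": 4000,
    "lm_head": 1000,
}

# Assumed rule: layers whose names contain these substrings keep a higher bit width.
KEEP_HIGH_PRECISION = {"embed_tokens": 16, "lm_head": 8}


def bits_for(layer_name, default_bits=4):
    """Return the bit width assigned to a layer under the assumed rule."""
    for pattern, bits in KEEP_HIGH_PRECISION.items():
        if pattern in layer_name:
            return bits
    return default_bits


def average_bits_per_weight(layers):
    """Parameter-weighted average bit width across all layers."""
    total_params = sum(layers.values())
    total_bits = sum(bits_for(name) * count for name, count in layers.items())
    return total_bits / total_params


plan = {name: bits_for(name) for name in LAYERS}
avg = average_bits_per_weight(LAYERS)
print(plan)
print(f"average bits per weight: {avg:.2f}")
```

With these made-up numbers, the two higher-precision layers hold 20% of the parameters, which is enough to lift the average noticeably above 4 bits. Real tools choose which layers to exempt in their own ways, so treat the rule here as a placeholder.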