Quantization is a trade-off, though. The quality may still be good enough for many tasks, but it is not as good as with the full 16-bit weights the model was released with.
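To make the trade-off concrete, here is a toy sketch of where the loss comes from. It assumes plain symmetric round-to-nearest quantization with a single scale for the whole tensor, and random Gaussian numbers standing in for real weights. The helper names `quantize_roundtrip` and `mean_abs_error` are made up for illustration; real quantization schemes are more sophisticated, so treat this as intuition only, not as a measure of actual model quality.

```python
import random


def quantize_roundtrip(weights, bits):
    """Quantize to signed integers of the given width, then map back to floats.

    Assumption: one symmetric scale for the whole list (hypothetical toy scheme).
    """
    qmax = 2 ** (bits - 1) - 1  # 127 for 8-bit, 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax
    if scale == 0:
        return list(weights)  # all-zero input: nothing to lose
    # Round to the nearest integer level, clamp to range, then rescale.
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def mean_abs_error(original, restored):
    """Average absolute difference between original and round-tripped values."""
    return sum(abs(a - b) for a, b in zip(original, restored)) / len(original)


random.seed(0)
# Stand-in "weights": small Gaussian values, not taken from any real model.
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]

for bits in (8, 4):
    err = mean_abs_error(weights, quantize_roundtrip(weights, bits))
    print(f"{bits}-bit mean absolute round-trip error: {err:.6f}")
```

Fewer bits means fewer representable levels, so each weight lands further from its original value on average. Whether that shows up as worse output depends on the model and the task.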