Hacker News

storus 10 hours ago

4-bit quantization is not applied to all layers; some are kept at 8 or 16 bits.
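
To put a rough number on that: if some layers stay at higher precision, the average bits per weight ends up above 4. A minimal sketch, assuming a made-up layer breakdown (the layer names, parameter counts, and bit widths below are hypothetical and not taken from any real model or quantization tool):

```python
# Hypothetical (name, parameter_count, bits_per_weight) entries.
# These numbers are invented purely for illustration.
layers = [
    ("embed", 50_000_000, 16),    # assumed to be kept at 16-bit
    ("attn", 400_000_000, 4),     # assumed to be quantized to 4-bit
    ("mlp", 500_000_000, 4),      # assumed to be quantized to 4-bit
    ("lm_head", 50_000_000, 8),   # assumed to be kept at 8-bit
]


def effective_bits(layers):
    """Return the parameter-weighted average bits per weight."""
    total_params = sum(count for _, count, _ in layers)
    total_bits = sum(count * bits for _, count, bits in layers)
    return total_bits / total_params


avg = effective_bits(layers)
print(f"{avg:.2f} bits per weight on average")
```

With this invented split the average works out to 4.8 bits per weight rather than 4, so a "4-bit" file is somewhat larger than a naive params × 0.5 bytes estimate would suggest. The real ratio depends on which layers a given quantization scheme leaves alone.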