> 4 bit quantized 120B model on a 96GB workstation card, the Blackwell Pro workstation
It would be interesting to know how it performs in terms of output quality and tokens/sec.