> 4 bit quantized 120B model on a 96GB workstation card, the Blackwell Pro workstation

Would be interesting to know how it performs in terms of output quality and tokens/sec.