They aren’t. There’s a 1.58-bit version of DeepSeek that’s around 200 GB instead of 700 GB.

That's not a real BitNet. It's just post-training quantisation, and its performance suffers compared with a model trained from scratch at 1.58 bits.
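
To make the distinction concrete, here's a rough sketch of what rounding already-trained weights to ternary values looks like. It assumes the absmean scaling described in the BitNet b1.58 paper and uses random Gaussian numbers as a stand-in for real weights. The function names are made up for illustration, and this is not how the DeepSeek quant was actually produced.

```python
import math
import random

def ternary_quantize(weights, eps=1e-8):
    """Round weights to {-1, 0, +1} using one absmean scale (applied after training)."""
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Map ternary codes back to approximate float weights."""
    return [c * scale for c in codes]

# Three possible values per weight carry log2(3) bits, hence "1.58 bits".
bits_per_weight = math.log2(3)

random.seed(0)
# Hypothetical stand-in for a trained weight tensor.
weights = [random.gauss(0.0, 0.02) for _ in range(10_000)]

codes, scale = ternary_quantize(weights)
restored = dequantize(codes, scale)

# Relative error introduced purely by rounding after the fact.
err = math.sqrt(sum((w - r) ** 2 for w, r in zip(weights, restored)))
norm = math.sqrt(sum(w ** 2 for w in weights))
rel_error = err / norm

print(f"bits per weight: {bits_per_weight:.2f}")
print(f"relative rounding error: {rel_error:.2f}")
```

With post-training rounding, the model never gets a chance to adapt to that error. A model trained from scratch at 1.58 bits applies the same rounding in every forward pass during training, so the weights it learns already account for it.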