There's this[1]. Model providers have a strong incentive to switch (a part of) their inference fleet to quantized models during peak loads. From a systems perspective, it's just another lever. Better to have slightly nerfed models than complete downtime.
So - as the charts say - no statistical difference?
Isn't this link an argument against the point you are making?
The chart doesn't cover the 4.6 release, which came out around late December or early January. So it's hard to tell from the existing data.