On paper. There's a huge financial incentive to quantize the crap out of a good model to save cash once you've hooked people into subscriptions.

And there's an incentive to publish evidence of this to discourage it. Do you have any?

Models aren't just the big bags of floats you imagine them to be. Those bags are there, but there's a whole layer of runtimes, caches, timers, load balancers, classifiers/sanitizers, etc. around them, all of which have tunable parameters that affect the user-perceptible output.
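For a concrete (purely illustrative) picture, here's the kind of serving-layer knobs I mean. Every name and value below is made up for the sketch, not any particular provider's config:

    # Hypothetical serving-stack knobs sitting "around" the weights.
    # None of these touch the model itself, yet all shift what users see.
    serving_config = {
        "runtime": {
            "kv_cache_max_tokens": 131072,    # truncate long contexts earlier
            "speculative_decoding": True,     # draft-model acceptance changes outputs
        },
        "timers": {
            "per_request_deadline_ms": 30000, # hard cutoff mid-generation
        },
        "load_balancer": {
            "overflow_pool": "model-small",   # route spillover to a cheaper variant
        },
        "sanitizer": {
            "refusal_threshold": 0.7,         # stricter classifier = more refusals
        },
    }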

There really always is a man behind the curtain eh?

It's still engineering. Even magic alien tech from outer space would end up with an interface layer to manage it :).

ETA: reminds me of biology, too. In life, it turns out that the simpler some functional component looks, the more stupidly overcomplicated it is when you look at it under a microscope.

There's this[1]. Model providers have a strong incentive to switch (a part of) their inference fleet to quantized models during peak loads. From a systems perspective, it's just another lever. Better to have slightly nerfed models than complete downtime.

[1]: https://marginlab.ai/trackers/claude-code/
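To make the "lever" concrete, here's a toy sketch of what load-based routing could look like. The model names and thresholds are invented for illustration, not anything the tracker above measured:

    # Toy load-shedding router: serve a quantized variant when the fleet runs hot.
    # Variant names and utilization cutoffs are hypothetical.
    def pick_variant(gpu_utilization: float) -> str:
        if gpu_utilization < 0.80:
            return "model-fp16"   # full-precision weights
        elif gpu_utilization < 0.95:
            return "model-int8"   # mildly quantized, slightly nerfed
        else:
            return "model-int4"   # heavily quantized, but still up

    assert pick_variant(0.50) == "model-fp16"
    assert pick_variant(0.99) == "model-int4"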

So - as the charts say - no statistical difference?

Isn't this link an argument against the point you are making?

The chart doesn't cover the 4.6 release, which was in the late December/early January time frame. So it's hard to tell from the existing data.

Anybody with more than five years in the tech industry has seen this done in all domains time and again. What evidence do you have that AI is different? That's the extraordinary claim in this case...

Or just change the reasoning levels.