Hacker News

Yes, but unfortunately a lot of the discussion people participate in is not from a corporate point of view, but from a normal consumer level.

And there is a lot of drama in those discussions. GLM 5.2 is a great model for corporations to run, but people only want to hear about running a 35B/27B or maybe a 120B model. And in that market, subscription services are simply way better value for money (though take the privacy issues into account).

Everybody wants GPT 5.5/Opus 4.8 Max levels of quality, on a budget that simply is not realistic. And GLM fits in at that Opus 4.8 medium/low level.

But then people do not want to be told that running a 750B model in Q2 or Q1 is just going to destroy the model's accuracy, and that it is still going to cost them 5k+ for that reduced model.
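
To put rough numbers on that, here is a back-of-the-envelope sketch of weight memory at different quantization levels. It assumes memory is simply parameters times bits per weight, and ignores KV cache, activations, and the per-block scale metadata that real quant formats add, so actual files come out larger. The function name is made up for illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    Assumption: size = parameters * bits / 8. Real quantization formats
    carry extra metadata, so treat this as a lower bound.
    """
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    # 750B is the model size from the comment; the bit widths are nominal.
    for label, bits in [("FP16", 16), ("Q4", 4), ("Q2", 2), ("Q1", 1)]:
        print(f"750B @ {label}: ~{weight_memory_gb(750, bits):.1f} GB")
```

Under those assumptions a 750B model is still about 187.5 GB of weights at 2 bits and about 94 GB at 1 bit, before any context, which is why even the heavily reduced model does not fit on cheap hardware.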

The whole local LLM landscape, from a consumer point of view, is just filled with odd people. lol.

Corporations really benefit from those models, because spending $90k on a server is a deductible expense. And they are billed at token prices by all the major providers anyway, so it's an even faster ROI on that hardware.
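
A minimal sketch of that ROI argument, assuming the only things that matter are the hardware price and the API bill it replaces. The $90k figure is from above; the monthly token volume and the per-million-token price are placeholders I made up, and power, staff, and the tax deduction itself are ignored.

```python
def break_even_months(server_cost: float,
                      monthly_tokens_millions: float,
                      price_per_million: float) -> float:
    """Months until avoided API spend equals the server's purchase price.

    Assumption: the server fully replaces the API usage and has no
    running costs. Both are simplifications.
    """
    monthly_api_bill = monthly_tokens_millions * price_per_million
    return server_cost / monthly_api_bill


if __name__ == "__main__":
    # Hypothetical workload: 2,000M tokens/month at $5 per million tokens.
    months = break_even_months(90_000, 2_000, 5.0)
    print(f"Break-even after ~{months:.1f} months")
```

With those placeholder inputs the server pays for itself in 9 months; halve the volume and it takes twice as long, so the conclusion depends entirely on how many tokens the company actually burns.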

I am surprised that nobody has figured out how to make a business of selling leftover capacity from corporate LLM installations, because there are easily 12h+ a day just wasted (unless it's a large corp that has people in all timezones).