Not everyone lives in a place where electricity is $0.20 a kWh. For instance BC Hydro residential rates are $0.11 (CAD) for the first tier and $0.14 for the second tier of consumption in a month. At current exchange rate $0.14 CAD is $0.099 USD a kWh. Hydro Quebec is even cheaper.

At a theoretical 6 tok/s and 86,400 seconds in a day, approx 500,000 tokens of GLM 5.2 output for 2 bucks a day seems like a pretty good bargain to me. That's not counting the one-time cost of the hardware to run it, of course. But I see people dropping $4000-5000 on all kinds of much less useful stuff.
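A rough sketch of that arithmetic in Python, for anyone who wants to plug in their own numbers. The 600 W draw is an assumption borrowed from the figure used further down the thread, and the $0.14/kWh price is the BC Hydro second-tier rate in CAD; the function names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400

def tokens_per_day(tokens_per_second: float) -> float:
    """Tokens generated in 24 hours of uninterrupted output."""
    return tokens_per_second * SECONDS_PER_DAY

def daily_electricity_cost(watts: float, price_per_kwh: float, hours: float = 24.0) -> float:
    """Cost of running a constant load for `hours` at a flat per-kWh price."""
    kwh = watts * hours / 1000.0
    return kwh * price_per_kwh

if __name__ == "__main__":
    # Assumed: 6 tok/s sustained, 600 W at the wall, $0.14 CAD per kWh.
    print(tokens_per_day(6))                            # 518400
    print(round(daily_electricity_cost(600, 0.14), 2))  # 2.02
```

So the "2 bucks a day" works out if the box really pulls about 600 W around the clock and the price is in CAD.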

Additionally, in a place where people use electric baseboard heating, electric in-floor radiant heating, or really any other heating-element-based system in winter that's less efficient than a heat pump, the additional electrical load from computing is basically "free", since you would be spending that same money otherwise to heat your house. If a computer with 512GB of RAM is dumping its waste heat into your room, it accomplishes a portion of the same thing as a baseboard.
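One way to put a number on the "free heat" argument, as a sketch only. It assumes every watt the computer draws ends up as useful heat in a room you would otherwise be heating, and it models the existing heater by its coefficient of performance (COP): 1.0 for resistive baseboard, something like 3.0 for a heat pump. The function name and the example COP values are illustrative assumptions, not measurements.

```python
def net_compute_cost(kwh: float, price_per_kwh: float, heating_cop: float = 1.0) -> float:
    """Electricity cost of a compute load after subtracting the heating it displaces.

    Assumes all `kwh` becomes useful room heat during heating season.
    heating_cop is the COP of the heater being displaced (1.0 = resistive).
    """
    if heating_cop < 1.0:
        raise ValueError("heating_cop below 1.0 is not meaningful for this model")
    gross = kwh * price_per_kwh
    # What the existing heater would have cost to deliver the same heat.
    displaced = (kwh / heating_cop) * price_per_kwh
    return gross - displaced

if __name__ == "__main__":
    day_kwh = 600 * 24 / 1000  # assumed 600 W load for a full day
    print(round(net_compute_cost(day_kwh, 0.14, heating_cop=1.0), 2))  # 0.0
    print(round(net_compute_cost(day_kwh, 0.14, heating_cop=3.0), 2))
```

With baseboard heating the net cost is zero under these assumptions; against a heat pump you only recover a fraction, and in summer none of it.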

Not to mention there is a whole other less measurable benefit of having a locally hosted model that can't be turned off or arbitrarily restricted by a service provider, and where all of your queries and context cache aren't subject to surveillance by any third party.

Unless the token estimates I get from using Claude are wayyy out, I burn through 5m+ tokens/day, and I'm not doing a lot of time. 500k tokens in a 24h period for $5k of hardware seems quite poor?

Be sure you compare input tokens to pre-fill rates and output tokens to generation rates.

Where I live prices are often higher than 20c/kWh, but let's take your example and halve it (10c/kWh) so it's ~$1.40/day or ~$500/year.

On OpenRouter, the cheapest GLM 5.2 provider costs $3/MTok (at 44 tps). Assuming most use is output tokens, that's still the equivalent of 450k tokens/day, so we're in the same ballpark, but without the capex for two 3090s and the machine.
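The comparison above can be sketched like this. The 600 W draw is an assumption (it's the figure used elsewhere in the thread); the $0.10/kWh and $3/MTok numbers are the ones quoted above, and the function name is hypothetical.

```python
def api_tokens_for_budget(budget_usd: float, price_per_mtok_usd: float) -> float:
    """How many tokens a given dollar amount buys at a flat per-million-token price."""
    return budget_usd / price_per_mtok_usd * 1_000_000

if __name__ == "__main__":
    # Assumed: 600 W around the clock at $0.10/kWh.
    daily_electricity_usd = 600 * 24 / 1000 * 0.10  # about $1.44/day
    print(round(daily_electricity_usd * 365))        # yearly cost in dollars
    # Tokens per day the same money would buy from the API at $3/MTok.
    print(round(api_tokens_for_budget(daily_electricity_usd, 3.0)))
```

That lands at roughly 480k tokens/day for the same money, close to what the local box would generate at 6 tok/s, which is why the two come out in the same ballpark before hardware costs.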

Self hosted only makes economic sense if your priority is being in control / avoiding surveillance.

That's true, there's a lot of places where power is considerably more expensive than $0.20 USD/kWh. But also the 600W figure assumes that it's fully loaded 24x7x365.

A system that draws 600W with all CPU cores, RAM and a few 3090-class GPUs maxed out might draw only 90W or thereabouts when idle at 0.00 unix load.

If we say: (600 * 24 * 31)/1000 = 446kWh in a month at full load 24 hours a day

But it could be less, such as: (90 * 12 * 31)/1000 = 33.48 kWh of idle time in a month, and 223kWh of "full load" 600W time in a month, if it's at full load only 12 hours a day.

If you're the only user accessing it and you only "use" it 12 hours a day, that cumulative yearly dollar figure would fall to about 58% of the full-load number (roughly 257 kWh a month instead of 446 kWh). Or even less if a person is using it in bursts and intermittently throughout an 8 hour workday.
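Here's the duty-cycle arithmetic as a small sketch, using the 600 W load and 90 W idle figures from above. The function name and the two-state (full load or idle, nothing in between) model are simplifying assumptions.

```python
def monthly_kwh(load_watts: float, idle_watts: float,
                load_hours_per_day: float, days: int = 31) -> float:
    """Monthly energy for a machine that is either at full load or idle."""
    if not 0 <= load_hours_per_day <= 24:
        raise ValueError("load_hours_per_day must be between 0 and 24")
    idle_hours_per_day = 24 - load_hours_per_day
    watt_hours = (load_watts * load_hours_per_day
                  + idle_watts * idle_hours_per_day) * days
    return watt_hours / 1000.0

if __name__ == "__main__":
    full = monthly_kwh(600, 90, 24)   # loaded 24 hours a day
    half = monthly_kwh(600, 90, 12)   # loaded 12 hours, idle 12 hours
    eight = monthly_kwh(600, 90, 8)   # loaded for an 8 hour workday
    for label, kwh in [("24h", full), ("12h", half), ("8h", eight)]:
        print(label, round(kwh, 2), "kWh,", round(100 * kwh / full), "% of full load")
```

Multiply by your own per-kWh rate to get dollars; the idle draw matters more the less you actually use the machine.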

The usage is irrelevant if we're interested in cost per token. If you use it half as much, you get half as many tokens at half the cost. It's still $5.56 in electricity per million output tokens either way (using $0.20/kWh, adjust accordingly if you have cheaper electricity). If you use the API, you also pay half as much if you use half as much.
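For reference, the $5.56 figure falls out of a calculation like the one below. It assumes 600 W while generating and 6 tok/s, and it ignores idle draw entirely (which is what makes the per-token cost independent of how much you use the machine). The function name is made up for illustration.

```python
def electricity_cost_per_mtok(watts: float, tokens_per_second: float,
                              price_per_kwh: float) -> float:
    """Electricity cost to generate one million output tokens, ignoring idle power."""
    seconds = 1_000_000 / tokens_per_second
    kwh = watts / 1000.0 * seconds / 3600.0
    return kwh * price_per_kwh

if __name__ == "__main__":
    print(round(electricity_cost_per_mtok(600, 6, 0.20), 2))  # 5.56
    print(round(electricity_cost_per_mtok(600, 6, 0.10), 2))  # 2.78
```

At $0.10/kWh this is a little under the $3/MTok API price quoted earlier in the thread; at $0.20/kWh it's nearly double.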

> person is using it in bursts and intermittently throughout an 8 hour workday.

You can’t do that with 6 tps, though.

I think that's the biggest difference for most. If you can amortize the hardware costs, then 'burst usage' is cheaper at home to a degree, because you are paying a fixed monthly rate otherwise. Overall though, for most people it is likely cheaper to use the cloud than to run at home, but it really depends on what you want.

> because you are paying a fixed monthly rate elsewise

No, you would pay usage based rates with API, in this case. I have exactly one fixed monthly rate for the 6 AI models I have tokens available for.

> But also the 600W figure assumes that it's fully loaded 24x7x365.

It isn't 100% efficient. Even the best PSUs aren't.

Lots of people have solar. Green AI, imagine that!

If only there was a magical place where geothermal and hydroelectric are ubiquitous and the weather is cold enough that no one is going to be complaining about free heating.

The largest geothermal plant in the world is only 1.5GW, in the United States, which is over double all the plants combined in Iceland. The second largest is 1/3 that, in Mexico. [1]

There is no "ubiquitous" geothermal where there is also high power usage. Data centers have to go where power is, not where it could be.

[1] https://en.wikipedia.org/wiki/List_of_geothermal_power_stati...

Related, it should surprise no-one that the tech giants are interested in nuclear [1], including small reactors [2], rather than waiting for the utility monopolies [3] to raise an arm and actually generate more power [4].

[1] https://www.cnbc.com/2025/03/12/amazon-google-and-meta-suppo...

[2] https://www.sciencenews.org/article/small-modular-nuclear-re...

[3] https://floodlightnews.org/fraud-and-corruption-on-rise-at-u...

[4] https://decarbonization.visualcapitalist.com/animated-70-yea...

To be fair, Vancouver is such a magical place in terms of electrical cost, but the cost of living and real estate are otherwise through the roof, with decrepit and nasty (would need $100k in renovations immediately if you're not treating it as a teardown) single family detached homes on the east side of the city selling for 3.2 million.

Yeah there's a reason our datacentres are in Kamloops, cheap housing and a big ass river right next to it. It even gets decently cold in the winter so you can save on cooling.

There's also tons of opportunity to build them out in former pulp mill towns on Vancouver Island that have big interconnects or dedicated generation.

You'd have to be an idiot to put a datacentre in Vancouver, or have fuck-off scale monopoly money, which is probably why Telus is doing it.

Shhh don't forget we have a water shortage. But it is nice to have electricity wrapped into my relatively cheap basement suite rent ;)

You aren't, perchance, from Iceland, are you?