Hacker News

Where I live, prices are often higher than 20c/kWh, but let's take your example and halve it (10c/kWh), so it's ~$1.40/day or ~$500/year.

On OpenRouter, the cheapest GLM 5.2 provider costs $3/MTok (at 44 tps). Assuming most use is output tokens, that ~$1.40/day still buys the equivalent of about 450k tokens/day, so we're in the same ballpark, but without the capex for two 3090s and the machine.

Self-hosting only makes economic sense if your priority is being in control / avoiding surveillance.
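
For anyone who wants to plug in their own numbers, here is a minimal sketch of that comparison. It assumes a 600 W machine running flat out all day (the power figure discussed in the replies), 10c/kWh, and the $3/MTok API price; the function names are made up for illustration.

```python
# Rough comparison: what does a day of electricity buy in API tokens?
# All inputs are assumptions taken from the thread, not measurements.
POWER_W = 600                # assumed full-load draw of the box
PRICE_PER_KWH = 0.10         # USD, the halved rate used above
API_PRICE_PER_MTOK = 3.00    # USD per million tokens


def daily_electricity_cost(power_w, price_per_kwh, hours=24.0):
    """USD spent on electricity for `hours` at a constant `power_w` draw."""
    return power_w / 1000 * hours * price_per_kwh


def tokens_for_budget(budget_usd, price_per_mtok):
    """Number of API tokens that `budget_usd` buys at `price_per_mtok`."""
    return budget_usd / price_per_mtok * 1_000_000


daily_usd = daily_electricity_cost(POWER_W, PRICE_PER_KWH)
yearly_usd = daily_usd * 365
equivalent_tokens = tokens_for_budget(daily_usd, API_PRICE_PER_MTOK)

print(f"electricity: ${daily_usd:.2f}/day, ${yearly_usd:.0f}/year")
print(f"same money on the API: {equivalent_tokens:,.0f} tokens/day")
```

Hardware cost is deliberately left out, so this only covers the running-cost side of the argument.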

walrus01 13 hours ago

That's true, there are a lot of places where power is considerably more expensive than $0.20 USD/kWh. But also the 600W figure assumes that it's fully loaded 24x7x365.

A system that draws 600W under max usage of all CPU cores, RAM, and a few 3090-class GPUs might draw only 90W or so when idle at 0.00 Unix load.

If we say: (600 * 24 * 31)/1000 = 446kWh in a month at full load 24 hours a day

But it could be less. If it's at full load only 12 hours a day, that's (600 * 12 * 31)/1000 = 223.2 kWh of "full load" 600W time in a month, plus (90 * 12 * 31)/1000 = 33.48 kWh of idle time, or about 257 kWh in total.

If you're the only user accessing it and you only "use" it 12 hours a day, that cumulative yearly dollar figure would be almost halved. Or even less if a person is using it in bursts and intermittently throughout an 8 hour workday.
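
The same estimate as a small sketch, in case anyone wants to try other duty cycles. The 600W and 90W figures are the rough numbers above, not measurements, and `monthly_kwh` is just a name picked for illustration.

```python
# Monthly energy for a box that is either at full load or idle.
# 600 W load and 90 W idle are the rough figures from this comment.

def monthly_kwh(load_w, idle_w, load_hours_per_day, days=31):
    """Return (load_kwh, idle_kwh) for one month of `days` days."""
    idle_hours_per_day = 24 - load_hours_per_day
    load_kwh = load_w * load_hours_per_day * days / 1000
    idle_kwh = idle_w * idle_hours_per_day * days / 1000
    return load_kwh, idle_kwh


full_load, full_idle = monthly_kwh(600, 90, load_hours_per_day=24)
half_load, half_idle = monthly_kwh(600, 90, load_hours_per_day=12)

always_on_total = full_load + full_idle
half_day_total = half_load + half_idle

print(f"24 h/day at load: {always_on_total:.1f} kWh/month")
print(f"12 h/day at load: {half_day_total:.1f} kWh/month "
      f"({half_day_total / always_on_total:.0%} of the always-on figure)")
```

Multiply the result by your local price per kWh to get the monthly bill.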

nearbuy 4 hours ago

The usage is irrelevant if we're interested in cost per token. If you use it half as much, you get half as many tokens at half the cost. It's still $5.56 in electricity per million output tokens either way (using $0.20/kWh, adjust accordingly if you have cheaper electricity). If you use the API, you also pay half as much if you use half as much.
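
A quick sketch of where that per-token figure comes from, assuming the 600W draw and the roughly 6 tps generation speed mentioned elsewhere in the thread. The function name is hypothetical, and idle power draw is ignored.

```python
# Electricity cost of generating one million tokens at a constant power draw.
# 600 W and 6 tokens/s are assumptions taken from the thread.

def electricity_cost_per_mtok(power_w, tokens_per_sec, price_per_kwh):
    """USD of electricity needed to generate 1,000,000 tokens."""
    seconds = 1_000_000 / tokens_per_sec      # time to generate 1M tokens
    kwh = power_w / 1000 * seconds / 3600     # energy used in that time
    return kwh * price_per_kwh


cost_at_20c = electricity_cost_per_mtok(600, 6, 0.20)
cost_at_10c = electricity_cost_per_mtok(600, 6, 0.10)

print(f"$0.20/kWh: ${cost_at_20c:.2f} per million tokens")
print(f"$0.10/kWh: ${cost_at_10c:.2f} per million tokens")
```

The result scales linearly with the electricity price and inversely with tokens per second, so a faster setup at the same wattage is proportionally cheaper per token.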

wqaatwt 11 hours ago

> person is using it in bursts and intermittently throughout an 8 hour workday.

You can’t do that with 6 tps, though.

AbsurdCensor 11 hours ago

I think that's the biggest difference for most. If you can amortize the hardware costs, then 'burst usage' is cheaper at home to a degree, because you are paying a fixed monthly rate elsewise. Overall, though, for most people it is likely cheaper to use the cloud than to run at home, but it really depends on what you want.

nomel 6 hours ago

> because you are paying a fixed monthly rate elsewise

No, you would pay usage-based rates with the API in this case. I have exactly one fixed monthly rate for the 6 AI models I have tokens available for.

re-thc 7 hours ago

> But also the 600W figure assumes that it's fully loaded 24x7x365.

It isn't 100% efficient. Even the best PSUs aren't.