Hacker News

> Having a machine that can run some modest local LLMs, like the Gemma 4 12B, is really worth it.

Cloud models are (much) faster, they don't consume so much power/generate heat, they have much bigger (LLM) context, they're much more precise and they have a much wider (engineering) context of the given problem.

Except privacy and use cases that are blocked by cloud models (e.g. reverse engineering), local LLMs are currently an expensive toy.

When I try to program with a local LLM (I'm on a 32/128 GB system), I end up wasting time compared to a cloud LLM.

dofm a day ago [ - ]

Again, I would not argue against any of this.

And I can't say that I won't switch to openrouter (even just for the same models) at some point.

But one of the things I have found about my own learning process is that some lessons only come to you when you make yourself available to them. And if that means doing things the difficult way, that is what you should do.

wahnfrieden a day ago [ - ]

Difficult... and wastefully expensive

sanderjd a day ago [ - ]

Seems like an investment into building expertise, which is likely to have high ROI in the future, rather than a wasteful cost.

dofm a day ago [ - ]

I mean, it's a (secondhand) computer I bought for other tasks (processing very large photos, compiling large apps quickly). It's running all the time. It can also run LLMs when I want to.

The rest of my life is ultra-frugal so I am relaxed about this.

_puk a day ago [ - ]

Don't bite. You're right.

Having spent a good weekend learning how to perform latent-steering through playing with pytorch and a local Gemma4 model, there is no way I could have grokked any of that in the way I did without hands-on time.

This is on an M3 Max 36GB I've had for a couple of years. No further outlay needed.

monkmartinez a day ago [ - ]

My thinking is totally aligned with yours, perhaps it's because I am trying to do a second act at almost 50, from blue-collar to white-collar office work. I have no formal degree, but I have been hobby programming for 20 years. I have made a habit of "letting myself be available to all lessons"... the localllama group has made this journey really fun if nothing else. I have learned an ABSOLUTE ton from this era!

dofm a day ago [ - ]

I have been contemplating a move in the opposite direction because I have just been exhausted and depressed, so for me, really learning this stuff this way has been about managing those feelings, about a sense of pride and ownership of my processes.

I don't know if it has changed my mind about a career change but as I am sure you can understand, I no longer feel like I am running away defeated.

My very best wishes to you :-)

moffkalast a day ago [ - ]

People pay thousands for model trains, everyone needs a hobby.

dofm a day ago [ - ]

Training models vs modelling trains

moffkalast 11 hours ago [ - ]

Ah yes, the EMD0E9-30B-Union-Pacific.gguf

fragmede 7 hours ago [ - ]

I'm sending Codex-gpt-5.5-cyber.stl to my printer right now!

Shorel 2 hours ago [ - ]

From your post I can only perceive the instinct to pick a side, and trying to make sure it is the "winning side". But the truth is far more nuanced. I have access to both paid and local models, and even if slower, the local models have been far more educational about how these technologies are put together, and what is required for local computing to thrive again. Paid models will not suddenly disappear just because I play with glm-4.6 on Ollama. At the same time, my work pays the cloud subscription and I use the cloud models to perform the tasks my work requires. There's no need to choose one side.

sanderjd a day ago [ - ]

> currently

The interesting question is whether that gap will narrow, and if so, how much, and on what timescale.

The exact answer to this question is not knowable, but if you are the kind of person who comes to a site called "hacker news", and you think there is a nonzero chance that the answer is that yes, the gap will narrow and this won't always be an expensive toy, then now seems like a pretty great time to get in the game and start exploring the capabilities.

Abishek_Muthian 17 hours ago [ - ]

I agree completely. I think local AI is best limited to purpose-built SLMs; all this craze around running quantized coding LLMs has taken the attention off SLMs.

icedchai 4 hours ago [ - ]

Same. Local LLMs are fun to experiment with, but when I want generated code of a sufficient quality, I use a cloud LLM.

AlpacaJones a day ago [ - ]

The key word there is 'currently'.

smt88 a day ago [ - ]

Economies of scale are a fact of nature and aren’t going to be subverted in the future by even the most advanced local models.

kennywinker a day ago [ - ]

Which is of course why, if you want to render 3d scenes to play a video game, you have to rent time on a mainframe system. I don’t see that changing ever - it’s just economies of scale!

(sarcasm, btw)

Gigachad 21 hours ago [ - ]

The economies of scale gains are lost because you still have a middle man hosting provider who wants to profit too.

Over the long term it's always been better to buy than to rent. Even if the rental option technically uses the GPUs more efficiently, buying means you don't have to pay some hosting provider's profit margin.

Dylan16807 12 hours ago [ - ]

If the hosting provider can fit 1000 users onto 100 GPUs, that's enough for quite nice margins and being far cheaper than buying your own GPU.

And for users that aren't running multiple agents 24/7, you should be able to fit a good user:GPU ratio.
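
A rough sketch of that sharing arithmetic, for anyone who wants to plug in their own numbers. Every figure here is a made-up assumption (the amortized GPU cost, the 50% markup, the function name), not real pricing from any provider:

```python
def hosted_cost_per_user(gpu_monthly_cost, gpus, users, margin):
    """Monthly price per user when `users` share `gpus` and the provider adds `margin`."""
    shared_cost = gpu_monthly_cost * gpus / users
    return shared_cost * (1 + margin)


# All figures are hypothetical, chosen only to make the arithmetic easy to follow.
GPU_MONTHLY_COST = 50.0  # assumed amortized cost of one GPU per month
MARGIN = 0.5             # assumed 50% provider markup

# Buying your own GPU: one user carries the full cost of one GPU, no markup.
owned = GPU_MONTHLY_COST

# 1000 users sharing 100 GPUs, as in the comment above.
shared = hosted_cost_per_user(GPU_MONTHLY_COST, gpus=100, users=1000, margin=MARGIN)

# Same provider, but every user keeps a whole GPU busy, so nothing is shared.
saturated = hosted_cost_per_user(GPU_MONTHLY_COST, gpus=100, users=100, margin=MARGIN)

print(f"own GPU:              {owned:.2f} per month")
print(f"hosted, 10 users/GPU: {shared:.2f} per month")
print(f"hosted, 1 user/GPU:   {saturated:.2f} per month")
```

Under these assumptions the hosted price only beats ownership while the user:GPU ratio stays high enough to cover the markup, which is why the ratio is the number worth arguing about.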

Gigachad 11 hours ago [ - ]

Maybe. The economics work out better than for game streaming. When I looked into game streaming, it ended up being cheaper to buy over the long term. Games, though, tend to use 100% of the hardware for hours, tend to all be played at the same hours of the day, and have to be hyper-local for latency reasons. LLMs don’t have those issues.

oceanplexian a day ago [ - ]

Things can get both more expensive and cheaper at scale, hence the term.

For example (and relevant to AI), I can generate electricity on my roof at $0.20-0.25/kWh, batteries included. In California the electric utility can’t offer it cheaper than $0.30-0.50/kWh. Therefore at scale, electricity is actually more expensive.

There are many such examples.

Dylan16807 12 hours ago [ - ]

Apples and oranges. The utility uses a weird conflated fee that combines the price of the electricity and the price of connecting your house to the grid. If they split it up, your marginal price per kWh would be much less.
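
To illustrate the conflation, here is a minimal sketch of how a fixed grid fee folded into the per-kWh price inflates the apparent rate. The fee, the marginal rate, and the usage levels are all hypothetical, not actual California tariffs:

```python
def blended_rate(fixed_fee, marginal_rate, kwh):
    """Average price per kWh when a fixed grid fee is folded into the per-kWh price."""
    return (fixed_fee + marginal_rate * kwh) / kwh


# Hypothetical figures for illustration only.
FIXED_FEE = 60.0      # assumed monthly cost of the grid connection
MARGINAL_RATE = 0.15  # assumed cost of one additional kWh

for kwh in (200, 300, 600):
    rate = blended_rate(FIXED_FEE, MARGINAL_RATE, kwh)
    print(f"{kwh} kWh/month: blended {rate:.2f}/kWh vs marginal {MARGINAL_RATE:.2f}/kWh")
```

With these numbers the blended rate always sits above the marginal one and falls as usage rises, so comparing it directly against a rooftop cost per kWh mixes two different things.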

sanderjd a day ago [ - ]

Yeah, I think the fallacy here is the conflation of scale and centralization.

Right now, there is way more scale in centralized AI than there is at the edge. But that could flip. I'd still probably put the probability that it will under 50%. But I'd also put it above zero!

KingMob 8 hours ago [ - ]

Setting aside that very little about economics rises to the level of "facts of nature" like physics...

What makes you so certain that economies of scale won't work the opposite way you imagine? E.g., if model improvement tapers off, but RAM costs decline (hard to believe atm, but historically likely), then eventually everyone will be able to run SOTA models on their personal hardware.

Heck, even if model sizes simply grow more slowly than RAM costs decrease, the same would happen.
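
That race between model size and RAM cost can be sketched as a compound-growth crossover. The growth rates, the memory sizes, and the function name below are illustrative assumptions, not forecasts:

```python
import math


def years_until_affordable(required_gb, affordable_gb, ram_growth, model_growth):
    """Years until affordable RAM catches up with model size, or None if it never does.

    ram_growth:   yearly growth in the GB a fixed budget buys (1.0 means it doubles)
    model_growth: yearly growth in the memory a SOTA model needs
    """
    if affordable_gb >= required_gb:
        return 0.0
    ratio = (1 + ram_growth) / (1 + model_growth)
    if ratio <= 1:
        return None  # models grow at least as fast as RAM gets cheaper
    return math.log(required_gb / affordable_gb) / math.log(ratio)


# Hypothetical starting point: a model needing 1024 GB, a budget that buys 128 GB.
tapered = years_until_affordable(1024, 128, ram_growth=1.0, model_growth=0.0)
growing = years_until_affordable(1024, 128, ram_growth=0.3, model_growth=0.1)
never = years_until_affordable(1024, 128, ram_growth=0.1, model_growth=0.3)

print("model growth tapers off:", tapered)
print("models grow, RAM cheapens faster:", growing)
print("models outpace RAM:", never)
```

The only thing that matters in this toy model is the ratio of the two growth rates: as long as it is above 1, the gap closes eventually, and the open question is how many years that takes.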

sanderjd a day ago [ - ]

... said the IBM executive to a young Bill Gates.

bogeholm a day ago [ - ]

> Cloud models […] don't consume so much power/generate heat

You do realize the cloud is just someone else’s computer, right? Power goes in, tokens and heat come out - just in another place.

actionfromafar a day ago [ - ]

The cloud computers produce more tokens per watt. That said, if you have a computer at home running 24/7 for other reasons and you also can use it for some LLM work, why not.

psychoslave a day ago [ - ]

Anything done locally will likely come at a higher cost and, at scale, with less energy efficiency and convenience, and with less opportunity to fine-tune and engineer deeply across a wider range of issues.

That's never the point of keeping local alternatives though.

dofm a day ago [ - ]

Right.

For me this dates all the way back to installing Slackware 1.0 (0.99pl12!) on an offline 486SX rather than just using the internet-connected workstations in the lab.

Here, I already had a Mac that was powerful enough to run a local LLM, so now I do, because I can.
