> I hate being told what technology I can and can't use

Ever since the original GPT-2 "it's too powerful to release!" episode, I've realized that the current state of open models represents what we really have access to.

It's shocking to me how many people on HN, who engage in long conversations about LLMs and AI, have never actually run a model on their own hardware.

All you need is a reasonably good MacBook Pro/Mac Studio or an RTX [3-5]090 and you can run useful models in the >= 30 tokens/second range (much higher if you choose the GPU path). The difference between what you can run on this hardware and what you can run on hardware that costs 2-5x as much is not that big. Don't be fooled by people on Twitter/X claiming you need some outrageous setup.

It's also increasingly clear that frontier models are nowhere near pushing the limits of efficiency. Quantization, MoE, and other techniques have improved dramatically even in the last year.

For work, of course use OpenAI/Anthropic models, but for anything personal, anyone who considers themselves a "real engineer" should be running local models, using open harnesses and seeing what they can accomplish with these.

Even if open releases slow down or even stop, we have the foundation, right now, for smart engineers to squeeze something quite useful out of. Hopefully we'll one day figure out how to train large models in a federated way. But either way: not your weights, not your inference.

let's be genuine here: those local models are nowhere near the capabilities of true modern LLMs like Codex 5.5 and Fable 5

but I also don't doubt that in a few years' time models with those benchmark scores will be able to run locally

still many many breakthroughs to be had

Personally I am fine with the SOTA from last year if I can run it on my hardware and control who gets access to my data and history. I don’t really care that it could be marginally better using a model I cannot control on someone else’s server.

I am not fine with that. I have ranted about this before, but until recently the SOTA models were not intelligent enough for most of my work.

Yesterday Fable 5 finally solved a nontrivial problem I had (after working on it for a few hours), and I went to bed excited. Waking up to find that Fable 5 is not available anymore, I guess I should feel happy I have the code it produced yesterday. But frankly I feel like a child having their candy taken from them.

We need openly available models as smart as the current US proprietary ones. If intelligence like that becomes common property, I foresee a better future for humankind!