Pay $0 to run a local model, or use a cheap DeepSeek V4 model via their API, which is close to free per million tokens.

These prices are just going to get raced to $0.

I used to have a $20/mo ChatGPT subscription and now I spend $12 per year using Kimi models on OpenRouter, and that's with zero-data-retention-only providers (some models sometimes have free providers with scary tracking). Maybe I just don't use that many tokens: I don't fill the context with more than what's needed for a specific request. But it goes to show how these subscriptions can be an absolute ripoff. The thought of spending 200x that is insane to me.

The beauty of your approach: when people are not paying for an expensive subscription, they can decide to use models less and not feel like they are leaving money on the table.

Maybe. But for now it's fascinating how $200/month has kind of become a normal tier.

It's similar to how AirPods normalised all of us having $300+ headphones. All of us would have scoffed at the idea a decade ago.

Many people here spent a lot more than $300 on headphones long before AirPods appeared.

Those were hobbyists, audiophiles, professionals, artists (recording, performing, etc.).

They are talking about a much larger group of people.

I think OP meant noise-cancelling headphones, which were fairly ubiquitous in tech circles in open offices before Apple launched AirPods.

AirPods Inc. would rank very high in the S&P 500 as a standalone business.

I had a really nice Sennheiser before that, too. But now you hop on the subway and everybody sports one.

But it is not all about cost: models like DeepSeek v4 flash (I use the US company Fireworks.ai and also buy tokens directly from DeepSeek) are very fast, with very low latency while working.

Would you want to use a text editor that updates the screen very slowly? It's kind of the same thing when using agentic systems as coding assistants: you don't want a 'sluggish' experience.

I mostly have long-running autonomous tasks, so it doesn't matter how slow inference is. If I optimize for latency, it means I'm turning into the limiting factor.

The Sony WH-1000XM series and the Bose QC35 were the standard quality headphones years before AirPods were a thing, and both retailed at $300+.

Of course, premium headphones existed before. I have a WH-1000XM4 sitting right next to me.

But your aunt Josie didn't have one. Now Apple is selling 80 million units / year and the ~$300 price tag has become normal. Before that, most people had headphones that were 10 times cheaper.

$300 isn’t what AirPods cost though. You can get a pair of AirPods 4 for $129 on Apple.com, and I presume that is still the most popular model. If you’re paying ~$300, you are buying premium headphones.

The base model where I live (Central Europe) is $194. The Pro is $357. The Max is $779.

I just averaged it out.

Not everyone can run local models. It is also expensive, and the hardware will be outdated soon as the models evolve.

Not while the hardware required to run a local model at an acceptable speed costs way more than $200.

Guess what, the big players are hoarding all the RAM and GPUs so that other people can't afford decent hardware. It's working out beautifully for them!

> Not while the hardware required to run a local model at an acceptable speed costs way more than $200

It's $200/month. You have to take into account energy costs and all the rest of a system, but if you break even within 1-2 years ($2400-$4800) it'd be a pretty good deal. And $4000 buys you a pretty decent system.
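
For anyone who wants to plug in their own numbers, here is a rough sketch of that break-even arithmetic in Python. The function name and the $30/month energy figure are made up for illustration; only the $200/month and $4000 figures come from the comment above. It also ignores resale value and whatever you'd still spend on API tokens.

```python
import math

def break_even_months(hardware_cost, subscription_per_month, running_cost_per_month=0.0):
    """Return the number of whole months after which buying hardware
    becomes cheaper than paying the subscription, or None if it never does.

    running_cost_per_month is an assumed figure for energy and upkeep.
    """
    monthly_saving = subscription_per_month - running_cost_per_month
    if monthly_saving <= 0:
        # Running the box costs at least as much as the subscription.
        return None
    return math.ceil(hardware_cost / monthly_saving)

# $4000 system vs. a $200/month subscription, ignoring energy
print(break_even_months(4000, 200))      # 20

# Same system with a hypothetical $30/month in electricity
print(break_even_months(4000, 200, 30))  # 24
```

So under these assumptions a $4000 box pays for itself in roughly 20 to 24 months, which lands inside the 1-2 year window.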

Sure, if you're going to keep using it long term.

But it's a hefty upfront investment for people who just want to experiment. The good thing about $200/month subscriptions is that you can cancel them any time and cut your losses. Not so with a $4000 computer that loses half of its resale value as soon as you plug it in.

I think the current sweet spot for people who don't already own a high-end gaming PC is to rent a server with a beefy GPU from Hetzner et al. and run local models there.