Best way to think of it is this: Right now you are not the customer. Investors are.

The money people pay in monthly fees to Anthropic, even for the top Max sub, likely doesn't come close to covering the energy & infrastructure costs of running the system.

You can prove this to yourself by just trying to cost out what it takes to build the hardware capable of running a model of this size at this speed and running it locally. It's tens of thousands of dollars just to build the hardware, not even considering the energy bills.

So I imagine the goal right now is to pull in a mass audience and prove the model, to get people hooked, to get management and talent at software firms pushing these tools.

And I guess there are some in management and the investment community who think this will come with huge labour cost reductions, but I think they may be dreaming.

... And then.. I guess... jack the price up? Or wait for Moore's Law?

So it's not a surprise to me they're not jumping to service individual subscribers who are paying probably a fraction of what it costs them to run the service.

I dunno, I got sick of paying the price for Max, so now I use the Claude Code tool but redirect it to DeepSeek's API and use their (inferior but still tolerable) model pay-per-use. It's probably 1/4 the cost for about 3/4 the product. It's actually amazing how much of the intelligence is built into the tool itself rather than the model. It's often incredibly hard to tell the difference between DeepSeek output and what I got from Sonnet 4 or Sonnet 4.5.

I've been playing around with local LLMs in Ollama, just for fun. I have an RTX 4080 Super, a Ryzen 5950X with 32 threads, and 64 GB of system memory. A very good computer, but decidedly consumer-level hardware.

I have primarily been using the 120b gpt-oss model. It's definitely worse than Claude and GPT-5, but not by, like, an order of magnitude or anything. It's also clearly better than ChatGPT was when it first came out. Text generates a bit slowly, but it's perfectly usable.

So it doesn't seem so unreasonable to me that costs could come down in a few years?

It's possible. Systems like the AMD AI Max 395+ with 128 GB of unified RAM get close to being able to run good coding models at reasonable speeds, from what I hear. But, no, I'm given to understand they couldn't run e.g. the DeepSeek 3.2 model full size because there simply isn't enough GPU RAM still.

To build out a system that can, I'd imagine you're looking at what... $20k, $30k? And then that's a machine that basically serves one customer -- meanwhile a Claude Code Max or Codex Pro sub is $200 USD a month.

The math doesn't add up.
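To make the "doesn't add up" concrete, here's a back-of-envelope sketch. The numbers are my assumptions from above (a $20k-$30k build, a $200/month sub), not anyone's actual costs:

```python
# Back-of-envelope break-even: hypothetical numbers, not real provider costs.
hardware_cost = 25_000   # USD, midpoint of the $20k-$30k guess for a build
                         # that can hold a full-size frontier model
subscription = 200       # USD/month for a Max/Pro-tier plan

months_to_break_even = hardware_cost / subscription
print(months_to_break_even)  # 125 months, i.e. over 10 years -- and that
                             # ignores electricity, and the hardware being
                             # obsolete long before then
```

And that's break-even on hardware alone for a single dedicated user; providers only make the math work by batching many customers onto the same GPUs.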

And once it does add up, and these models can reasonably be run on lower-end hardware... then the moat ceases to exist and there'll be dozens of providers. So the valuation of e.g. Anthropic makes little sense to me.

Like I said, I'm using the Claude Code tool/front-end pointed at the pay-per-use DeepSeek platform API; it costs a fraction of what Anthropic is charging, and feels to me like the quality is about 80% there... So...

> But, no, I'm given to understand they couldn't run e.g. the DeepSeek 3.2 model full size because there simply isn't enough GPU RAM still.

My RTX 4080 only has 16 GB of VRAM, and gpt-oss 120b is 4x that size. It looks like Ollama is actually running ~80% of the model off of the CPU. I was led to believe this would be unbearably slow, but it's really not, at least with my CPU.
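The ~80% figure falls straight out of the sizes. A rough sketch, assuming the "4x the VRAM" figure above (the exact on-disk size depends on quantization):

```python
# Rough sketch of the CPU/GPU split when a model overflows VRAM.
# Sizes are the assumptions from the comment above, not measurements.
model_size_gb = 64   # gpt-oss 120b, ~4x the card's VRAM as stated
vram_gb = 16         # RTX 4080 (Super) VRAM

cpu_fraction = (model_size_gb - vram_gb) / model_size_gb
print(cpu_fraction)  # 0.75 -- in the ballpark of the ~80% Ollama offloads
```

Since gpt-oss is a mixture-of-experts model, only a fraction of those weights are active per token, which is part of why CPU offload stays tolerable.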

I can't run the full sized DeepSeek model because I don't have enough system memory. That would be relatively easy to rectify.

> And once it does add up, and these models can reasonably be run on lower-end hardware... then the moat ceases to exist and there'll be dozens of providers.

This is a good point and perhaps the bigger problem.

[deleted]

You are bang on.

Every AI company right now (except Google, Meta, and Microsoft) has its valuation based on the expectation of a future monopoly on AGI. None of their business models, today or on the foreseeable horizon, are even cash-flow positive, let alone world-dominating. The continued funding rounds all appear to be based on the expectation of becoming the sole player.

The continuing advancement of open source / open weights models keeps me from being a believer.

I’ve placed my bet and feel secure where it is.