Something I haven’t been able to reconcile: If AI makes software easier to create, that will drive the price down. How are software companies going to make enough revenue to pay for AI, when the amount of money being spent on AI is already multiples of the current total global expenditure on software? This demand for RAM is built on a foundation of sand, there will be a glut of capacity when it all shakes out.
> This demand for RAM is built on a foundation of sand
RAM is built on a foundation of sand.
I laughed.
An analyst says it would require a new $35/month subscription from every iPhone user, or a new $180/month subscription from every Netflix subscriber.
Claude Max subscriptions have gone up, but do you think every Netflix user will pay for one?
https://www.tomshardware.com/tech-industry/artificial-intell...
> This demand for RAM is built on a foundation of sand
Not exactly.
LLMs are already quite useful today if you use them as a tool, so they are there to stay. The remaining problem is scalability, a.k.a. how to make LLMs cheap to use.
But scalability is not really a requirement when you look at the bigger picture. If smaller software companies/projects can't afford to use AI, the bigger ones still might. Eventually they will discover viable use cases for such tech, even if it only serves big firms, e.g. defense, resource extraction, war, finance, etc.
At the other end, if scalability is achieved, LLM products will be cheaper too, so smaller projects can also use them. But of course, if LLM usage is too cheap, then many would-be customers will just build their software projects themselves at home.
Would they be considered useful if users were required to pay the non-subsidized price, though?
Everyone's betting on Jevons paradox
The hope is that AI is "the next semiconductor" and "the next internet".
The usage of LLMs is continuing to increase ~exponentially. I'm going to bet on that rather than some half-baked scenario analysis that only takes into account one scenario and assigns a 100% probability to it.
> The usage of LLMs is continuing to increase ~exponentially
I would like a source for that statement. Additionally, I want to know by who? Because it certainly isn't end users. Inflating token usage doesn't make it any more economically viable if your user base, b2b or not, hasn't increased with it. On the contrary, that is a worse scenario for providers.
> I would like a source for that statement
The recent enterprise revenue numbers of Anthropic
So as I said, a self-interested metric from a party that also controls how many tokens it takes to get a desirable result from their models.
Users are willingly paying for larger volumes of tokens. You are layering your own unproven interpretation onto that. I would have arrived at an opposite interpretation given the available facts. Models are becoming more token efficient for the same task, such as ChatGPT 5.3 versus 5.2 which halved the token count, and capabilities show a log relationship with the number of tokens since o1 preview was revealed in September 2024.
No, you have gone off on a tangent of your own. The person you're responding to is talking about money, and my point is that you're citing a misleading metric. Even if the current user base is paying more for that "exponential token usage", it does not add up to the industry's cost of maintaining and building on this technology, especially since we are not accounting for what that token usage costs the provider. First you cited Anthropic as your source, but now you're talking about OpenAI's ChatGPT, and OpenAI is floundering for a product and a user base while claiming they will become profitable through subscriptions at numbers never before seen in a subscription business model.
> Additionally, I want to know by who?
1. As a consultant, pretty much every company I have worked with in the last 2 years is doing some kind of in-house "AI Revolution": I'm talking about forming "AI Taskforce" teams, holding weekly internal "AI meetings", and pushing AI everywhere and to everyone. Small companies, SMEs, and huge companies. From my observation it is mainly driven by the C-level being obsessed with the idea that AI will replace/uplift people and revenue will grow by either replacing people or launching features 10x quicker.
2. Have you seen software job boards recently? 9/10 (real) job listings have to do with AI. Either it is a full-on AI company (99% a thin wrapper over the Anthropic/OpenAI APIs) or some other SME that needs some AI implementation done. It is truly a breath of fresh air to work for companies that have nothing to do with AI.
The biggest laugh/cry for me is those thin wrappers that go down overnight - think of all the "create your website" companies that became completely useless once Anthropic cut out the middleman and created their own version of exactly that.
It's a trojan horse. They expect the world will get hooked on it.
> If AI makes software easier to create, that will drive the price down.
Supposedly AI drives down the cost of producing software, not the "price".
> How are software companies going to make enough revenue to pay for AI, when the amount of money being spent on AI is already multiples of the current total global expenditure on software?
Currently, the cost of AI is between $20/month and around $200/month per developer.
I think the huge billions you're seeing in the news are the investments in AI companies, who are burning through cash to build out compute infrastructure for both training and serving users.
> This demand for RAM is built on a foundation of sand, there will be a glut of capacity when it all shakes out.
Who knows? What I know is that I need >64GB of RAM to run local models, and that means most people would need to upgrade from their 8GB/16GB setups to do the same. Graphics cards follow mostly the same pattern.
You need >64 GB of DRAM to run local models fast.
You can run huge local models slowly with the weights stored on SSDs.
Nowadays many computers can take e.g. 2 PCIe 5.0 SSDs, which allow a combined read throughput of 20 to 30 gigabytes per second, depending on the SSDs (or 1 PCIe 5.0 + 1 PCIe 4.0, for a throughput in the range of 15-20 GB/s).
There are still a lot of improvements that can be made to inference back-ends like llama.cpp to reach the inference speed limit determined by SSD throughput.
It seems possible to reach inference speeds in the range from a few seconds per token to a few tokens per second.
That may be too slow for a chat, but it should be good enough for an AI coding assistant, especially if many tasks are batched, so that they can progress simultaneously during a single read pass over the SSD data.
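A back-of-envelope sketch of why those speeds follow from the throughput figures above. All numbers here are hypothetical assumptions (a dense ~120B-parameter model vs. a sparse one with ~10B active parameters per token, 8-bit weights, the 25 GB/s read rate mentioned above), and it assumes the active weights must be streamed from the SSD once per token, so the token rate is bandwidth-bound:

```python
# Bandwidth-bound estimate of SSD-streamed inference speed.
# Hypothetical assumption: every token requires reading all active
# weights from the SSD, so tokens/s = read bandwidth / bytes per token.

def tokens_per_second(read_gb_s: float, active_params_b: float,
                      bytes_per_weight: float = 1.0) -> float:
    """Tokens/second when the SSD read is the bottleneck.

    read_gb_s        -- sustained SSD read throughput, GB/s
    active_params_b  -- parameters touched per token, in billions
    bytes_per_weight -- quantization width (1.0 = 8-bit weights)
    """
    bytes_per_token = active_params_b * 1e9 * bytes_per_weight
    return read_gb_s * 1e9 / bytes_per_token

# Dense 120B model, 8-bit, 25 GB/s: ~0.21 tok/s (a few seconds per token)
print(round(tokens_per_second(25, 120), 2))
# Sparse model with ~10B active params: 2.5 tok/s (a few tokens per second)
print(round(tokens_per_second(25, 10), 2))
```

This matches the "few seconds per token to a few tokens per second" range: the outcome is dominated by how many parameters are active per token relative to the SSD's sustained read rate.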
You can do that, but you're going to have rather low throughput unless you have lots of PCIe lanes to attach storage to. That's going to require either a HEDT or some kind of compute cluster.
Batching inferences doesn't necessarily help that much since as models get sparser the individual inferences are going to share fewer experts. It does always help wrt. shared routing layers, of course.
I got a 128 GB MBP, and the current models are fit enough to manage my calendar or do research on the web (very slowly), but not to be the useful coding companions I had hoped for.
> Who knows? What I know is that I need >64GB of RAM to run local models, and that means most people will need to upgrade from their 8Gb/16GB setup to do the same. Graphics cards follow mostly the same pattern.
Depends on how big the models are, how fast you want them to run, and how much context you need for your usage. If you're okay with running only smaller models (which are still very capable in general; their main limitation is world knowledge) making very simple inferences at low overall throughput, you can just repurpose the RAM, CPUs/iGPUs, and storage in the average setup.