They are scaling up, but most of the new capacity will only come online in the late 2027 to 2028 time frame. And regular memory, as in what we use in PCs, is easier to manufacture than HBM. But all the money is in HBM ...
So for every ~4GB of normal DDR5 that you can produce, you can only make 1GB of HBM. But you make multiple times the revenue.
The demand for HBM is not going to go away. LLMs are memory-bandwidth hungry, and we are going to see production going to AI. But also to "lower end" parts like B200s.
That means they are producing multiple times less memory (if we look at normal market demand), but still need to produce more for the memory-bandwidth-hungry market.
We are seeing more products entering the "prosumer/business" market that are also memory-bandwidth hungry. This demand will not go away. It will actually increase as companies move to more localized workloads. There is an issue with data privacy that a lot of companies have to deal with legally.
The lacking ramp-up is not a sign of them being scared of overproduction, it's a realization that 3 companies hold the market in a stranglehold and can scale "slowly". If everybody plays friendly, they can milk this for years.
China is a possible solution, but China does not have the HBM production levels, and it will take years to scale and put a dent in the market. And China is ... allocating a lot to domestic production of AI > HBM ...
The reality is that unless competition (as in China) starts scaling beyond the expected levels, the big 3 have no reason to scale too fast.
And money is not the issue ... have you seen their revenue (and net profit!!) numbers? A few billion is peanuts for them at this point. They simply do not want to scale too fast because that means less milking ... Memory demand is not going to go away. When people talk about the AI bubble popping, it's more in terms of the stock market. The product is here and not going away.
This does suggest a path to improvement, though. A significant factor in the demand for HBM is how expensive the actual GPU chips are, making you want to use the absolute best memory to support them. When there's more competition in GPUs and the memory is actually a lot of the total price, you see things like Apple silicon with LPDDR5 being very popular. You can get a lot of bandwidth out of normal memory if you put in a 256- or 512-bit bus. If we can get more midrange competition, we can focus more manufacturing capacity back on some form of DDR, and lessen the squeeze.
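To put rough numbers on the bus width point: theoretical peak bandwidth is just the bus width in bytes times the transfer rate. A quick sketch, assuming LPDDR5-6400 (my assumption, no specific speed grade is given above) and ignoring real-world efficiency losses; the function name is made up for illustration:

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfer_rate_mt_s: int) -> float:
    """Theoretical peak bandwidth in decimal GB/s.

    bus_width_bits: total width of the memory bus in bits
    transfer_rate_mt_s: mega-transfers per second (e.g. 6400 for LPDDR5-6400)
    """
    bytes_per_transfer = bus_width_bits / 8
    # bytes/transfer * mega-transfers/s = MB/s; divide by 1000 for GB/s
    return bytes_per_transfer * transfer_rate_mt_s / 1000


# Assumption: LPDDR5-6400, picked only as an illustrative speed grade.
LPDDR5_MT_S = 6400

for width in (128, 256, 512):
    print(f"{width}-bit bus: {peak_bandwidth_gb_s(width, LPDDR5_MT_S):.1f} GB/s")
```

So on paper a 512-bit bus at that speed is about 410 GB/s and a 256-bit bus about 205 GB/s, all from "normal" memory. Sustained numbers will be lower than these peaks.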
Until China reveals a fab opening up next week.