Open-weight models let people repurpose existing hardware locally, and there's a lot of it around - far more than the amount of new RAM being supplied. So they put some short-term downward pressure on the price. (But not very much, since these datacenter builds are long-term investments aimed at eventually running far larger models.)

If regular people can repurpose old hardware, so can shared providers, who can extract more value from the hardware and thus afford to pay more.

In a constrained market, supply and demand favor whoever can extract rent most efficiently. Local models only make sense in a world with abundant compute and energy.