OK, here's the thing: you will never truly be able to do this, and the reason is simple logic.
Logically, five people pooling their resources beat one guy.
Therefore datacenters will always win, because they keep the hardware utilized for more of the time.
So forget it.
I always wonder the same thing, but I let logic tell me it's a fantasy: on average you can't outspend a whole group of people making better use of the hardware.
You will get better hardware in the cloud, though: cutting edge will always be cloud.
Laptops/desktops are cheaper per flop than any datacenter hardware by a good order of magnitude.
The problem is that expectations rise in datacenters: hardware, power, security, and availability guarantees cost real money. Then the operator providing these guarantees expects some margin.
You can see this most clearly with "developer desktops": a GCP instance costs about 10x a Hetzner instance, which in turn costs between 5x and 10x the same hardware sitting in the back of an office somewhere. While all of these premiums matter for 24/7 systems under active development, they don't really matter for ephemeral small-scale workloads.
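To make the stacking explicit, here is a minimal sketch that multiplies the two premiums quoted above. The multipliers are the rough estimates from this comment, not measured prices, and `stacked_premium` is just an illustrative helper name.

```python
# Premium multipliers as quoted in the comment (rough estimates, not measured prices).
GCP_OVER_HETZNER = 10
HETZNER_OVER_OFFICE = (5, 10)  # low and high end of the quoted range


def stacked_premium(outer, inner_range):
    """Multiply the outer premium by each end of the inner range."""
    low, high = inner_range
    return outer * low, outer * high


gcp_over_office = stacked_premium(GCP_OVER_HETZNER, HETZNER_OVER_OFFICE)
print(f"GCP vs. office hardware: {gcp_over_office[0]}x to {gcp_over_office[1]}x")
```

Under those estimates the cloud instance lands at roughly 50x to 100x the cost of the same hardware in an office.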
Doesn’t it flip around for small scale? Even paying 100x the cost for something, all in, it’s cheaper to rent for small workloads like 10 min/day.
At 10x you have to be at hours per day before owning wins, and at 5x you’re at about 4h.
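A rough way to check those break-even points, as a sketch under a simplifying assumption that is mine and not the commenter's: owned hardware costs the same whether idle or busy (so its cost is spread over 24 h/day), and renting costs some multiple of that per-hour figure but is billed only for hours actually used. The function name is hypothetical.

```python
HOURS_PER_DAY = 24.0


def break_even_hours_per_day(rental_premium):
    """Daily usage above which owning beats renting.

    Assumes owning costs a fixed amount per day regardless of use, and
    renting costs `rental_premium` times the owned per-hour cost, billed
    only for the hours used. Setting the two daily costs equal gives
    hours = 24 / rental_premium.
    """
    if rental_premium <= 0:
        raise ValueError("rental_premium must be positive")
    return HOURS_PER_DAY / rental_premium


for premium in (100, 10, 5):
    hours = break_even_hours_per_day(premium)
    print(f"{premium:>3}x premium: break-even at {hours:.2f} h/day")
```

With this model, 100x breaks even at about 14 minutes a day, 10x at 2.4 h, and 5x at 4.8 h, which is in the same ballpark as the figures above. Electricity, resale value, and minimum billing increments are ignored.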
Actually, they wouldn't spend the money on datacenter hardware if laptops/desktops really were cheaper.
HBM has way higher bandwidth, and it's not all about flops.
Also, the FP4 flops (inference) are mind-bogglingly high on these things.
Lastly, what you fail to consider is the chip-to-chip bandwidth, which is critical.
The people running these know that networking is just as critical:
all-reduce, etc.
They wouldn't pay if they could get better value elsewhere.
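To illustrate why interconnect bandwidth matters for all-reduce, here is a bandwidth-only sketch for a ring all-reduce, where each device sends about 2(N-1)/N times the payload. The payload size and link speeds are made-up placeholders chosen to show the scaling, not specs of any real hardware, and latency is ignored.

```python
def ring_all_reduce_seconds(payload_bytes, num_devices, link_bytes_per_sec):
    """Bandwidth-only time estimate for a ring all-reduce.

    Each device sends (and receives) 2 * (N - 1) / N times the payload.
    Per-step latency and overlap with compute are ignored.
    """
    if num_devices < 2:
        return 0.0  # nothing to exchange with a single device
    traffic = 2 * (num_devices - 1) / num_devices * payload_bytes
    return traffic / link_bytes_per_sec


# Hypothetical numbers, for illustration only.
PAYLOAD = 10e9     # 10 GB to reduce across devices (assumed)
FAST_LINK = 400e9  # 400 GB/s per device, datacenter-style link (assumed)
SLOW_LINK = 10e9   # 10 GB/s per device, commodity link (assumed)

for name, bandwidth in (("fast", FAST_LINK), ("slow", SLOW_LINK)):
    seconds = ring_all_reduce_seconds(PAYLOAD, 8, bandwidth)
    print(f"{name} link: {seconds:.3f} s per all-reduce")
```

With the same flops on both sides, the slower link spends 40x longer on every synchronization step in this toy setup, so raw compute per dollar does not tell the whole story.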
Just like cloud is "cheaper" than colo/metal, right?
> cutting edge will always be cloud
Don't think anyone was refuting that?
And of course when you pool resources you have access to more resources.
They just mean this part: "where I upgrade hardware in order to upgrade my ai as an alternative to an expensive subscription."
Upgrading local hardware will remain the more expensive alternative to the subscription, regardless of what the relative cost of running the models themselves is. If the local hardware to do so becomes affordable, then the subscription will become even more affordable, not more expensive.
At least for these kinds of mega tasks. For more micro tasks, we will always end up with unutilized local compute we already purchased, which will be "free" since we already paid for it for non-AI reasons (e.g. a gaming GPU while not gaming).