When I was working at sfcompute prior to this, we saw multiple datacenters literally catch fire because the industry wasn't experienced with the power density of H100s. Our training chips just aren't a standard package in the way JBODs are.

Isn't the easy option to spread the computers out, i.e. fill only half of each rack?

A GPU cluster next to my servers has done this: presumably they couldn't get 64A in one rack, so they've got 32A in two. (230V 3-phase.)
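For a rough sense of the budget involved, here's a back-of-the-envelope sketch, assuming 230V is phase-to-neutral on a European-style 400V three-phase feed, unity power factor, and no breaker derating:

    # Hypothetical rack power budget for a 3-phase feed.
    # Assumes 230 V phase-to-neutral and a purely resistive load;
    # real provisioning would also derate the breaker for
    # continuous draw.
    def rack_power_kw(phase_voltage_v, current_per_phase_a, phases=3):
        return phases * phase_voltage_v * current_per_phase_a / 1000

    print(rack_power_kw(230, 64))  # single 64A rack -> ~44.2 kW
    print(rack_power_kw(230, 32))  # each 32A rack   -> ~22.1 kW

Same total power either way, but splitting it across two racks halves the heat each rack has to shed, which is often the real constraint.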

Rack space is at a premium at most data centers.

My info may be dated, but power density has gone up a ton over time. I'd expect a lot of datacenters to have plenty of space but not much power; you can only retrofit so much additional power distribution and cooling into a building designed for a much lower power density.

This is my experience as well. We have 42U racks with only 8 machines in them because we can't get more power circuits to the rack.

Yep, this was the case for us.

I'm more surprised that a data centre will apparently provide more power to a rack than is safe to use.

Adding the compute story would be an interesting follow-up.

Where is that done? How many GPUs do you need to crunch all that data? Etc.

A very interesting and refreshing read, though. Feels like this is more what Silicon Valley is about than just the usual: tf apply, then smile and dial.