This design feels very obvious-in-hindsight. Consolidate power adapters and networking, replace cabling with pluggable slots. It's something similar to what IBM mainframes or Sun cabinets could've looked like. Somehow hardware giants like Dell, HP, SuperMicro, etc didn't make a product like this, even at their peak in 2000s or during cloud boom in 2010s. I wonder why?
Beautiful machine, and fun to see Illumos heart still beating inside!
> Somehow hardware giants like Dell, HP, SuperMicro, etc didn't make a product like this, even at their peak in 2000s or during cloud boom in 2010s.
Not so sure about this one. HCI (Hyperconverged) rack units (where storage and compute live in the same racked systems) and "blade servers" have been a thing for a really long time now; compute sleds aren't what's novel here.
Rack-level DC conversion is also not particularly novel, although underutilized IMO. It was pretty popular in HPC style density applications for a while (see HP/SGI Altix 4000 for a good old example).
What's unique about Oxide is that they went all the way down to the firmware and then back up, rather than doing commodity hardware integration or reselling - for example, you can get something like a Supermicro EVO:Rail, but it will be running VMware, not a fully integrated platform.
The big difference that everyone is missing in this subthread is that Oxide is about the hardware and the software.
There are systems which have similar overall hardware designs, but they are usually integrating a large amount of hardware and software from multiple vendors. Oxide is much closer to "everything is produced by Oxide."
I wrote this back in 2022, and it's still fundamentally relevant today: https://news.ycombinator.com/item?id=30678324
I wish I could edit my post.
Somehow everyone wrote to me about blades. These are not the same, though. Blade servers were mounted into units of 4U, 8U, etc. They occupied a portion of the overall cabinet and still had to do "plumbing" for power and networking behind the chassis to the rest of the cabinet or to the rest of the datacenter. A full-cabinet blade rig would have multiple 8U blade units and some off-the-shelf units for networking, storage, etc. Yes, you could mix and match different components based on your needs, but that also meant that there were extra wires, cables, mounting rails, and more importantly - all these different components ran a mix of software that had to integrate using common denominator protocols and speeds.
Steve rightly mentioned the integration below, and I didn't put it in my message because I kinda assumed that we include software in this discussion too.
HP in 2005 had an army of programmers writing all sorts of firmware and software and another army of hardware engineers, too. They could have made an Oxide computer back then, and it would have sold really well. But they didn't, and none of their competitors did despite this being an obvious product (in hindsight), and THIS is what I find interesting.
Cabling, plumbing, etc, aside, all the "blade servers" I've ever worked with were still glorified IBM PCs. They still have BMCs strapped to legacy interfaces pretending to be decades-old hardware allowing for "headless" operation of a platform originally intended to be a single-user computer on a desk with a monitor and keyboard, etc.
That's what's so cool about Oxide's boxes to me-- the legacy garbage is gone and the strange undefined behavior that is part and parcel of overlapping edge cases will be minimized (and managed, as opposed to used as an excuse by a vendor). Dealing with incompatibilities and strange firmware interactions has made me come to see PC-based servers as a weird opposite of the "Swiss cheese" model. The various layers of interacting hardware, firmware, drivers, and OS act as a kind of "filter" for correct operation. When you swap or add one of these components you get one or more exciting new layers in the stack that, hopefully, have "holes" aligning with the existing ones.
FWIW the original IBM BladeCenter platform was developed before BMCs so it had a much smaller service processor per blade connected over an out of band RS-485 management bus to a chassis management module. It was closer to Oxide than what's being sold today. And then "Windows needs VGA" kicked in.
IBM gear, aside from Thinkpads and the EduQuest PCs, has never been a thing I've gotten to interact with. Their documentation was always very good (which I think implies a certain level of overall competence). Given their massive legacy in datacenters I can imagine their PC-based server gear probably benefited.
Forgot how I learned this but IIUC, blades failed because a single enclosure could use up a datacenter's per-rack weight and power budgets. It turned out that you could compress 48U worth of computers into 8U or so, but only as long as you did not fill the freed space back up with anything, because the cooling/cabling crawlspace would collapse and circuit breakers would trip if you did. It wasn't because they still needed cabling.
This sounds like less of a problem for DCs with bare concrete flooring, but blades did fall out of fashion, so I guess the fraction of DCs with multiple levels or free-access floors was higher than anticipated.
(Also, maybe I'm just being an amateur, but I'd be scared of tolerance stacking with a "grape bunch" design like this. Individually enclosed chassis with cables and cage nuts are a lot more robust against dimensional issues)
Seems to me that making it hyper-converged, with all the storage included, is what makes it make sense. Ultra dense compute alone isn't ideal. The Oxide rack has the features of blades but is hyperconverged with everything. Plus the better integrated software.
Speaking of the army, it's not clear that the extremely narrow feature set of Oxide justifies the engineering effort required.
The storage market used to be dominated by Oxide-style vertical integration and bespoke engineering and almost every vendor has transitioned to modularity over the last 20 years. Pure Storage seems to be doing OK with custom hardware though, so maybe the rest of the industry just has a lack of courage.
> it's not clear that the extremely narrow feature set of Oxide justifies the engineering effort required.
Their customers think so.
(Think governments, security-serious applications, critical workloads, science, energy, financial, etc.)
Some of it may even get sent to labs first for disassembly and inspection. Other times it may never ever be plugged in to an internet connection.
For these customers, the feature set, supply chain ownership, integration and support is everything.
> Somehow hardware giants like Dell, HP, SuperMicro, etc didn't make a product like this,
Dell and HP both have "blades" that plugged into a blade-chassis. The chassis had all the lights out mgmt as well as power/networking integrated so the blade was basically a metal box with compute/memory/storage and it just slid in to the dock.
I am sure that Supermicro had something like this as well.
Cisco does too and there's another hardware virtualization layer below the normal ones (so for example you can have many virtual NICs per actual NIC, etc).
Cisco UCS! A great hardware platform, albeit quite expensive.
Blades basically died out is the thing - AFAIK no one really wanted them and honestly the same is a risk for what Oxide is doing too.
Blades have the basic issue of "how often do you want an unpopulated chassis?" - answer, never.
So really they're solving for replacing a failed piece of hardware.
But how often do you need to do that, what's it worth to you? If it makes sense then the statistical window where it does is tiny.
And if you own more than 1, like an entire rack, then do you even care? Because above some scale you're just going to wheel the rack out rather than go and pull individual units.
Basically the scaling is against you: for a highly manageable bladey rack unit, you've got to be small enough that one server matters, large enough you need the swap out to be low labor, but not so large you could just wait for the rack to go down. And this has to be worth enough to justify the price premium and vendor lock in (because at rack scale you just buy a rack of the cheapest whatever from any vendor and make them compete on price - at one job bringing our computer management in house triggered an immediate 10% price drop because we threatened HP with using another supplier at all).
You're right, even though I did have a good use-case for them. Back in 2000-~2020 I built a "boutique linux hosting" provider, and the Supermicro "Twin^2" servers really fit our use case. We were mostly serving small dedicated servers and were very price sensitive.
Loved the idea of blade servers, but they were targeted to people who needed very high compute in small footprints, and we both didn't have high compute requirements and were power/footprint constrained (we could get more power but cost/watt would go up because of cooling density).
The Twin^2 was nice because it amortized the cost of redundant power supplies over 4 machines, but didn't have the cost overhead of big backplanes or fancy layouts to get a lot of CPU+RAM in a small physical space.
Once populated a 4 node chassis was around $750/node including CPU and RAM and 2x SATA drives, it was within $100 of the price of a similar 1U server. We had around 10 cabinets in a data center when I left the company. It was, IMHO, a pretty good deal to get a dedicated box with 24x7 monitoring and sysadmin services including updates and backups at $150/mo.
Aside: we were also one of the first VM providers, I see on the Wayback Machine we were offering it in Feb 2005, predating Digital Ocean by at least 6 years. I've regretted not marketing and selling that service much more widely. It was a side project and we had a lot of irons in the fire at the time, so we didn't focus on it very much.
It was implemented with User Mode Linux, a Linux kernel ported to run under Linux instead of ported to a bare machine. A crazy idea, but it worked REALLY well. I remember finishing up the sign-up and billing software on the plane on the way to US PyCon where we announced the service, though I don't remember the year.
> Basically the scaling is against you: for a highly manageable bladey rack unit, you've got to be small enough that one server matters, large enough you need the swap out to be low labor, but not so large you could just wait for the rack to go down. And this has to be worth enough to justify the price premium and vendor lock in (because at rack scale you just buy a rack of the cheapest whatever from any vendor and make them compete on price - at one job bringing our computer management in house triggered an immediate 10% price drop because we threatened HP with using another supplier at all).
Yep! That perfectly describes the few remaining people I know of that operate the things... and they're (slowly) seeing the light.
Oxide does get a bit of a pass on the vendor lock-in, though. I think you're buying from them _because_ they are the only vendor that has the security model and level of integration.
I work in HPC and at some point we had a dozen or so racks with blade systems in our cluster. IIRC it was HP c7000 blade enclosures, 16 nodes in a 10U chassis. We had 4 such chassis in each rack. So reasonably dense, and there was a bit less cabling compared to individual servers.
OTOH, much of the cost saving of less cabling was eaten up by the vendor charging higher prices for equipment like HCAs or switches compatible with the blade enclosure. And unless you went for a fully non-blocking IB fabric there were a bunch of unused IB switch ports.
Also, while the blade enclosure had this fancy web GUI for management, at scale we had built our OOB management automation around IPMI anyway, so this wasn't a feature worth much for us. If anything it was a bit of a chore, as in the cases when we needed to do something which IPMI wasn't capable of, there was an extra step of figuring out the node->chassis mapping to know which chassis to connect to, and then figuring out which blade in the chassis corresponded to the node in question.
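To make that extra lookup step concrete, here is a minimal sketch in Python. It assumes nodes were numbered sequentially, chassis by chassis, with 16 bays per enclosure as in the c7000 setup described above. The `chassisNN-mgmt` hostnames and the `locate` helper are hypothetical names for illustration, not anything from a real site or from HP's tooling.

```python
from typing import NamedTuple

NODES_PER_CHASSIS = 16  # 16 blades per enclosure, as described above


class BladeLocation(NamedTuple):
    chassis: str  # hypothetical hostname of the chassis management module
    bay: int      # 1-based bay number inside that chassis


def locate(node_index: int) -> BladeLocation:
    """Map a 0-based cluster node index to its chassis and bay.

    Assumes sequential numbering: nodes 0-15 sit in chassis 1,
    nodes 16-31 in chassis 2, and so on.
    """
    if node_index < 0:
        raise ValueError("node_index must be non-negative")
    chassis_no, bay_offset = divmod(node_index, NODES_PER_CHASSIS)
    return BladeLocation(f"chassis{chassis_no + 1:02d}-mgmt", bay_offset + 1)


if __name__ == "__main__":
    loc = locate(17)
    print(f"node 17 -> connect to {loc.chassis}, bay {loc.bay}")
```

In practice the mapping is rarely this regular, so a lookup table exported from the inventory system would probably replace the `divmod` arithmetic.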
For the next generation we got these "twin" systems manufacturers had started coming out with, with 4 nodes in a 2U chassis. A bit more cabling than the blade systems, but in the end it was somewhat cheaper.
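As a rough back-of-the-envelope check on the density figures quoted above (16 nodes per 10U blade enclosure, 4 nodes per 2U twin chassis), the arithmetic can be sketched in Python. The 40U comparison window is an assumption taken from the four 10U enclosures per rack mentioned earlier; the sketch ignores space for switches and any power or cooling limits.

```python
from fractions import Fraction


def nodes_per_u(nodes: int, height_u: int) -> Fraction:
    """Compute density as an exact fraction of nodes per rack unit."""
    return Fraction(nodes, height_u)


def nodes_in(space_u: int, density: Fraction) -> int:
    """Whole nodes that fit in space_u rack units at the given density."""
    return int(density * space_u)


BLADE = nodes_per_u(16, 10)  # blade enclosure: 16 nodes in 10U
TWIN = nodes_per_u(4, 2)     # twin chassis: 4 nodes in 2U
RACK_SPACE_U = 40            # assumption: the space of 4 x 10U enclosures

if __name__ == "__main__":
    print(f"blade: {float(BLADE)} nodes/U, {nodes_in(RACK_SPACE_U, BLADE)} nodes in {RACK_SPACE_U}U")
    print(f"twin:  {float(TWIN)} nodes/U, {nodes_in(RACK_SPACE_U, TWIN)} nodes in {RACK_SPACE_U}U")
```

On these numbers alone the twin systems come out slightly denser per rack unit (2 nodes/U against 1.6), so the move away from blades did not have to cost any density.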
> Blades basically died out is the thing - AFAIK no one really wanted them and honestly the same is a risk for what Oxide is doing too.
To me, from someone that has worked for orgs that either would have been or are customers of Oxide - You need to be thinking more about the complete package. You are thinking about a tiny piece.
Meanwhile a Hetzner rack: https://i.redd.it/u3o410rwdt6h1.jpeg
Blade servers have been doing this for 25+ years.
The vendors did make blades in the 2000s.
I thought surely this isn't just blade servers, that those compute shelves were full of GPUs or something novel, but no, just blades reincarnated. I used to support HP's baby version of this, the c3000.
Also, a big cabinet into which you plug varying amounts of hardware capacity, then use the control plane to partition into various virtual resources, describes at least at the conceptual level IBM going back decades.
> Somehow hardware giants like Dell, HP, SuperMicro, etc didn't make a product like this, even at their peak in 2000s or during cloud boom in 2010s. I wonder why?
They all did. HP had Superdome and blades and Synergy. Dell had similar.
I learned about blade servers back in ~2010 because Blizzard used to run World of Warcraft realms on them and auctioned them off for charity.
https://warcraft.wiki.gg/wiki/Server_blade
https://en.wikipedia.org/wiki/Blade_server