>I see so many people push AWS setups not because it's the best thing - it can be if you're not cost sensitive - but because it is what they know and they push what they know instead of evaluating the actual requirements.
I kinda feel like this argument could be used against programming in essentially any language. Your company, or you yourself, likely chose to develop using (whatever language it is) because that's what you knew and what your developers knew. Maybe it would have been some percentage more efficient to use another language, but then you and everyone else would have to learn it.
It's the same with cloud vs bare metal, though at least in the cloud, if you're using the right services, if someone asked you tomorrow to scale 100x you likely could during the workday.
And generally speaking, if your problem is at a scale where bare metal is trivial to implement, it's likely we're only talking about a few hundred dollars a month being 'wasted' in AWS. Which is nothing to most companies, especially when they'd have to consider developer/devops time.
> if someone asked you tomorrow to scale 100x you likely could during the workday.
I've never seen a cloud setup where that was true.
For starters: most cloud providers will impose limits on you that often mean going 100x would involve pleading with account managers to have limits lifted and/or scrounging up a new, previously untested combination of instance sizes.
But secondly, you'll tend to run into unknown bottlenecks long before that.
And so, in fact, if that's something you actually want to be able to do, you need to test it.
But it's also generally not a real problem. I more often come across the opposite: customers who've been hit with a crazy bill because of a problem rather than real use.
But it's also easy enough to set up a hybrid arrangement that spins up cloud instances if/when you genuinely need to scale faster than you can provision new bare metal. You'll typically run an orchestrator and run everything in containers on the bare metal setup anyway, so it usually only requires an auto-scaling group scaled down to 0 that you warm up when load nears a critical level on your bare metal environment, and then flipping a switch in your load balancer to start directing traffic there. It's not a complicated thing to do.
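As a minimal sketch of what that can look like, assuming boto3, a hypothetical auto-scaling group, and a weighted Route 53 record as the "switch" (every name here is made up; your orchestrator and load balancer layer will differ):

```python
# Sketch of the hybrid "burst into cloud" switch, assuming boto3.
# All names (ASG, zone, hostnames) are hypothetical placeholders.
import boto3

asg = boto3.client("autoscaling")
r53 = boto3.client("route53")

BURST_ASG = "burst-workers"      # cloud auto-scaling group, normally at 0
ZONE_ID = "Z123456EXAMPLE"       # hypothetical Route 53 hosted zone

def warm_up_burst_capacity(instances: int) -> None:
    # Scale the normally-empty cloud ASG up so instances are booted and
    # registered before any traffic is sent their way.
    asg.set_desired_capacity(
        AutoScalingGroupName=BURST_ASG,
        DesiredCapacity=instances,
        HonorCooldown=False,
    )

def shift_traffic_to_cloud(weight: int) -> None:
    # The "switch": give the cloud endpoint a non-zero weight in a
    # weighted DNS record, so a share of traffic (proportional to this
    # weight vs the bare metal record's weight) goes to the cloud LB.
    r53.change_resource_record_sets(
        HostedZoneId=ZONE_ID,
        ChangeBatch={"Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "app.example.com",
                "Type": "CNAME",
                "SetIdentifier": "cloud-burst",
                "Weight": weight,
                "TTL": 60,
                "ResourceRecords": [{"Value": "burst-lb.example.com"}],
            },
        }]},
    )

def maybe_burst(bare_metal_load: float) -> None:
    # "Warm it up when load nears critical": 0.8 is an arbitrary threshold.
    if bare_metal_load > 0.8:
        warm_up_burst_capacity(instances=10)
        shift_traffic_to_cloud(weight=20)
```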
Now, incidentally, your bare metal setup is even cheaper, because you can get away with a higher load factor when you can scale into the cloud to take spikes.
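To put rough, purely illustrative numbers on that: if cloud burst capacity means you can run bare metal at, say, 70% utilisation instead of the 40% you'd need to absorb spikes on your own, the hardware difference is large.

```python
# Illustrative only; both utilisation figures are assumptions.
peak = 100.0             # arbitrary units of peak load
standalone_util = 0.40   # headroom needed if bare metal absorbs spikes alone
hybrid_util = 0.70       # acceptable when cloud burst capacity takes spikes

capacity_standalone = peak / standalone_util   # 250 units provisioned
capacity_hybrid = peak / hybrid_util           # ~143 units provisioned
print(f"{1 - capacity_hybrid / capacity_standalone:.0%} less hardware")  # ~43%
```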
> And generally speaking, if your problem is at a scale where bare metal is trivial to implement, it's likely we're only talking about a few hundred dollars a month being 'wasted' in AWS. Which is nothing to most companies, especially when they'd have to consider developer/devops time.
Generally speaking, the systems I work on run from tens of thousands per month and up, and what I consistently see with my customers is that the higher the cost, the bigger the bare-metal advantage tends to be, since larger spend lets you readily amortise the initial setup costs of more streamlined/advanced setups. The few places where cloud wins on cost are the very smallest systems, typically <$5k/month.
> if you're using the right services, if someone asked you tomorrow to scale 100x you likely could during the workday.
"The right services" is I think doing a lot of work here. Which services specifically are you thinking of?
- S3? sure, 100x, 1000x, whatever, it doesn't care about your scale at all (your bill is another matter).
- Lambdas? On their own, sure, you can scale arbitrarily, but they don't really do anything unless they're connected to other stuff both upstream and downstream. Can those services manage 100x the load?
- Managed K8s? Managed DBs? EC2 instances? Really anything where you need to think about networking? Nope, you are not scaling this 100x without a LOT of planning and prep work.
> Nope, you are not scaling this 100x without a LOT of planning and prep work.
You're not getting a 100x increase in instances without justifying it to your account manager anyway, long before you figure out how to get it to work.
EC2 has limits on the number of instances you can request, and it certainly won't let you 100x unless you've done it before and already gone through the hassle of getting them to raise your limits.
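For a concrete sense of where the ceiling sits, you can read (and ask to raise) the relevant quota through the Service Quotas API. A sketch, assuming boto3; L-1216C47A is, to my knowledge, the quota code for the "Running On-Demand Standard instances" vCPU limit, but verify it against your own account:

```python
import boto3

sq = boto3.client("service-quotas", region_name="us-east-1")

# Current on-demand standard-instance vCPU ceiling for this account/region.
quota = sq.get_service_quota(ServiceCode="ec2", QuotaCode="L-1216C47A")
current = quota["Quota"]["Value"]
print(f"on-demand standard vCPU limit: {current}")

# Requesting 100x. A request this large will typically be routed to a
# human for review; nothing about the API call makes capacity appear today.
sq.request_service_quota_increase(
    ServiceCode="ec2",
    QuotaCode="L-1216C47A",
    DesiredValue=current * 100,
)
```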
On top of that, it is not unusual to hit availability issues with less common instance types. Been there, done that, had to provision several different instance types to get enough.
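The workaround is unglamorous: try your preferred type and fall back down a list when EC2 reports it's out of capacity. Roughly, assuming boto3 and a made-up AMI and type list:

```python
# Falling back across instance types on capacity shortages.
# The AMI ID and the type list are hypothetical.
import boto3
from botocore.exceptions import ClientError

ec2 = boto3.client("ec2", region_name="us-east-1")

# Ordered by preference; in practice you'd pick types with similar
# CPU/memory so the workload doesn't care which one it lands on.
FALLBACK_TYPES = ["m5.2xlarge", "m5a.2xlarge", "m6i.2xlarge", "m5n.2xlarge"]

def launch_with_fallback(count: int, ami: str = "ami-0123456789abcdef0"):
    for itype in FALLBACK_TYPES:
        try:
            return ec2.run_instances(
                ImageId=ami,
                InstanceType=itype,
                MinCount=count,
                MaxCount=count,
            )
        except ClientError as e:
            if e.response["Error"]["Code"] != "InsufficientInstanceCapacity":
                raise  # some other failure; don't mask it
            # This type is sold out in the AZ right now; try the next one.
    raise RuntimeError("no capacity in any fallback instance type")
```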
I hit it quite frequently with a particularly popular EKS node instance type in us-east-1 (of course). I'm talking about requesting like 5-6 instances, nothing crazy. Honestly, I wonder if ECS or Fargate have the same issue.
So, I was around back then and am around now as a principal, and this comment doesn't really pass the reality sniff test.
It's a lot worse than this in terms of AWS cost for apps that barely anyone uses. They're often incorrectly provisioned, and the AWS bill ends up in the hundreds of thousands or millions when it could have been a few thousand on bare metal at Hetzner with a competent sysadmin team. No, it's not harder to administer bare metal. No, it's not less reliable. No, it's not substantially harder for most companies to scale on bare metal (large Fortune 50 firms excluded).
I've been selling a cost-reduction service for a while, and the hardest aspect of selling it is that so many people on the tech side don't care, because they don't seem to be held accountable for the drain they cause.
I can go in and guarantee that my fees are capped at a few months' worth of their savings, and still it's a hard sell with a lot of teams who are perfectly happy to keep burning cash.
And I'll note, as much as I love to get people off AWS, most of the time people can massively reduce their bill just by using AWS properly, so even if bare metal were wrong for their specific circumstances, they're still figuratively setting fire to piles of cash.