The solution is simple: budget caps.

Yes and no. 100% accurate billing is not available in real time, so it's entirely possible that you have already exceeded your cap by the time the overage is detected.

Having said that, within AWS there are the concepts of "budget" and "budget action" whereby you can modify an IAM role to deny costly actions. When I was doing AWS consulting, I had a customer who was concerned about Bedrock costs, and it was trivial to set this up with Terraform. The biggest PITA is that it takes something like 48-72 hours for all the prerequisites to be available (cost data, cost allocation tags, and an actual budget can each take 24 hours).
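For what it's worth, the "deny costly actions" part could look roughly like the sketch below. This is only a guess at the shape, not the actual setup from that engagement: `build_deny_policy` and the statement ID are made-up names, and the two Bedrock actions are just the ones I'd assume you'd want to block. The budget action would then attach a policy like this to the role once the threshold is crossed.

```python
import json

# Assumed set of costly actions to block; adjust for the service you care about.
COSTLY_ACTIONS = [
    "bedrock:InvokeModel",
    "bedrock:InvokeModelWithResponseStream",
]


def build_deny_policy(actions=COSTLY_ACTIONS):
    """Return an IAM policy document (as a dict) that denies the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyCostlyActionsOverBudget",  # hypothetical statement ID
                "Effect": "Deny",
                "Action": list(actions),  # copy so callers can't mutate the default
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON you would paste into (or generate for) the managed policy.
    print(json.dumps(build_deny_policy(), indent=2))
```

An explicit Deny wins over any Allow on the same role, which is why attaching it works as a kill switch without touching the existing permissions.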

The circuit breaker doesn’t need to be 100% accurate. The detection just needs to be quick enough that the excess operating cost incurred by the delay is negligible for Amazon. That shouldn’t really be rocket science.
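To put a rough shape on that: if spend is roughly constant, the overshoot is just the spend rate times the detection delay. A minimal sketch, where `worst_case_overshoot` and `SpendBreaker` are hypothetical names of mine and the numbers are made up, not anything AWS exposes:

```python
def worst_case_overshoot(spend_per_hour, detection_delay_hours):
    """Upper bound on spend past the cap, assuming a constant spend rate."""
    return spend_per_hour * detection_delay_hours


class SpendBreaker:
    """Trips once observed spend reaches the cap and stays tripped."""

    def __init__(self, cap):
        self.cap = cap
        self.tripped = False

    def observe(self, spend_so_far):
        # Called whenever (possibly delayed) billing data arrives.
        if spend_so_far >= self.cap:
            self.tripped = True
        return self.tripped

    def allow_request(self):
        return not self.tripped


if __name__ == "__main__":
    # Made-up numbers: $50/hour of runaway spend, billing data 6 hours stale.
    print(worst_case_overshoot(50, 6))  # 300
```

With a $1k cap and those made-up numbers, the customer would see at most about $1.3k, which is a very different story from a five-figure surprise.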

We're talking about a $2.5T company. Literally every example in this thread is already negligible to Amazon, even without circuit breakers.

Implementing that functionality across AWS would cost orders of magnitude more than simply refunding random $100k charges.

The point is that by not implementing such configurable caps, they are not being customer friendly, and the argument that it couldn’t be made 100% accurate is just a very poor excuse.

Sure, not providing that customer-friendly feature earns them higher profits, but that’s exactly the criticism.

They also refuse refunds, because it is profitable, even if the customer is unhappy to pay.

If it were highly profitable for them to implement some form of budget cap cutoffs, they would! It's obvious it's not a game they are interested in.

What about 90% accurate?

Is it simple? So what happens when you hit the cap, does AWS delete the resources that are incurring the cost and destroy your app?

Imagine the horror stories on Hacker News that would generate.

Stop accepting requests like has been the case since the beginning of time?

Or simply return a 503? Why would you go directly to destroying things?

Suppose you’re going over the billing cap based on your storage consumption, how would AWS stop the continued consumption without deleting storage?

Why would they need to delete storage? They could just stop accepting writes past the cap.

Storage billing is partly time-based.

EBS is billed by the second (with a one minute minimum, I think).

Once a customer hits their billing cap, either AWS has to give away that storage, have the bill continue to increase, or destroy user data.
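To make the time-based part concrete, here is a back-of-the-envelope sketch. `holding_cost` is a made-up helper, and the price and volume size are illustrative placeholders, not actual AWS rates:

```python
def holding_cost(stored_gb, price_per_gb_month, days, days_in_month=30):
    """Cost of merely keeping data around for `days`, with no new writes.

    Assumes a flat per-GB-month price prorated by day; real pricing has
    more dimensions (IOPS, throughput, tiers) that are ignored here.
    """
    return stored_gb * price_per_gb_month * days / days_in_month


if __name__ == "__main__":
    # Placeholder numbers: 500 GB at $0.10 per GB-month, held for 3 days.
    print(round(holding_cost(500, 0.10, 3), 2))
```

With those placeholder numbers it comes to about $5 for three days, so the bill does keep growing after the cap is hit even if every new write is refused.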

I think most of the "horror stories" aren't related to cases like this. So we can at least agree that most such stories could be easily avoided, before we look at solutions to these more nuanced problems. One such solution would be clearly communicating how the limit works and what the daily cost of keeping the maxed-out storage would be; for a free account, the settings could be adjusted so these "costs" stay within the free quota.

Not everything on AWS is a web app.

Close the TCP session? Don't send back the UDP response? Stop scheduling time on the satellite transceiver for that account?

Interesting that you mention UDP, because I'm in the process of adding hard limits to my service that handles UDP. It's not trivial, but it is possible, and while I'm unsympathetic to folks casting shade on AWS for not having it, I decided a while back it was worth adding to my service. My market is experimenters and early-stage projects, though, which is different from AWS (most revenue from huge users), so I can see why they are more on the "buyer beware" side.

Everything on AWS can deny a request, no matter what the API happens to be.

While I can imagine having budget overload from storage, most (all?) of the "horrors" on the page are from compute or access.

Set it up so that machines are deleted, but EBS volumes remain. S3 bucket is locked-out but data is safe.

I mean, would you rather have a $10k bill or have your server forcefully shut down after you hit $1k in three days?

Which of those matters more depends on the type of business. In some situations, any downtime at all costs thousands per hour. In others, the service staying online is only worth hundreds of dollars a week.

So yes, the solution is as simple as giving the user hard spend caps that they can configure. I'd also set the default limits low for new accounts with a giant, obnoxious, flashing red popover that you cannot dismiss until you configure your limits.

However, this would generate less profit for Amazon et al. They have certainly run this calculation and decided they'd earn more money from careless businesses than they'd gain in goodwill. And we all know that goodwill has zero value to companies at FAANG scale. There's absolutely no chance that they haven't considered this. It's partially implemented and an incredibly obvious solution that everyone has been begging for since cloud computing became a thing. The only reason they haven't implemented this is purely greed and malice.

If you want hard caps, you can already do it. It’s not a checkbox in the UX, but the capability is there.

> Is it simple? So what happens when you hit the cap, does AWS delete the resources that are incurring the cost and destroy your app?

Sounds like you're saying "there aren't caps because it's hard".

> If you want hard caps, you can already do it. ... the capability is there.

What technique are you thinking of?

There are several satisfactory solutions available. Every other solution they offer was made with tradeoffs and ambiguous requirements they had to make a call on. It is obviously misaligned incentive rather than an impossibility. If they could make more money from it, they would be offering something. Product offering gaps are not merely technical impossibilities.

Yes, that’s exactly the expected behavior. It can alert if it’s close to the threshold. Very straightforward from my point of view.

Surely that's the fault of the purchaser setting the cap too low.

Maybe rather than completely stopping the service, it'd be better to rate limit the service when approaching/reaching the cap.
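Something like the following is what I have in mind, as a rough sketch only. `allowed_rate` and its parameters are hypothetical, and the linear ramp starting at 80% of the cap is an arbitrary choice of mine:

```python
def allowed_rate(spend, cap, full_rate, soft_fraction=0.8):
    """Requests/sec to allow given current spend.

    Full rate until `soft_fraction` of the cap is used, then a linear
    ramp down to zero at the cap. Assumes 0 <= soft_fraction < 1.
    """
    if cap <= 0:
        raise ValueError("cap must be positive")
    used = spend / cap
    if used <= soft_fraction:
        return full_rate
    if used >= 1.0:
        return 0.0
    return full_rate * (1.0 - used) / (1.0 - soft_fraction)


if __name__ == "__main__":
    # Made-up numbers: $100 cap, 10 requests/sec when unthrottled.
    for spend in (50, 90, 100):
        print(spend, allowed_rate(spend, cap=100, full_rate=10))
```

The service degrades as the cap approaches instead of falling off a cliff, and the spend curve flattens out so the cap is reached later, if at all.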

Using that logic, isn’t it the fault of the user to set up an app without rate limiting?

It's misleading to promote a free tier that can then incur huge charges without being able to specify a charge cap.

If it can incur any charges at all then it isn't free.