Sounds like DynamoDB is going to continue to be a hard dependency for EC2, etc. I at least appreciate the transparency and hearing about their internal systems names.

I think it's time for AWS to pull the curtain back a bit and release a JSON document that shows a list of all internal service dependencies for each AWS service.
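
A hypothetical shape such a document could take, and what you could do with it (the service names, structure, and dependencies below are invented for illustration, not real AWS internals):

```python
import json

# Invented example of what a public dependency manifest might look like.
manifest = json.loads("""
{
  "EC2":      {"depends_on": ["DynamoDB", "IAM"]},
  "Lambda":   {"depends_on": ["DynamoDB", "IAM", "S3"]},
  "DynamoDB": {"depends_on": ["IAM"]}
}
""")

# Find dependencies shared by two services -- the ones that defeat
# any plan to use those two services as redundant alternatives.
shared = set(manifest["EC2"]["depends_on"]) & set(manifest["Lambda"]["depends_on"])
print(sorted(shared))  # ['DynamoDB', 'IAM']
```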

Would it matter? Would you base decisions on whether or not to use one of their products based on the dependency graph?

It would let you know that if services A and B both depend on service C, you can't combine A and B to gain reliability.
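
A back-of-the-envelope sketch of why the shared dependency caps you (the availability numbers are made up for illustration):

```python
# Hypothetical availabilities -- illustrative numbers, not real AWS figures.
a = 0.999  # service A on its own
b = 0.999  # service B on its own
c = 0.999  # shared dependency C

# If A and B failed independently, running both for redundancy would give:
independent = 1 - (1 - a) * (1 - b)

# But if both need C to function at all, the redundant pair is capped by C:
shared = c * (1 - (1 - a) * (1 - b))

print(independent)  # ~0.999999 -- six nines from two three-nines services
print(shared)       # ~0.998999 -- worse than C alone
```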

Yes.

If so, I hate to tell you this, but you would not use AWS (or any other cloud provider)!

I don’t use AWS or any other cloud provider; I’ve used bare metal since 2012. See, in 2012 (IIRC), one fateful day, we turned off our bare metal machines and went full AWS. That afternoon, AWS had its first major outage. Prior to that day, the owner could walk in and ask what we were doing about an outage. That day, all we could do was twiddle our thumbs or turn on a now-outdated database replica. Surely AWS won’t be out for hours, right? Right? With bare metal, you might be out for hours, but you can quickly get back to a degraded state, no matter what happens. With AWS, you’re stuck with whatever they happen to fix first.

Meanwhile, I've had bare metal be completely down for over a day because a backhoe decided it wanted to eat the fiber line into our building. All I could do was twiddle my thumbs, because we were stuck waiting on another company to fix it.

Could we have had an offsite location to fail over to? From a technical perspective, sure. Same as you could go multi-region or multi-cloud, or turn on some servers at Hetzner or whatever. There's nothing better or worse about the cloud here - you always have the ability to design with resilience for whatever happens, short of the internet as a whole somehow breaking.

I worked for AWS for two years and if I recall correctly, one of the issues was circular dependencies.

A lot of internal AWS services have names that are completely opaque to outside users. Such a document would be pretty useless as a result.

+1, SREs can spend months of their onboarding basically reading design docs and getting to know the services in their vicinity.

Short of publicly releasing all internal documentation, there's not much that can make the AWS infrastructure reasonably clear to an outsider. Reading and understanding all of this also would be rather futile without actual access to source code and observability.

They should at least split off dedicated isolated instances of DynamoDB to reduce blast radius. I would want at least 2 instances for every internal AWS service that uses it.
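
One way to sketch that kind of cell isolation (the cell names and the hashing scheme here are invented for illustration, not how AWS actually partitions anything):

```python
import hashlib

# Hypothetical isolated DynamoDB cells -- at least two, per the suggestion above.
CELLS = ["dynamodb-cell-0", "dynamodb-cell-1"]

def cell_for(service: str) -> str:
    """Deterministically pin an internal service to one cell, so that an
    outage in one cell only takes down the services mapped to it."""
    digest = hashlib.sha256(service.encode()).hexdigest()
    return CELLS[int(digest, 16) % len(CELLS)]

print(cell_for("EC2"))  # always routes a given service to the same cell
```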

I mean, something has to be the baseline data storage layer. I’m more comfortable with it being DynamoDB than something else that isn’t pushed as hard by as many different customers.

The actual storage layer of DynamoDB is well engineered, and parts of it have been formally specified and model-checked (AWS has published work on using TLA+ for this).