Frankly, any web app I develop has configurable in-memory caching built into it, so I would rather increase its size than add an extrinsic cache. Keeping the cache internal to the application also makes it easier for me to invalidate keys accurately.
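
For concreteness, a minimal sketch of the kind of in-process cache I mean; the class name and the size/TTL parameters are made up for illustration, not taken from any particular framework. The point is the last method: the same process that mutates the data can invalidate the key synchronously, in the same call path.

```python
# Minimal sketch of a configurable in-process cache (names are hypothetical).
import time
from typing import Any, Optional

class LocalCache:
    def __init__(self, max_entries: int = 10_000, ttl_seconds: float = 300.0):
        self.max_entries = max_entries
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            # Lazily drop expired entries on read.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        if len(self._store) >= self.max_entries:
            # Crude eviction: drop the oldest-inserted entry.
            self._store.pop(next(iter(self._store)))
        self._store[key] = (time.monotonic() + self.ttl, value)

    def invalidate(self, key: str) -> None:
        # Because the cache lives in the same process as the code that
        # writes the data, invalidation happens right where the write does.
        self._store.pop(key, None)
```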

It's about scalability. If you have 100 instances, you really want them to share the cache, so you increase the hit rate and keep egress costs low.
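
A back-of-the-envelope illustration of why sharing matters, under the assumption that requests are spread evenly across instances and each instance keeps its own cache (the numbers are invented):

```python
# Toy model, purely illustrative: with per-instance caches, every instance
# has to miss on a key once before it is warm, so a hot key can cost up to
# N origin fetches; with one shared cache it costs one.
instances = 100
hot_keys = 10_000

local_cache_origin_fetches = instances * hot_keys   # worst case: each instance warms independently
shared_cache_origin_fetches = hot_keys               # each key fetched from origin once

print(local_cache_origin_fetches, shared_cache_origin_fetches)
# 1000000 10000
```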

> If you have 100 instances you really want them to share the cache

I think that assumes decoupled compute and storage. If instead I couple compute and storage, I can shard the input, and then I don't need to share a cache across instances: each instance only ever sees requests for its own shard, so its local cache already covers everything it will be asked for. I don't think there is one approach that wins every time.
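
A rough sketch of what I mean by sharding the input (the instance list and the router function here are hypothetical): route each request by a hash of its key, so all traffic for that key lands on one instance, and that instance's local cache covers it completely.

```python
# Hash-based routing so each instance's local cache covers its own shard.
import hashlib

INSTANCES = [f"app-{i}.internal:8080" for i in range(100)]  # assumed topology

def route(key: str) -> str:
    # Every request for the same key goes to the same instance, so no
    # cross-instance cache sharing is needed. A real deployment would use
    # consistent hashing so that adding or removing instances only remaps
    # a fraction of the keys.
    digest = hashlib.sha256(key.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(INSTANCES)
    return INSTANCES[index]

print(route("user:12345"))  # always the same instance for this key
```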

As for egress fees, that is an orthogonal concern.