It's about scalability. If you have 100 instances you really want them to share the cache so you increase the hit rate and keep egress costs low.

> If you have 100 instances you really want them to share the cache

I think that assumes decoupled compute and storage. If I instead couple compute and storage, I can shard the input so that each instance caches only its own shard, and then no cache needs to be shared across instances. I don't think there is one approach that wins every time.
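
To make the sharded alternative concrete, here is a rough sketch, not a real system. It assumes requests can be routed by key; `Instance`, `shard_for`, and `fetch_from_origin` are hypothetical names made up for illustration. Each key always lands on the same instance, so each object is fetched from the origin once even though no cache is shared.

```python
import hashlib


class Instance:
    """One node with coupled compute and storage: its cache is private."""

    def __init__(self):
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, key, fetch):
        if key in self.cache:
            self.hits += 1
        else:
            self.misses += 1
            self.cache[key] = fetch(key)  # only a miss touches the origin
        return self.cache[key]


def shard_for(key, n_instances):
    """Map a key to an instance index with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_instances


def fetch_from_origin(key):
    """Hypothetical stand-in for a slow or billed read from backing storage."""
    return f"value-for-{key}"


def run(keys, n_instances, fetch):
    """Route every request to the instance that owns its shard."""
    instances = [Instance() for _ in range(n_instances)]
    for key in keys:
        instances[shard_for(key, n_instances)].get(key, fetch)
    return instances


# 1000 requests over 50 distinct objects, spread across 100 instances.
requests = [f"object-{i % 50}" for i in range(1000)]
instances = run(requests, 100, fetch_from_origin)
total_misses = sum(inst.misses for inst in instances)
total_hits = sum(inst.hits for inst in instances)
print(total_misses, total_hits)  # 50 950
```

The miss count equals the number of distinct objects, which is what a single shared cache would give for the same requests. What the sketch leaves out is the cost side of this design: skewed keys can overload one instance, and changing the instance count with plain modulo routing remaps most keys.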

As for egress fees, that is an orthogonal concern.