I think the impedance mismatch here is I am assuming we are talking about a hyperscaler cloud where it would be reasonable to have say 1024 proxies per region. Each app would be assigned to 4/1024 proxies in each region.
I have no idea how big of a compute footprint fly.io is, and maybe due to that the design I am suggesting makes no sense for you.
The design you are suggesting makes no sense for us. That's OK! It's an interesting conversation. But no, you can't fix the problem we're trying to solve with shuffle shard.