> The interesting problem is noticing that and then informing every proxy in the world.
Yes, and that is why I suggested that your any-to-any relationship of proxy to application is a decision you made, and it is part of the pain point that led you to this solution. The fact that any proxy box can proxy to any backend is a choice that created the structure and mental model you are working within. You could batch your proxies into, say, 1024 cells and then assign each customer app to, say, 4 of those 1024 cells using shuffle sharding. That decomposes the problem into maintaining state within a cell instead of globally.
I'm not saying what you did was wrong or dumb. I am saying you are working within a framework that maybe you are not even consciously aware of.
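To make the shuffle sharding idea concrete, here is a minimal sketch of the assignment step. The 1024 and 4 come from the numbers above; the function name `cells_for_app`, the app IDs, and the choice of a SHA-256-seeded PRNG are my own assumptions for illustration, not anything Fly.io or a particular cloud actually runs.

```python
import hashlib
import random

NUM_CELLS = 1024  # number of proxy cells (figure from the comment above)
SHARD_SIZE = 4    # cells assigned to each app (also from the comment above)


def cells_for_app(app_id: str,
                  num_cells: int = NUM_CELLS,
                  shard_size: int = SHARD_SIZE) -> list[int]:
    """Pick a stable pseudo-random subset of cells for an app.

    Hypothetical helper: the app ID is hashed into a seed so that every
    caller computes the same subset without any shared state.
    """
    digest = hashlib.sha256(app_id.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    rng = random.Random(seed)
    # sample() draws without replacement, so the cells are distinct.
    return sorted(rng.sample(range(num_cells), shard_size))


if __name__ == "__main__":
    for app in ("app-chicago", "app-sydney"):  # made-up app IDs
        print(app, cells_for_app(app))
```

Because the subset is derived from the app ID alone, routing state for an app only has to be kept current inside its own cells, and two different apps will usually share few or none of them.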
Again: it's the premise of the platform. If you're saying "you picked a hard problem to work on", I guess I agree.
We cannot, in fact, assign our customers' apps to 0.3% of our proxies! When you deploy an app in Chicago on Fly.io, it has to work from a Sydney edge. I mean, that's part of the DX; there are deeper reasons why it would have to work that way (due to BGP4), but we don't even get there before becoming a different platform.
I think the impedance mismatch here is that I am assuming we are talking about a hyperscaler cloud, where it would be reasonable to have, say, 1024 proxies per region. Each app would be assigned to 4 of the 1024 proxies in each region.
I have no idea how big Fly.io's compute footprint is, and maybe because of that the design I am suggesting makes no sense for you.
The design you are suggesting makes no sense for us. That's OK! It's an interesting conversation. But no, you can't fix the problem we're trying to solve with shuffle sharding.