> The bidding model is elegant, but it’s insufficient to route network requests. To allow an HTTP request in Tokyo to find the nearest instance in Sydney, we really do need some kind of global map of every app we host.

So is this a case of wanting to deliver a differentiating feature before the technical maturity is there and validated? That's an acceptable strategy if you are building a lesser product, but if you are selling Public Cloud, maybe having a better strategy than waiting for problems to crop up makes more sense? Consul, missing watchdogs, certificate expiry, CRDT backfilling of nullable columns - sure, in a normal case these are not very unexpected or to-be-ashamed-of problems, but for a product that claims to be Public Cloud you want to think of these things and address them before day 1. Cert expiry, for example - you should be giving your users tools to never have a cert expire, not fixing it for your own stuff after the fact! (Most CAs offer an API to automate all this - no excuse for it.)

I don't mean to be dismissive or disrespectful; the problem is challenging and the work is great. I'm merely thinking of the loss of customer trust: people are never going to trust a newcomer that has issues like this, and for that reason "move fast, break things, and fix when you find" isn't a good fit for this kind of product.

It's not a "differentiating feature"; it eliminated a scaling bottleneck. It's also a decision that long predates Corrosion.

I was referring to the "HTTP request in Tokyo to find the nearest instance in Sydney" part, which felt to me like a differentiating feature: no other cloud provider seems to have bidding or HTTP-request-level cross-regional lookup or whatever.

The "decision that long predates Corrosion" is precisely the point I was trying to make - was it made too soon, before understanding the ramifications and/or having a validated technical solution ready? IOW, maybe the feature that required this solution could have come later? (I don't know much about fly.io and its features, so apologies if some of this is unclear or wrongly assumes things.)

That's literally the premise of the service and always has been.

fwiw, I'm happily running a company and some contract work on fly, literally as "aws, but what if it weren't the most massively complex pile of shit you've ever seen."

I have a couple of reasonably sized, understandable toml files and another 100 lines of ruby that run long-running rake tasks as individual fly machines. The whole thing works really nicely.