I wonder if there’s enough room out there for a Do Well By Doing Good company to provide a ladder from cheap self-managed up to fully automated rolling upgrades.

Because self-managing was mostly fine for us at first, but later we had some close calls when changes needed to be made on the servers. By the time we managed to mess up our hand-managed incremental restart process, we had several layers of cache, so accidentally wiping one didn’t murder our backend, but it did throw enough alerts to cause a P2. And because we were doing manual bucketing of caches instead of consistent hashing, we hit the OOM killer a couple of times while dialing in.
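
For anyone who hasn't run into it, here's roughly what consistent hashing buys you over hand-assigned buckets. This is a minimal sketch, not what we ran: the `HashRing` class, the `cache-a`..`cache-e` node names, and the choice of 100 virtual nodes per server are all made up for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # First 8 bytes of MD5 as an integer; used only for placement, not security.
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        # Each server gets `vnodes` points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._ring]

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first ring point after the key, wrapping around.
        idx = bisect.bisect_right(self._hashes, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


def moved_fraction(before: HashRing, after: HashRing, keys) -> float:
    # Share of keys whose owning server changes between two ring layouts.
    moved = sum(1 for k in keys if before.node_for(k) != after.node_for(k))
    return moved / len(keys)


if __name__ == "__main__":
    keys = [f"key-{i}" for i in range(2000)]
    old = HashRing(["cache-a", "cache-b", "cache-c", "cache-d"])
    new = HashRing(["cache-a", "cache-b", "cache-c", "cache-d", "cache-e"])
    print(f"moved after adding one server: {moved_fraction(old, new, keys):.1%}")
```

The point is that adding a server only pulls keys onto the new server; nothing gets shuffled between the existing ones, so you don't have to guess at bucket sizes by hand every time the fleet changes.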

But by that point it was difficult to move back to managed.

This feels closest to DigitalOcean’s business model.