Interesting approach. The managed self-hosting gap is real. We have run into this exact pain point with Kubernetes-based deployments, where customers modify their cluster configs and things break silently. If I may ask, how does Alien handle rollback if an update fails in a customer environment? Also, is there any plan for on-prem/bare-metal support beyond the big three clouds?

Alien is basically a huge state machine where every API call that mutates the environment is a discrete step, and the full state is durably persisted after each one.

If something fails mid-update, it resumes from exactly where it stopped. You can also point a deployment to a previous release and it walks back. This catches and recovers from issues that something like Terraform would just leave in a broken state.
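
To make the resume behavior concrete, here is a minimal sketch of the general idea in Python. It is not Alien's actual implementation: the JSON state file, the step names, and the `run_deployment` helper are all hypothetical and assumed only for illustration.

```python
import json
import os

# Hypothetical sketch of a durable step runner; not Alien's real code or API.

def load_state(path):
    """Load persisted state, or start fresh if nothing has been saved yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"completed": [], "env": {}}

def save_state(path, state):
    """Persist state durably: write a temp file, fsync, then atomically swap it in."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)

def run_deployment(steps, path):
    """Run (name, action) steps in order, skipping any already recorded as done.

    Each action receives the mutable `env` dict. State is saved after every
    successful step, so a failure leaves a record of exactly where it stopped.
    """
    state = load_state(path)
    for name, action in steps:
        if name in state["completed"]:
            continue  # finished in an earlier run; do not repeat it
        action(state["env"])  # may raise; nothing is saved for a failed step
        state["completed"].append(name)
        save_state(path, state)
    return state
```

Calling `run_deployment` again with the same state file after a failure skips the steps that already succeeded and retries only from the failed one. Because a crash between an action finishing and its state being saved would repeat that one step, the actions in a design like this should be idempotent.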

For on-prem: we're working on Kubernetes as a deployment target (e.g. bare-metal OpenShift).

I think the durable state machine approach is smart. That resume-from-where-it-stopped behavior is a big deal during incident response, when you really don't want to rerun an entire deployment just because one step failed. K8s as a deployment target would be huge, especially for the on-prem enterprise crowd. Will definitely keep an eye on that.