>That's why I prefer Terraform's approach of having a "plan" mode. It doesn't just tell you what it would do but does so in the form of a plan it can later execute programmatically. Then, if any of the assumptions made during planning have changed, it can abort and roll back.

Not to take anything away from your comment but just to add a related story... the previous big AWS outage had an unforeseen race condition between their DNS planner vs DNS executor:

>[...] Right before this event started, one DNS Enactor experienced unusually high delays needing to retry its update on several of the DNS endpoints. As it was slowly working through the endpoints, several other things were also happening. First, the DNS Planner continued to run and produced many newer generations of plans. Second, one of the other DNS Enactors then began applying one of the newer plans and rapidly progressed through all of the endpoints. The timing of these events triggered the latent race condition. When the second Enactor (applying the newest plan) completed its endpoint updates, it then invoked the plan clean-up process, which identifies plans that are significantly older than the one it just applied and deletes them. At the same time that this clean-up process was invoked, the first Enactor (which had been unusually delayed) applied its much older plan to the regional DDB endpoint, overwriting the newer plan. The check that was made at the start of the plan application process, which ensures that the plan is newer than the previously applied plan, was stale by this time due to the unusually high delays in Enactor processing. [...]

previous HN thread: https://news.ycombinator.com/item?id=45677139

Overkill I’m sure for many things but I’m curious as to whether there’s a TLA kind of solution for this sort of thing. It feels like it could although it depends how well modelled things are (also aware this is a 30s thought and lots of better qualified people work on this full time).

Or something like https://antithesis.com/