Your AWS backup snapshots must go one-way (append-only) to a separate AWS account, to which access is extremely limited and never has any automated tools connecting to with anything other than read access. I don’t think it costs more to do that—-but it takes your backups out of the blast radius of a root or admin account compromise OR a tool malfunction. With AWS DLM, you can safely configure your backup retention in the separate AWS account and not risk any tools deleting them.
Terraform is a ticking time bomb. All it takes is for a new field to show up in AWS or a new state in an existing field, and now your resource is not modified, but is destroyed and re-created.
I will never trust any process, AI or a CD pipeline, execute `terraform apply` automatically on anything production. Maybe if you examine the plan for a very narrow set of changes and then execute apply from that plan only, maybe then you can automate it. I think it’s much rarer for Terraform to deviate from a plan.
Regardless, you must always turn on Delete Protection on all your important resources. It is wild to me that AWS didn’t ship EKS with delete protection out of the gate—-they only added this feature in August 2025! Not long before that, I’ve witnessed a production database get deleted because Terraform decided that an AWS EKS cluster could not be modified, so it decided to delete it and re-create it, while the team was trying to upgrade the version of EKS. The same exact pipeline worked fine in the staging environment. Turns out production had a slight difference due to AWS API changes, and Terraform decided it could not modify.
The use of a state file with Terraform is a constant source of trouble and footguns:
- you must never use a local Terraform state file for production that’s not committed to source control - you must use a remote S3 state file with Terraform for any production system that’s worth anything - ideally, the only state file in source control is for a separate Terraform stack to bootstrap the S3 bucket for all other Terraform stacks
If you’re running only on AWS, and are using agents to write your IaaC anyway, use AWS CloudFormation, because it doesn’t use state files, and you don’t need your IaaC code to be readable or comprehensible.