Infrastructure as Code (Terraform): Scenario-Based Questions
4. After applying Terraform changes, a critical resource was accidentally deleted. What went wrong, and how do you prevent it?
Accidental deletions in Terraform typically arise from state drift, incorrect configuration, or misunderstanding of resource lifecycle behavior. Mitigating such risks is crucial in production environments.
Root Cause Analysis
- State Drift: The actual infrastructure changed outside of Terraform (manual edits or third-party tooling).
- Moved/Refactored Resources: Resource names or modules were renamed without the corresponding terraform state mv commands, so Terraform planned to destroy the old address and create a new one.
- Use of count/for_each: Logic based on variable inputs removed instances when the inputs changed.
- Explicitly Removed from Code: Terraform treats a resource that is missing from the configuration as an intent to delete it.
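The count/for_each cause above can be illustrated with a small sketch. The variable, resource labels, and bucket names here are hypothetical and chosen only for illustration; the AWS provider is assumed.

```hcl
# Hypothetical input list; names are illustrative only.
variable "bucket_names" {
  type    = list(string)
  default = ["logs", "assets", "backups"]
}

# With count, each instance is addressed by list position:
#   aws_s3_bucket.by_index[0], [1], [2]
# Removing "logs" from the list shifts the remaining names down one index,
# so the plan replaces [0] and [1] (their bucket names change) and
# destroys [2], even though only one bucket was meant to go away.
resource "aws_s3_bucket" "by_index" {
  count  = length(var.bucket_names)
  bucket = "example-idx-${var.bucket_names[count.index]}"
}

# With for_each, each instance is addressed by a stable key:
#   aws_s3_bucket.by_key["logs"], ["assets"], ["backups"]
# Removing "logs" now affects only aws_s3_bucket.by_key["logs"].
resource "aws_s3_bucket" "by_key" {
  for_each = toset(var.bucket_names)
  bucket   = "example-key-${each.key}"
}
```

Keying instances by name rather than by position generally keeps the blast radius of an input change limited to the instances that were actually removed.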
Troubleshooting Steps
- Review the latest terraform plan output: were deletions shown but ignored?
- Check terraform state list and terraform show to inspect what was tracked.
- Investigate git diffs to see what changed in the Terraform code before the apply.
- Check whether lifecycle.prevent_destroy was missing for critical resources.
Prevention Strategies
- Use lifecycle { prevent_destroy = true } on critical resources like databases or production load balancers.
- Implement reviewed Terraform plans in CI: auto-apply only if no destructive changes are detected.
- Enable backup of terraform.tfstate (e.g., S3 with versioning).
- Use terraform import or terraform state mv during refactors instead of deleting and recreating.
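A minimal sketch of the lifecycle guard and a state-preserving rename is shown below. The resource labels, identifier, and argument values are placeholders, not a complete production configuration. The moved block is a declarative alternative to terraform state mv and assumes Terraform 1.1 or later.

```hcl
# Hypothetical variable; supply the value from a secrets store in practice.
variable "db_password" {
  type      = string
  sensitive = true
}

# Hypothetical production database. Argument values are placeholders.
resource "aws_db_instance" "primary" {
  identifier        = "example-prod-db"
  engine            = "postgres"
  instance_class    = "db.t3.medium"
  allocated_storage = 50
  username          = "example_admin"
  password          = var.db_password

  lifecycle {
    # Any plan that would destroy this resource fails with an error
    # instead of proceeding.
    prevent_destroy = true
  }
}

# Refactor record: this resource was previously declared as
# aws_db_instance.main. The moved block tells Terraform to update the
# state address rather than plan a destroy and create.
moved {
  from = aws_db_instance.main
  to   = aws_db_instance.primary
}
```

Note that prevent_destroy lives inside the resource block, so it offers no protection if the whole block is deleted from the configuration. Plan review remains necessary for that case.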
Tools for Mitigation
- infracost: Cost-based diff visualization can highlight high-risk deletes.
- tflint and checkov: Static analysis for misconfigurations and security issues.
- terraform console: Evaluates expressions against the current configuration and state, which helps verify count/for_each logic before planning.
Common Mistakes
- Blindly running terraform apply without reading the plan.
- Refactoring without preserving state mappings.
- Mixing environments (dev/prod) in the same state file.
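One way to avoid a shared state file, and to get the versioned state backup mentioned under Prevention Strategies, is sketched below. It assumes an S3 backend and AWS provider v4 or later. The bucket name, key, region, and file layout are hypothetical.

```hcl
# prod/backend.tf (hypothetical layout: one root module per environment).
# The dev environment would use its own key, e.g. "dev/terraform.tfstate",
# so an apply in one environment cannot touch the other's state.
terraform {
  backend "s3" {
    bucket = "example-terraform-state"
    key    = "prod/terraform.tfstate"
    region = "us-east-1"
  }
}

# bootstrap/state_bucket.tf (a separate configuration that manages the
# state bucket itself). Versioning keeps earlier copies of the state
# object, so a previous state can be restored after a bad apply.
resource "aws_s3_bucket" "tf_state" {
  bucket = "example-terraform-state"
}

resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled"
  }
}
```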
Real-World Insight
Many cloud operations postmortems trace back to unintended deletions caused by overlooked Terraform diffs. Mature teams integrate plan review gates, use preventive lifecycle rules, and treat infrastructure code with the same care as application logic.