| 1 | +# Deletion protection and temporary environments |
| 2 | + |
| 3 | +Resources like databases, load balancers, and S3 buckets are guarded against deletion in permanent environments to prevent accidental data loss, either through a deletion protection setting or, for S3 buckets, by leaving `force_destroy` off. However, temporary environments, such as [pull request environments](./pull-request-environments.md) and CI test environments, need to be destroyed automatically when they're no longer needed. This document explains the convention used to conditionally disable these safeguards in temporary environments so that cleanup workflows can destroy resources without manual intervention. |
| 4 | + |
| 5 | +## How temporary environments are identified |
| 6 | + |
| 7 | +Temporary environments are identified using [Terraform workspaces](./develop-and-test-infrastructure-in-isolation-using-workspaces.md). Each Terraform root module defines a local value `is_temporary` that is `true` whenever the current workspace is not the `default` workspace: |
| 8 | + |
| 9 | +```hcl |
| 10 | +# infra/<app_name>/service/main.tf |
| 11 | + |
| 12 | +locals { |
| 13 | + # All non-default terraform workspaces are considered temporary. |
| 14 | + # Temporary environments do not have deletion protection enabled. |
| 15 | + # Examples: pull request preview environments are temporary. |
| 16 | + is_temporary = terraform.workspace != "default" |
| 17 | +} |
| 18 | +``` |
| 19 | + |
| 20 | +This local is then passed to child modules as a variable: |
| 21 | + |
| 22 | +```hcl |
| 23 | +module "service" { |
| 24 | + # ... (other module arguments) |
| 25 | + is_temporary = local.is_temporary |
| 26 | +} |
| 27 | +``` |
| 28 | + |
| 29 | +The `default` workspace is used for permanent environments (e.g. dev, staging, prod). Any other workspace — whether created by a PR environment workflow, a CI pipeline, or a developer testing in isolation — is considered temporary. |
| 30 | + |
| 31 | +## The `is_temporary` pattern for gating deletion protection |
| 32 | + |
| 33 | +Each module that manages a deletion-protected resource accepts an `is_temporary` variable and uses it to conditionally disable deletion protection. The variable defaults to `false` so that resources are protected unless explicitly marked as temporary. |
| 34 | + |
| 35 | +Each module defines the variable the same way: |
| 36 | + |
| 37 | +```hcl |
| 38 | +variable "is_temporary" { |
| 39 | + description = "Whether the service is meant to be spun up temporarily (e.g. for automated infra tests). This is used to disable deletion protection." |
| 40 | + type = bool |
| 41 | + default = false |
| 42 | +} |
| 43 | +``` |
| 44 | + |
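| 45 | +The expression used depends on the resource's attribute. For boolean deletion protection attributes, negate `is_temporary`. S3's `force_destroy` has the opposite sense (`true` allows a non-empty bucket to be destroyed), so pass `is_temporary` through unchanged: |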
| 46 | + |
| 47 | +```hcl |
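| 48 | +# Boolean deletion protection (ALB uses enable_deletion_protection; RDS uses deletion_protection) |
| 49 | +enable_deletion_protection = !var.is_temporary |
| 50 | + |
| 51 | +# S3 force destroy (not negated: true permits destroying a non-empty bucket) |
| 52 | +force_destroy = var.is_temporary |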
| 53 | +``` |
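To show where these attributes sit in practice, here is a minimal sketch of a module that gates both a load balancer and a bucket. It is illustrative only and not the template's actual module code: the resource labels, `name`, `bucket_prefix`, and the `subnet_ids` variable are hypothetical.

```hcl
# Sketch only: resource labels, names, and subnet_ids are hypothetical.
variable "is_temporary" {
  description = "Whether the service is meant to be spun up temporarily."
  type        = bool
  default     = false
}

variable "subnet_ids" {
  description = "Hypothetical input: subnets for the load balancer."
  type        = list(string)
}

resource "aws_lb" "example" {
  name               = "example-alb"
  load_balancer_type = "application"
  subnets            = var.subnet_ids

  # Use a separate line to support automated terraform destroy commands
  enable_deletion_protection = !var.is_temporary
}

resource "aws_s3_bucket" "example" {
  bucket_prefix = "example-"

  # Use a separate line to support automated terraform destroy commands
  force_destroy = var.is_temporary
}
```

With the default `is_temporary = false`, the load balancer is protected and a non-empty bucket blocks `terraform destroy`. When the root module passes `true`, both resources can be destroyed without manual steps.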
| 54 | + |
| 55 | +Some resources use non-boolean values. For example, Cognito user pools use string values: |
| 56 | + |
| 57 | +```hcl |
| 58 | +deletion_protection = var.is_temporary ? "INACTIVE" : "ACTIVE" |
| 59 | +``` |
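For context, the Cognito expression might sit inside a resource block as sketched below. The resource label and pool name are hypothetical, and the variable is redeclared only to keep the snippet self-contained.

```hcl
# Sketch only: the resource label and pool name are hypothetical.
variable "is_temporary" {
  type    = bool
  default = false
}

resource "aws_cognito_user_pool" "example" {
  name = "example-user-pool"

  # Use a separate line to support automated terraform destroy commands
  deletion_protection = var.is_temporary ? "INACTIVE" : "ACTIVE"
}
```

The conditional maps the boolean onto the two string values the attribute accepts, so the default (`false`) still leaves the user pool protected.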
| 60 | + |
| 61 | +Always keep the attribute on its own line with the standard comment so that the pattern is easy to find with grep: |
| 62 | + |
| 63 | +```hcl |
| 64 | +# Use a separate line to support automated terraform destroy commands |
| 65 | +force_destroy = var.is_temporary |
| 66 | +``` |
| 67 | + |
| 68 | +Search for `is_temporary` across the codebase to see all resources currently using this pattern. |
| 69 | + |
| 70 | +## What happens if you don't gate deletion protection |
| 71 | + |
| 72 | +If a deletion-protected resource does not use the `is_temporary` pattern: |
| 73 | + |
| 74 | +- **Cleanup workflows fail.** The PR environment destroy workflow and CI cleanup jobs run `terraform destroy` in non-default workspaces. If a resource has deletion protection unconditionally enabled, the destroy will fail. |
| 75 | +- **Orphaned resources accumulate.** Failed destroys leave resources running in AWS, accruing costs. |
| 76 | +- **Manual intervention is required.** Someone has to manually disable deletion protection and delete the orphaned resources. |
| 77 | +- **CI pipelines break.** Subsequent CI runs may fail due to naming conflicts with orphaned resources. |
| 78 | + |
| 79 | +## Cleanup mechanisms that depend on this pattern |
| 80 | + |
| 81 | +Automated cleanup workflows rely on `is_temporary` being properly gated: |
| 82 | + |
| 83 | +- **PR environment destroy workflows** — When a pull request is merged or closed, [pr-environment-destroy.yml](/.github/workflows/pr-environment-destroy.yml) runs `terraform destroy` in the PR's workspace and then deletes the workspace. See [Pull request environments](./pull-request-environments.md) for details. |
| 84 | +- **Developer workspace cleanup** — Developers working in [isolated workspaces](./develop-and-test-infrastructure-in-isolation-using-workspaces.md) run `terraform destroy` to clean up after merging their changes. |
| 85 | +- **CI infrastructure checks** — The [ci-infra.yml](/.github/workflows/ci-infra.yml) workflow creates temporary workspaces for infrastructure validation and destroys them after checks complete. |
| 86 | + |
| 87 | +## See also |
| 88 | + |
| 89 | +- [Pull request environments](./pull-request-environments.md) |
| 90 | +- [Develop and test infrastructure in isolation using workspaces](./develop-and-test-infrastructure-in-isolation-using-workspaces.md) |
| 91 | +- [Destroy infrastructure](./destroy-infrastructure.md) |