Commit a3601db

lorenyu and claude authored
Add documentation for deletion protection in temporary environments (#1007)
- Adds a new doc at `docs/infra/deletion-protection-and-temporary-environments.md` explaining how the `is_temporary` convention gates deletion protection for temporary environments (PR environments and platform-test CI)
- Documents the `terraform.workspace != "default"` pattern, all resources currently using it, and step-by-step guidance for adding deletion protection to new resources
- Explains consequences of not following the pattern (orphaned resources, failed cleanup workflows)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent c845769 commit a3601db

1 file changed

+91
-0
lines changed

@@ -0,0 +1,91 @@
# Deletion protection and temporary environments

Resources like databases, load balancers, and S3 buckets have deletion protection enabled to prevent accidental data loss in permanent environments. However, temporary environments, such as [pull request environments](./pull-request-environments.md) and CI test environments, need to be destroyed automatically when they're no longer needed. This document explains the convention used to conditionally disable deletion protection in temporary environments so that cleanup workflows can destroy resources without manual intervention.

## How temporary environments are identified

Temporary environments are identified using [Terraform workspaces](./develop-and-test-infrastructure-in-isolation-using-workspaces.md). Each Terraform root module defines a local value `is_temporary` based on whether the current workspace is the `default` workspace:

```hcl
# infra/<app_name>/service/main.tf

locals {
  # All non-default terraform workspaces are considered temporary.
  # Temporary environments do not have deletion protection enabled.
  # Examples: pull request preview environments are temporary.
  is_temporary = terraform.workspace != "default"
}
```

This local is then passed to child modules as a variable:

```hcl
module "service" {
  ...
  is_temporary = local.is_temporary
}
```

The `default` workspace is used for permanent environments (e.g. dev, staging, prod). Any other workspace, whether created by a PR environment workflow, a CI pipeline, or a developer testing in isolation, is considered temporary.

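To check how a given workspace is classified, one option is to expose the value as an output. The following is a minimal standalone sketch, not something the documented modules are known to contain: the output names `workspace` and `is_temporary` are hypothetical and exist only for illustration.

```hcl
locals {
  # Same convention as the root module: only the default workspace is permanent.
  is_temporary = terraform.workspace != "default"
}

# Hypothetical outputs, added only to make the classification visible.
output "workspace" {
  value = terraform.workspace
}

output "is_temporary" {
  value = local.is_temporary
}
```

After an apply, `terraform output is_temporary` would report `false` in the `default` workspace and `true` in any workspace created with `terraform workspace new`.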

## The `is_temporary` pattern for gating deletion protection

Each module that manages a deletion-protected resource accepts an `is_temporary` variable and uses it to conditionally disable deletion protection. The variable defaults to `false` so that resources are protected unless explicitly marked as temporary.

Each module defines the variable the same way:

```hcl
variable "is_temporary" {
  description = "Whether the service is meant to be spun up temporarily (e.g. for automated infra tests). This is used to disable deletion protection."
  type        = bool
  default     = false
}
```
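
The expression used depends on the resource's deletion protection attribute. For boolean protection attributes, negate `is_temporary`. For boolean attributes with the opposite sense, such as S3 `force_destroy`, pass `is_temporary` through unchanged:
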
```hcl
# Boolean deletion protection (e.g. ALB; the RDS equivalent is `deletion_protection`)
enable_deletion_protection = !var.is_temporary

# S3 force destroy
force_destroy = var.is_temporary
```
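To show where these attributes sit inside full resource blocks, here is a standalone sketch of a child module. It is an illustration under assumptions, not the repository's actual module: the resource names and the `name` and `subnet_ids` variables are hypothetical, and only `is_temporary` comes from the convention described here.

```hcl
# Standalone sketch. Everything except is_temporary is hypothetical.

variable "is_temporary" {
  description = "Whether the service is meant to be spun up temporarily (e.g. for automated infra tests). This is used to disable deletion protection."
  type        = bool
  default     = false
}

variable "name" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

resource "aws_lb" "example" {
  name               = var.name
  load_balancer_type = "application"
  subnets            = var.subnet_ids

  # Use a separate line to support automated terraform destroy commands
  enable_deletion_protection = !var.is_temporary
}

resource "aws_s3_bucket" "example" {
  bucket = "${var.name}-example"

  # Use a separate line to support automated terraform destroy commands
  force_destroy = var.is_temporary
}
```

With the default of `false`, the load balancer is protected and the bucket cannot be destroyed while it still holds objects. When the caller passes `true`, both can be removed by `terraform destroy`.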

Some resources use non-boolean values. For example, Cognito user pools use string values:

```hcl
deletion_protection = var.is_temporary ? "INACTIVE" : "ACTIVE"
```
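In context, the conditional might look like the following sketch. The resource name and the `name` variable are hypothetical, and the variable declaration is shortened for brevity.

```hcl
# Standalone sketch. Everything except is_temporary is hypothetical.

variable "is_temporary" {
  type    = bool
  default = false
}

variable "name" {
  type = string
}

resource "aws_cognito_user_pool" "example" {
  name = var.name

  # Use a separate line to support automated terraform destroy commands
  deletion_protection = var.is_temporary ? "INACTIVE" : "ACTIVE"
}
```

The conditional expression is needed because this attribute takes the strings `"ACTIVE"` and `"INACTIVE"` rather than a boolean.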

Always keep the attribute on its own line with the standard comment so that the pattern is easy to find with grep:

```hcl
# Use a separate line to support automated terraform destroy commands
force_destroy = var.is_temporary
```

Search for `is_temporary` across the codebase to see all resources currently using this pattern.

## What happens if you don't gate deletion protection

If a deletion-protected resource does not use the `is_temporary` pattern:

- **Cleanup workflows fail.** The PR environment destroy workflow and CI cleanup jobs run `terraform destroy` in non-default workspaces. If a resource has deletion protection unconditionally enabled, the destroy will fail.
- **Orphaned resources accumulate.** Failed destroys leave resources running in AWS, accruing costs.
- **Manual intervention is required.** Someone has to manually disable deletion protection and delete the orphaned resources.
- **CI pipelines break.** Subsequent CI runs may fail due to naming conflicts with orphaned resources.

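A new resource with its own protection flag can avoid these failure modes by being gated from the start. The sketch below is an illustration only: this document does not say the codebase uses DynamoDB, and the resource name and `name` variable are hypothetical. It uses `aws_dynamodb_table`, whose `deletion_protection_enabled` argument is a boolean.

```hcl
# Standalone sketch. Everything except is_temporary is hypothetical.

variable "is_temporary" {
  type    = bool
  default = false
}

variable "name" {
  type = string
}

resource "aws_dynamodb_table" "example" {
  name         = "${var.name}-example"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "id"

  attribute {
    name = "id"
    type = "S"
  }

  # Ungated (avoid): protection stays on in every workspace, so
  # terraform destroy fails in temporary environments.
  # deletion_protection_enabled = true

  # Use a separate line to support automated terraform destroy commands
  deletion_protection_enabled = !var.is_temporary
}
```

The commented-out line shows the ungated form that leads to failed destroys. The active line keeps protection on by default and turns it off only when the caller passes `is_temporary = true`.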

## Cleanup mechanisms that depend on this pattern

Automated cleanup workflows rely on `is_temporary` being properly gated:

- **PR environment destroy workflows.** When a pull request is merged or closed, [pr-environment-destroy.yml](/.github/workflows/pr-environment-destroy.yml) runs `terraform destroy` in the PR's workspace and then deletes the workspace. See [Pull request environments](./pull-request-environments.md) for details.
- **Developer workspace cleanup.** Developers working in [isolated workspaces](./develop-and-test-infrastructure-in-isolation-using-workspaces.md) run `terraform destroy` to clean up after merging their changes.
- **CI infrastructure checks.** The [ci-infra.yml](/.github/workflows/ci-infra.yml) workflow creates temporary workspaces for infrastructure validation and destroys them after checks complete.

## See also

- [Pull request environments](./pull-request-environments.md)
- [Develop and test infrastructure in isolation using workspaces](./develop-and-test-infrastructure-in-isolation-using-workspaces.md)
- [Destroy infrastructure](./destroy-infrastructure.md)
