Add healthchecks to ECS tasks #2022
Merged
Previously, we didn't have any health checks configured in ECS, only at the load balancer. This meant that if the application inside the container was unhealthy, the load balancer would stop sending traffic to it, but ECS would keep running the task and not attempt to replace it.

Also, introducing the otel sidecar (which has a health check configured) caused the task to always appear with 'Unknown' health status in the ECS console, which is confusing.

These health check values are cribbed from the review apps configuration for each app, and should be suitable for our production workloads.

Finally, we exclude the health check configuration from the `task_container_definition` output: we only consume this output when creating scheduled tasks / other copies of the task definition, and in those cases we don't want to inherit the health check configuration for the long-running service tasks. We could override the health check configuration in those cases, but it's simpler to just not include it in the output at all and require that any task that needs a health check sets it explicitly.
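
For readers without the diff to hand, the output exclusion described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the local name `task_container_definition`, the container name, the image, and the health check values are all placeholders assumed for illustration.

```hcl
locals {
  # Placeholder container definition; the real one lives in the ecs-service module.
  task_container_definition = {
    name      = "app"                # hypothetical container name
    image     = "example/app:latest" # hypothetical image
    essential = true
    healthCheck = {
      command  = ["CMD-SHELL", "exit 0"] # stand-in command
      interval = 30
      timeout  = 5
      retries  = 3
    }
  }
}

output "task_container_definition" {
  # Null out the health check so scheduled tasks and other copies of the
  # task definition do not inherit the long-running service's check.
  value = merge(local.task_container_definition, { healthCheck = null })
}
```

A consumer that does need a health check would then set its own `healthCheck` explicitly rather than relying on the module output.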
Pull request overview
Adds ECS container-level health checks to the long-running Forms services so ECS can replace unhealthy tasks (and avoid “Unknown” task health when using the ADOT sidecar), implemented via a new healthcheck input on the shared ecs-service Terraform module.
Changes:
- Add a `healthcheck` variable to `infra/modules/ecs-service` and wire it into the generated container definition.
- Configure `/up` health checks for `forms-admin`, `forms-runner`, and `forms-product-page` ECS services.
- Attempt to prevent health checks from being inherited by consumers of the `task_container_definition` output (for scheduled/derived tasks).
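
As a rough illustration of the first change, a `healthcheck` module input could be declared and mapped onto the ECS container definition as sketched below. The attribute names on the variable, the local names, and the placeholder container fields are assumptions for this sketch; the defaults mirror the ECS defaults rather than the values chosen in this PR. `optional()` with defaults needs Terraform 1.3 or later.

```hcl
variable "healthcheck" {
  description = "Container health check for the application container; null means no check."
  type = object({
    command      = list(string)
    interval     = optional(number, 30) # seconds between checks
    timeout      = optional(number, 5)  # seconds before a check counts as failed
    retries      = optional(number, 3)  # consecutive failures before UNHEALTHY
    start_period = optional(number, 0)  # grace period in seconds after start
  })
  default = null
}

locals {
  # ECS expects camelCase keys inside the container definition.
  container_healthcheck = var.healthcheck == null ? null : {
    command     = var.healthcheck.command
    interval    = var.healthcheck.interval
    timeout     = var.healthcheck.timeout
    retries     = var.healthcheck.retries
    startPeriod = var.healthcheck.start_period
  }

  task_container_definition = {
    name        = "app"                # hypothetical container name
    image       = "example/app:latest" # hypothetical image
    essential   = true
    healthCheck = local.container_healthcheck
  }
}
```

Defaulting the variable to `null` keeps existing callers of the module unchanged until they opt in.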
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| infra/modules/forms-runner/main.tf | Adds container health check config and centralizes container port in a local. |
| infra/modules/forms-product-page/main.tf | Adds container health check config and centralizes container port in a local. |
| infra/modules/forms-admin/main.tf | Adds container health check config and centralizes container port in a local. |
| infra/modules/ecs-service/variables.tf | Introduces healthcheck module input type. |
| infra/modules/ecs-service/ecs.tf | Adds healthCheck into the generated task container definition. |
| infra/modules/ecs-service/outputs.tf | Modifies task_container_definition output to try to exclude health check config. |
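
On the calling side, the per-app changes in the table might be sketched as follows. The port number, the `container_port` input name, the module source path, and every health check value are hypothetical (the real values were taken from the review apps configuration and are not shown here). The command also assumes `wget` is available in the container image.

```hcl
locals {
  # Single source of truth for the port, shared by the service and its health check.
  container_port = 3000 # hypothetical port
}

module "ecs_service" {
  source = "../ecs-service" # assumed relative path to the shared module

  container_port = local.container_port # hypothetical input name

  healthcheck = {
    # CMD-SHELL runs the string with the container's default shell;
    # a non-zero exit status marks the check as failed.
    command      = ["CMD-SHELL", "wget -qO /dev/null http://localhost:${local.container_port}/up || exit 1"]
    interval     = 30
    timeout      = 5
    retries      = 3
    start_period = 10
  }
}
```

Keeping the port in a local means the health check URL cannot drift from the port the service actually listens on.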
cadmiumcat (Contributor) approved these changes on Mar 18, 2026 and left a comment:
So glad to have this! thanks. And good call on making the output null
What problem does this pull request solve?
Trello card: https://trello.com/c/smia5MUD/834-configure-container-health-check-for-our-apps
Things to consider when reviewing
Reminders
If you've made changes to the deployer role (files in `modules/deployer-access`):
- `make <environment> forms/account apply` on the relevant environments (`dev`, `staging`, `user-research`, and/or `prod`)
- `apply-forms-terraform-<environment>` pipelines have run successfully