Introduce Solid Queue worker task #1558

Closed
sarahseewhy wants to merge 5 commits into main from run-solid-queue-as-ecs-task

@sarahseewhy sarahseewhy commented May 7, 2025

What problem does this pull request solve?

Trello card: https://trello.com/c/f7js82Cm/635-run-solid-queue-as-its-own-ecs-task

This pull request creates a new ECS task, service, and the relevant configuration to run Solid Queue as a separate worker task in our ECS cluster. It contains the same ENV variables as the forms-runner web app, including Sentry config such as the DSN, so in theory the worker will also send alerts to Sentry.

Where possible I reused code, for example the task_container_definition, with a few overrides (similar to the mailchimp-sync approach).

The new ECS task should have egress access to the VPC, RDS, and the internet, but no ingress access (a security group allows no inbound traffic unless ingress rules are declared in Terraform). The new ECS task should also exist in the same VPC and private subnet group as the other forms-runner tasks/services.

I used the same security group rules configured in modules/ecs-service/security-groups.tf since they should match the existing forms-runner ECS task with the exception of the restricted ingress access.
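For reviewers who want a picture of the shape, here is a minimal sketch of an egress-only security group. It is not the code in this PR: the resource names, the variables (`vpc_id`, `rds_security_group_id`, `database_port`) and the choice of HTTPS as the only internet-bound port are all assumptions for illustration.

```hcl
# Illustrative sketch only; every name below is hypothetical.
variable "vpc_id" {
  type        = string
  description = "VPC that the forms-runner tasks run in (assumed input)"
}

variable "rds_security_group_id" {
  type        = string
  description = "Security group attached to the RDS instance (assumed input)"
}

variable "database_port" {
  type        = number
  description = "Port the database listens on (assumed input)"
}

resource "aws_security_group" "queue_worker" {
  name        = "forms-runner-queue-worker"
  description = "Egress-only security group for the Solid Queue worker"
  vpc_id      = var.vpc_id
  # No ingress rules are declared anywhere, so no inbound traffic is allowed.
}

# Outbound HTTPS to the internet.
resource "aws_vpc_security_group_egress_rule" "queue_worker_https" {
  security_group_id = aws_security_group.queue_worker.id
  description       = "Allow outbound HTTPS"
  cidr_ipv4         = "0.0.0.0/0"
  ip_protocol       = "tcp"
  from_port         = 443
  to_port           = 443
}

# Outbound traffic to the database, scoped to the RDS security group.
resource "aws_vpc_security_group_egress_rule" "queue_worker_rds" {
  security_group_id            = aws_security_group.queue_worker.id
  description                  = "Allow outbound traffic to RDS"
  referenced_security_group_id = var.rds_security_group_id
  ip_protocol                  = "tcp"
  from_port                    = var.database_port
  to_port                      = var.database_port
}
```

Because the only rule resources are egress rules, the absence of ingress is visible from the file itself rather than relying on a reader knowing the default.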

I've added a HealthCheck which depends on a PR in forms-runner (some Rails code to create the healthcheck file, see here).
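As a rough sketch of how a file-based check can be wired into an ECS container definition: the file path below is hypothetical (the real one is whatever the forms-runner PR writes), and the timing values are placeholders rather than the ones used here.

```hcl
locals {
  # Hypothetical path, assumed for illustration only.
  queue_worker_healthcheck_file = "/tmp/queue-worker-healthcheck"

  # Shape of the ECS container-level healthCheck object.
  queue_worker_health_check = {
    # Passes while the file exists, fails the check otherwise.
    command     = ["CMD-SHELL", "test -f ${local.queue_worker_healthcheck_file} || exit 1"]
    interval    = 30 # seconds between checks
    timeout     = 5  # seconds before a single check counts as failed
    retries     = 3  # consecutive failures before the container is unhealthy
    startPeriod = 60 # grace period in seconds while the worker boots
  }
}
```

The object would then sit under the `healthCheck` key of the container definition that is passed to `jsonencode`.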

I can also confirm that logs from the new ECS task are shipped to Splunk. To view them, use the query `index="gds_dsp_dev_forms" log_stream="forms-runner-dev-queue-worker*"`.

I've tested the changes in dev (and locally) and have attached some screenshots:

[Screenshot 2025-05-07 at 16 50 12]
[Screenshot 2025-05-07 at 16 44 59]

Things to consider when reviewing

  • Naming. I'm always looking for improved naming.
  • File and resource location.
  • Whether the network configuration is as expected and matches what's described in the Trello card.

@sarahseewhy sarahseewhy force-pushed the run-solid-queue-as-ecs-task branch from 8c70c6f to 1f22ea2 Compare May 9, 2025 14:15
@sarahseewhy sarahseewhy force-pushed the run-solid-queue-as-ecs-task branch 11 times, most recently from aa50231 to 67bebf8 Compare May 22, 2025 14:30

@AP-Hunt AP-Hunt left a comment

This looks good to me, bar a couple small style things.

I reviewed it commit by commit without reading all the commits before I started, so I've made some comments you can dismiss because they're not relevant to the final product.

Comment thread infra/deployments/forms/forms-runner/.terraform.lock.hcl Outdated
Comment thread infra/modules/forms-runner/queue-worker-ecs-task.tf Outdated (6 threads)
@sarahseewhy sarahseewhy force-pushed the run-solid-queue-as-ecs-task branch from 67bebf8 to ce7d9a2 Compare May 28, 2025 07:56
@sarahseewhy sarahseewhy marked this pull request as draft May 28, 2025 09:15
@sarahseewhy sarahseewhy force-pushed the run-solid-queue-as-ecs-task branch 8 times, most recently from 752d237 to 3695c70 Compare June 2, 2025 10:39
* Create ECS service, ECS task definition, security group, and relevant security group rules. I used a pattern I noticed in mailchimp sync, where I took the exported ECS task container definition and overrode some parts of it
* Remove lock file which shouldn't have been committed
* Add a Parameter Store entry for a forms-runner-queue-worker specific Sentry DSN
* Provide the task access to a subset of secrets (I checked with the devs and it will only need access to these secrets, the worker won't need access to NOTIFY_API_KEY or SUBMISSION_STATUS_API)
* Create an IAM task role for the queue worker with permissions to write to CloudWatch (I don't think it needs the other permissions forms-runner has, e.g., SMS, SQS, KMS decryption)
* Create an IAM task execution role for the queue worker with permissions to read relevant Parameter Store values (again, I don't think the queue worker needs access to all the secrets forms-runner has access to).
* This code somewhat duplicates what's in `ecs-service/iam.tf` and the policies declared in `forms-runner/main.tf` but I think the duplication is simpler than forcing the ECS task into the ecs-service module format.
* Use the `forms-runner` task role (`module.ecs_service.task_role_arn`). The previous configuration generated an error because the queue worker didn't have IAM permissions to access the SQS `submission_email_ses_bounces_and_complaints` queue
* Extract queue worker name into a local variable to make it easier to change the name in the future. Possibly unnecessary but it was a right faff having to write `forms-runner-queue-worker` all over the place
* Rename file to make it a bit more generic
* Remove SSM Parameter and raise this in a separate PR.
* This is to ensure the secret exists before the ECS tasks are created in all environments so there are no failures on creation.
* If I leave the SSM Parameter in this PR the dummy value will exist when the ECS task gets created and there will be failures.
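The override pattern and the local name mentioned in the commits above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this branch: `base_container_definition`, the overridden keys, and the `bin/jobs` command are all hypothetical stand-ins.

```hcl
variable "base_container_definition" {
  type        = any
  description = "Container definition object exported by the web service module (assumed input)"
}

locals {
  # Single place to change the worker name.
  queue_worker_name = "forms-runner-queue-worker"

  # merge() is shallow: top-level keys in the second object replace those in
  # the base object, and every other key is inherited unchanged.
  queue_worker_container_definition = merge(
    var.base_container_definition,
    {
      name         = local.queue_worker_name
      command      = ["bin/jobs"] # assumed start command for the worker
      portMappings = []           # the worker serves no traffic
    }
  )
}
```

A task definition could then consume it with `container_definitions = jsonencode([local.queue_worker_container_definition])`. Since `merge()` does not recurse, nested values such as the environment list have to be replaced whole rather than patched.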
@sarahseewhy sarahseewhy force-pushed the run-solid-queue-as-ecs-task branch from 430c628 to 356052e Compare June 2, 2025 11:07
@sarahseewhy

Closing this PR because I believe the history is messy to the point of confusion. Recreated the PR here: #1592

@sarahseewhy sarahseewhy closed this Jun 2, 2025