Review AWS Batch resources #62

@victorlin

Background and scope

This issue was motivated by a recent email about the Amazon Linux 2 deprecation (details below). After some inspection, I broadened the scope to a review of all AWS Batch job queues and their associated compute environments, with a focus on image deprecation.

Even though AWS Batch resources are currently not managed via Terraform, it seemed appropriate to open this issue under nextstrain/infra.

Overview

The table below lists all AWS Batch job queues and their associated compute environments (currently 1:1) in our AWS account.

Command used to generate the table:
jq -n -r '
def link($arn):
  "[\($arn|split("/")[-1])](https://us-east-1.console.aws.amazon.com/go/view?arn=\($arn|@uri))";

(input.computeEnvironments | map({(.computeEnvironmentArn): .}) | add) as $ce

| (
    ["| Job Queue | Compute Environment | Image Type | Status |",
     "|---|---|---|---|"]
    +
    (
      [
        input.jobQueues[]
        | .computeEnvironmentOrder[0].computeEnvironment as $arn
        | $ce[$arn] as $c
        | {
            status: $c.status,
            row: "| \(link(.jobQueueArn)) | \(link($arn)) | \($c.computeResources.ec2Configuration[0].imageType // "N/A") | \($c.status) |"
          }
      ]
      | sort_by(.status)
      | reverse
      | map(.row)
    )
  )
| .[]
' \
<(aws batch describe-compute-environments) \
<(aws batch describe-job-queues)

| Job Queue | Compute Environment | Image Type | Status |
|---|---|---|---|
| trs-test | trs-test | ECS_AL2 | VALID |
| nextstrain-job-queue | c5-instances-2023-01-17 | ECS_AL2 | VALID |
| nextstrain-job-queue-c5d | c5d-instances-2021-10-21b | ECS_AL2 | VALID |
| nextstrain-job-queue-r5a | r5a-instances | ECS_AL1 | INVALID |
| nextstrain-job-queue-r5 | r5-instances | ECS_AL1 | INVALID |
| nextstrain-job-queue-z1d | z1d-instances | ECS_AL1 | INVALID |

Amazon Linux 1

Compute environments using Amazon Linux 1 (ECS_AL1) have been invalidated with the following reason:

CLIENT_ERROR - Your CE has been scaled down and invalidated because it is using Batch managed Amazon Linux(AL1) AMI. AWS Batch no longer supports Amazon Linux AMI. We recommend that you update or replace your compute environment and instead use Amazon Linux 2.

I'm not sure when this happened, but based on the Amazon Linux 1 deprecation page, my guess is the end of 2023.
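
For anyone who wants to reproduce the invalidation reason quoted above, here is a minimal sketch. It assumes `jq` is installed and that the `describe-compute-environments` response carries `status` and `statusReason` fields for each compute environment; the function name `invalid_compute_environments` is made up for illustration.

```shell
# Hypothetical helper: reads `aws batch describe-compute-environments` JSON on
# stdin and prints one tab-separated line per compute environment that is not
# VALID: name, image type, status, status reason.
invalid_compute_environments() {
  jq -r '
    .computeEnvironments[]
    | select(.status != "VALID")
    | [
        .computeEnvironmentName,
        # Fall back to "N/A" when no ec2Configuration is reported.
        (.computeResources.ec2Configuration[0].imageType // "N/A"),
        .status,
        (.statusReason // "")
      ]
    | @tsv
  '
}

# Usage (requires AWS credentials and a configured region):
#   aws batch describe-compute-environments | invalid_compute_environments
```

Reading from stdin keeps the filter usable on a saved copy of the API response, so it can be rerun without calling AWS again.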

Amazon Linux 2

Compute environments using Amazon Linux 2 (ECS_AL2) should remain valid and usable for the foreseeable future, but end of life has been scheduled for June 30 according to the Amazon Linux 2 deprecation page and email notification. From the email:

Existing compute environments will continue to operate, but will no longer receive software updates, security patches, or bug fixes from AWS after this date.


Initial review

  • The various resources are described in a Slack thread from April 2020.
  • The default job queue is nextstrain-job-queue. (src)
  • Internally, we only use the default job queue, not any of the custom ones. (query)

Tasks

I propose the following action items. Any feedback is welcome.

  • Update nextstrain-job-queue to use a compute environment with Amazon Linux 2023
    • with Terraform?
  • Remove all other job queues and compute environments
  • Update docs to reflect new default of Amazon Linux 2023
