Tasks come up in unhealthy state due to lack of Ingress security group rules #57

@aric49

Bug report

What is the problem?
Perhaps I am missing something, but when deploying Fargate services with this module, the tasks come up in an "unhealthy" state because the target group health checks time out. On further investigation, the default security group created for the Fargate service has no ingress rules, only an egress rule:

From main.tf:

# ------------------------------------------------------------------------------
# Security groups
# ------------------------------------------------------------------------------
resource "aws_security_group" "ecs_service" {
  vpc_id      = var.vpc_id
  name        = "${var.name_prefix}-ecs-service-sg"
  description = "Fargate service security group"
  tags = merge(
    var.tags,
    {
      Name = "${var.name_prefix}-sg"
    },
  )
}

resource "aws_security_group_rule" "egress_service" {
  security_group_id = aws_security_group.ecs_service.id
  type              = "egress"
  protocol          = "-1"
  from_port         = 0
  to_port           = 0
  cidr_blocks       = ["0.0.0.0/0"]
  ipv6_cidr_blocks  = ["::/0"]
}

I think this can be resolved by adding an ingress rule to this security group that allows traffic from the load balancer on the container port (task_container_port).
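
A minimal sketch of such a rule, written as it might sit next to the existing egress rule in the module's main.tf. The variable lb_security_group_id is hypothetical (it does not appear in the module code quoted above) and stands for the security group attached to the load balancer; var.task_container_port is assumed to be the module-side name of the port passed in as task_container_port.

```hcl
# Sketch only: allow the load balancer to reach the task on its container port.
# ASSUMPTION: var.lb_security_group_id is a hypothetical input, not an existing module variable.
resource "aws_security_group_rule" "ingress_service" {
  security_group_id        = aws_security_group.ecs_service.id
  type                     = "ingress"
  protocol                 = "tcp"
  from_port                = var.task_container_port
  to_port                  = var.task_container_port
  source_security_group_id = var.lb_security_group_id
}
```

Restricting the source to the load balancer's security group, rather than opening the port to a CIDR range, keeps the tasks reachable only through the load balancer while still letting the health checks pass.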

Steps to reproduce

Please post the relevant parts of the failing terraform code here (remember to remove sensitive information):

module "fargate-service" {
  source  = "telia-oss/ecs-fargate/aws"
  version = "5.2.0"

  name_prefix          = "${terraform.workspace}-${var.container_service_name}"
  vpc_id               = data.aws_vpc.primary_vpc.id
  private_subnet_ids   = data.aws_subnet_ids.private.ids
  lb_arn               = data.aws_lb.primary_public.arn
  cluster_id           = data.aws_ecs_cluster.primary_ecs_cluster.arn
  task_container_image = var.container_image_tag
  desired_count        = var.application_instance_count[terraform.workspace]

  task_container_assign_public_ip = false

  task_container_port = var.application_port

  task_definition_cpu = var.application_cpu[terraform.workspace]

  task_definition_memory = var.application_memory[terraform.workspace]

  service_registry_arn              = resource.aws_service_discovery_service.application-sd.arn
  with_service_discovery_srv_record = false

  deployment_circuit_breaker = { "enable" : true, "rollback" : true }
  wait_for_steady_state      = true

  task_container_environment = var.application_environment_variables[terraform.workspace]

  health_check = {
    port = "traffic-port"
    path = var.application_healthcheck_path
  }

  tags = {
    Environment = terraform.workspace
    Terraform   = "True"
  }
}
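
As a possible workaround from the calling side, the rule could be attached outside the module. The sketch below assumes the module exposes the service security group ID as an output (called service_sg_id here; check the module's outputs before relying on that name) and that the load balancer is an Application Load Balancer with security groups attached.

```hcl
# Sketch only: one ingress rule per security group attached to the load balancer.
# ASSUMPTION: module.fargate-service.service_sg_id is the module's output name.
resource "aws_security_group_rule" "lb_to_task" {
  for_each = toset(data.aws_lb.primary_public.security_groups)

  security_group_id        = module.fargate-service.service_sg_id
  type                     = "ingress"
  protocol                 = "tcp"
  from_port                = var.application_port
  to_port                  = var.application_port
  source_security_group_id = each.value
}
```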

Terraform version

Run terraform version and post the output here:

Terraform v1.0.5
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v3.56.0
+ provider registry.terraform.io/hashicorp/null v3.1.0

Labels: bug
