Terraform Module Optimization - Summary of Changes

Overview

This document summarizes the comprehensive optimization and modernization of the Kubernetes deployment Terraform module.

Major Changes

1. Provider Version Updates

Kubernetes Provider: Updated from ~> 2.0 to ~> 2.35
Google Provider: Updated from ~> 6.0 to ~> 7.0 (major version upgrade)
Workload Identity Module: Updated from unversioned to ~> 41.0
Terraform Version: Now requires >= 1.3 (the first release in which optional object attributes with defaults are stable)
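
As a rough sketch, the constraints listed above would typically be declared in a terraform block like the one below. The registry addresses hashicorp/kubernetes and hashicorp/google are assumed here; the module's actual main.tf may be organized differently.

```hcl
terraform {
  # optional() object attributes are stable from Terraform 1.3 onward.
  required_version = ">= 1.3"

  required_providers {
    kubernetes = {
      source  = "hashicorp/kubernetes" # assumed registry address
      version = "~> 2.35"              # allows 2.35 and later 2.x releases
    }
    google = {
      source  = "hashicorp/google" # assumed registry address
      version = "~> 7.0"           # allows any 7.x release
    }
  }
}
```

Callers whose root configuration pins the Google provider to 6.x would need to relax that pin before terraform init can resolve a version that satisfies both constraints.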

Google Provider v7 Changes:

Major version upgrade with improved resource handling
No breaking changes for this module, which only uses the provider through the workload-identity submodule
Better error messages and validation
Performance improvements
Existing behavior of this module is unaffected by the provider upgrade itself

2. Removed Company-Specific Hardcoded Values

Datadog/Observability Configuration

Before:

env {
  name  = "DD_SERVICE"
  value = "ddm-platform-${var.application_name}"
}

After:

observability_config = {
  agent_host_env_vars = ["DD_AGENT_HOST", "DD_TRACE_AGENT_HOSTNAME"]
  service_name_prefix = ""  # Configurable, no hardcoded prefix
  service_env_var     = "DD_SERVICE"
  version_env_var     = "DD_VERSION"
}

Benefits:

Works with any APM tool that is configured through environment variables (Datadog, OpenTelemetry, New Relic, etc.)
Completely optional - can be disabled by setting to null
Prefix is configurable instead of hardcoded
No longer assumes Datadog is always needed
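
A minimal sketch of how such a variable could be typed and consumed, assuming the object shape shown in the "After" example. The attribute defaults and the local names agent_host_env_vars and observability_service_name are illustrative and not taken from the module.

```hcl
variable "application_name" {
  type = string
}

variable "observability_config" {
  description = "APM environment variable settings; null disables injection."
  type = object({
    agent_host_env_vars = optional(list(string), [])
    service_name_prefix = optional(string, "")
    service_env_var     = optional(string)
    version_env_var     = optional(string)
  })
  default = null
}

locals {
  # Names of env vars that should point at the APM agent; empty when disabled.
  agent_host_env_vars = var.observability_config == null ? [] : var.observability_config.agent_host_env_vars

  # Service name with the configurable prefix applied; null when disabled.
  observability_service_name = var.observability_config == null ? null : "${var.observability_config.service_name_prefix}${var.application_name}"
}
```

Inside the container block, a dynamic "env" block could then iterate over local.agent_host_env_vars. Whether those variables are populated from the node's host IP (status.hostIP via field_ref) or from a fixed hostname depends on how the agent is deployed, so that part is left open here.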

Node Pool Affinity

Before:

variable "node_pool" {
  default = "standard4"  # Company-specific default
}

After:

variable "node_pool" {
  default = null  # No default affinity
}

Benefits:

Works with any Kubernetes cluster
No assumptions about node pool names
Affinity only applied when explicitly configured
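
One way the conditional affinity could look is sketched below in a stripped-down deployment. It assumes node_pool is a list of pool names (see Breaking Changes Summary), that pools are matched on the GKE node label cloud.google.com/gke-nodepool, and that a required (rather than preferred) affinity is wanted; the resource name example and the image variable are hypothetical.

```hcl
variable "application_name" {
  type = string
}

variable "image" {
  type = string # hypothetical input, only here to make the sketch complete
}

variable "node_pool" {
  description = "Node pool names to schedule onto; null applies no affinity."
  type        = list(string)
  default     = null
}

resource "kubernetes_deployment" "example" {
  metadata {
    name = var.application_name
  }

  spec {
    selector {
      match_labels = { app = var.application_name }
    }

    template {
      metadata {
        labels = { app = var.application_name }
      }

      spec {
        # Emit the affinity block only when node pools were provided.
        dynamic "affinity" {
          for_each = var.node_pool == null ? [] : [var.node_pool]
          content {
            node_affinity {
              required_during_scheduling_ignored_during_execution {
                node_selector_term {
                  match_expressions {
                    key      = "cloud.google.com/gke-nodepool" # assumed label key
                    operator = "In"
                    values   = affinity.value
                  }
                }
              }
            }
          }
        }

        container {
          name  = var.application_name
          image = var.image
        }
      }
    }
  }
}
```

On clusters other than GKE the label key would differ, which is one more reason to leave the affinity unset by default.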

3. Made GCP-Specific Features Optional

Workload Identity

Before:

Required variables: project, gke_cluster_name, service_account_name
Always created GCP service accounts and IAM bindings

After:

variable "project" {
  default = null  # Optional
}

variable "gke_cluster_name" {
  default = null  # Optional
}

locals {
  enable_workload_identity = var.project != null && var.gke_cluster_name != null
}

Benefits:

Works with any Kubernetes cluster (EKS, AKS, on-prem, etc.)
GCP features only activate when GCP variables are provided
Falls back to basic Kubernetes service account when not using GKE
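
To illustrate the two modes, the sketch below calls the module once without and once with the GCP variables. The source path, module labels, and all values are hypothetical, and the real module may require inputs that are not shown.

```hcl
# Non-GKE cluster: the GCP variables are omitted, so
# local.enable_workload_identity is false and a plain Kubernetes
# service account is created.
module "app_generic" {
  source           = "./modules/kubernetes-deployment" # hypothetical path
  application_name = "orders-api"                      # hypothetical name
}

# GKE cluster: both variables are set, so
# local.enable_workload_identity evaluates to true.
module "app_gke" {
  source           = "./modules/kubernetes-deployment" # hypothetical path
  application_name = "orders-api"                      # hypothetical name
  project          = "my-gcp-project"                  # hypothetical project ID
  gke_cluster_name = "my-gke-cluster"                  # hypothetical cluster name
}
```

Because the local requires both variables to be non-null, setting only one of them leaves Workload Identity disabled.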

IAM Roles

Before:

variable "roles" {
  default = ["roles/secretmanager.secretAccessor"]  # GCP-specific default
}

After:

variable "roles" {
  default = []  # No default roles
}

Benefits:

No assumptions about required permissions
Users explicitly declare needed roles
Cleaner for non-GCP users

NEG Annotations

Before:

annotations = {
  "cloud.google.com/neg" = "{\"ingress\": true}"  # Always applied
}

After:

variable "enable_neg_annotation" {
  default = false
}

annotations = var.enable_neg_annotation ? {
  "cloud.google.com/neg" = "{\"ingress\": true}"
} : {}

Benefits:

Only applied when explicitly enabled
Doesn't break non-GKE deployments
Clear opt-in behavior

4. Fixed Bugs

Topology Spread Label Selector

Before (Bug):

match_labels = {
  "app.kubernetes.io/instance" = "var.application_name"  # String literal, not variable
}

After (Fixed):

match_labels = {
  app = var.application_name  # Actual variable reference
}

Host Aliases

Before (Bug):

host_aliases {
  hostnames = var.host_alias.hostnames  # Would fail if var.host_alias is null
  ip        = var.host_alias.ip
}

After (Fixed):

dynamic "host_aliases" {
  for_each = var.host_alias != null ? [var.host_alias] : []
  content {
    hostnames = host_aliases.value.hostnames
    ip        = host_aliases.value.ip
  }
}

5. Service Account Name Computed Default

Before:

variable "service_account_name" {
  type = string  # Required
}

After:

variable "service_account_name" {
  type    = string
  default = null
}

locals {
  service_account_name = var.service_account_name != null ? var.service_account_name : "${var.application_name}-sa"
}

Benefits:

One less required variable
Sensible computed default
Still overridable when needed

6. Conditional Resource Creation

Added conditional creation logic throughout:

module "deployment_workload_identity" {
  count = local.enable_workload_identity ? 1 : 0
  # ...
}

resource "kubernetes_service_account" "basic" {
  count = local.enable_workload_identity ? 0 : 1
  # ...
}

Benefits:

Resources only created when needed
Cleaner state files
No unnecessary API calls
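
Once both resources carry a count, anything that references them has to pick the instance that exists. The fragment below is a sketch of how an output could do that; it relies on the local, module, and resource names shown above and assumes the upstream workload-identity module exposes a k8s_service_account_name output. The output name service_account_name is illustrative.

```hcl
output "service_account_name" {
  description = "Name of the Kubernetes service account, whichever path created it."
  value = (
    local.enable_workload_identity
    ? module.deployment_workload_identity[0].k8s_service_account_name
    : kubernetes_service_account.basic[0].metadata[0].name
  )
}
```

Only the selected branch is evaluated, so the index into the instance with count = 0 is never dereferenced.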

7. Submodule Updates

Applied all the same improvements to modules/cronjobs:

Provider version updates
Optional observability configuration
Optional workload identity
Computed service account name
Conditional resource creation

8. Enhanced GCP/Workload Identity Features

Added new GCP-specific optimization variables:

Existing GCP Service Account Support:

use_existing_gcp_sa   = true
existing_gcp_sa_email = "my-sa@project.iam.gserviceaccount.com"

Benefits:

Reuse existing service accounts with pre-configured permissions
Useful for shared service accounts across multiple workloads
Reduces service account sprawl

Enhanced Security:

automount_service_account_token = false

Benefits:

Prevents automatic mounting of service account tokens
Enhanced security posture when using workload identity
Follows principle of least privilege

Regional/Zonal Configuration:

gke_location = "us-central1"  # or "us-central1-a" for zonal

Benefits:

Properly configure workload identity for regional/zonal clusters
Better alignment with GKE cluster configuration
Required for some GCP APIs
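
Put together, a module call using the three settings above might look like the following sketch. The variable names come from the fragments above; the source path, module label, and project and cluster values are hypothetical.

```hcl
module "app" {
  source           = "./modules/kubernetes-deployment" # hypothetical path
  application_name = "orders-api"                      # hypothetical name

  project          = "my-gcp-project" # hypothetical project ID
  gke_cluster_name = "my-gke-cluster" # hypothetical cluster name
  gke_location     = "us-central1"    # region, or a zone such as "us-central1-a"

  # Bind to an existing GCP service account instead of creating a new one.
  use_existing_gcp_sa   = true
  existing_gcp_sa_email = "my-sa@project.iam.gserviceaccount.com"

  # Do not mount the Kubernetes service account token into pods.
  automount_service_account_token = false
}
```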

Files Modified

Root Module

main.tf - Provider versions, computed locals
variables.tf - New optional variables, removed defaults
kubernetes_deployment.tf - Dynamic observability, optional node affinity, fixed bugs
kubernetes_service.tf - Conditional NEG annotation
kubernetes_service_account.tf - Conditional workload identity
outputs.tf - Updated for conditional resources
README.md - Comprehensive new documentation
MIGRATION.md - Detailed migration guide

Cronjobs Submodule

modules/cronjobs/main.tf - Provider versions, computed locals
modules/cronjobs/variables.tf - New optional variables
modules/cronjobs/kubernetes-deployment.tf - Dynamic observability
modules/cronjobs/kubernetes-service-account.tf - Conditional workload identity
modules/cronjobs/outputs.tf - Updated for conditional resources

Breaking Changes Summary

observability_config - Must be explicitly set to enable Datadog or other APM
node_pool - Now defaults to null instead of "standard4", and its type changes from string to list so that multiple node pools can be targeted
roles - Now defaults to [] instead of ["roles/secretmanager.secretAccessor"]
project/gke_cluster_name - Now optional (null by default)
enable_neg_annotation - New variable, defaults to false

Backwards Compatibility

To maintain v3.x behavior, users should add these to their module calls:

node_pool = ["standard4"]  # list form, since node_pool changes from string to list
roles     = ["roles/secretmanager.secretAccessor"]
observability_config = {
  agent_host_env_vars = ["DD_AGENT_HOST", "DD_TRACE_AGENT_HOSTNAME"]
  service_name_prefix = "ddm-platform-"
  service_env_var     = "DD_SERVICE"
  version_env_var     = "DD_VERSION"
}
enable_neg_annotation = true

Benefits of This Update

For Your Team

Less Rigid: No longer assumes specific node pools or naming conventions
More Reusable: Can be used across different projects and environments
Better Defaults: Sensible defaults that don't assume company-specific infrastructure
Cleaner Code: Fixed bugs and removed hardcoded values

For Others

Universal: Works with any Kubernetes cluster, not just GKE
Tool Agnostic: Supports any APM/observability tool
Cloud Agnostic: GCP features are optional, not required
Best Practices: Follows Terraform module best practices

For Maintenance

Latest Providers: Using current provider versions with latest features
Bug Fixes: Fixed topology spread and host aliases bugs
Better Documentation: Comprehensive README and migration guide
Type Safety: Better use of optional types and computed values

Testing Recommendations

Test in Non-Production First: Validate changes in dev/staging
Review terraform plan: Check for unexpected changes
Test Different Configurations:
- Basic Kubernetes (no GCP)
- With Workload Identity
- With observability
- With node affinity
Validate Outputs: Ensure outputs match expected values
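
The "Basic Kubernetes (no GCP)" configuration could be checked with Terraform's native test framework, as in the sketch below. This assumes Terraform 1.6 or later is available for testing (newer than the module's own >= 1.3 requirement), that application_name is the only required input, and that the Kubernetes provider can plan in the test environment. The file name and run label are hypothetical.

```hcl
# tests/basic.tftest.hcl (hypothetical file name)

run "basic_kubernetes_without_gcp" {
  # Plan only; nothing is created in the cluster.
  command = plan

  variables {
    application_name = "demo"
  }

  # With project and gke_cluster_name left null, the module should fall
  # back to the plain Kubernetes service account.
  assert {
    condition     = length(kubernetes_service_account.basic) == 1
    error_message = "Expected the basic Kubernetes service account when GCP variables are unset."
  }
}
```

Further run blocks with project and gke_cluster_name set, or with observability_config and node_pool populated, would cover the other configurations in the list.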

Next Steps

Update any downstream modules or repos that use this module
Update CI/CD pipelines if needed
Update team documentation
Consider creating example configurations for common use cases
Plan rollout strategy for existing deployments

Questions or Issues

See MIGRATION.md for detailed migration instructions, or open an issue in the repository.
