This document summarizes the comprehensive optimization and modernization of the Kubernetes deployment Terraform module.
- Kubernetes Provider: Updated from ~> 2.0 to ~> 2.35
- Google Provider: Updated from ~> 6.0 to ~> 7.0 (major version upgrade)
- Workload Identity Module: Updated from unversioned to ~> 41.0
- Terraform Version: Now requires >= 1.3 (for better optional attribute support)

Google Provider v7 Changes:
- Major version upgrade with improved resource handling
- No breaking changes affect this module, because it only uses the workload-identity submodule
- Better error messages and validation
- Performance improvements
- Full backwards compatibility maintained
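
The version constraints listed above could be pinned roughly as follows. This is a minimal sketch, not the module's actual main.tf: the constraint strings come from the list above, while the registry source addresses are assumptions about how the providers are declared.

```hcl
terraform {
  # optional() object attributes with defaults need Terraform 1.3 or newer.
  required_version = ">= 1.3"

  required_providers {
    kubernetes = {
      source  = "hashicorp/kubernetes" # assumed registry address
      version = "~> 2.35"
    }
    google = {
      source  = "hashicorp/google" # assumed registry address
      version = "~> 7.0"
    }
  }
}
```

With the pessimistic operator, ~> 2.35 accepts 2.35 and any later 2.x release but not 3.0, and ~> 7.0 accepts any 7.x release but not 8.0.
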
Before:

env {
  name  = "DD_SERVICE"
  value = "ddm-platform-${var.application_name}"
}

After:

observability_config = {
  agent_host_env_vars = ["DD_AGENT_HOST", "DD_TRACE_AGENT_HOSTNAME"]
  service_name_prefix = "" # Configurable, no hardcoded prefix
  service_env_var     = "DD_SERVICE"
  version_env_var     = "DD_VERSION"
}

Benefits:
- Works with any APM tool (Datadog, OpenTelemetry, New Relic, etc.)
- Completely optional: can be disabled by setting it to null
- Prefix is configurable instead of hardcoded
- No longer assumes Datadog is always needed
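
One way the module might declare and consume this setting is sketched below. The attribute names come from the example above; the object type, the optional() defaults, and the local names are assumptions for illustration, not the module's confirmed implementation.

```hcl
variable "application_name" {
  type = string
}

variable "observability_config" {
  description = "APM settings; null disables all observability env vars."
  type = object({
    agent_host_env_vars = optional(list(string), [])
    service_name_prefix = optional(string, "")
    service_env_var     = optional(string)
    version_env_var     = optional(string)
  })
  default = null
}

locals {
  # Empty list when observability is disabled, so a dynamic block
  # iterating over it renders nothing.
  agent_host_env_vars = var.observability_config != null ? var.observability_config.agent_host_env_vars : []

  # Hypothetical local: configurable prefix plus the application name,
  # or null when observability is disabled.
  observability_service_name = var.observability_config != null ? "${var.observability_config.service_name_prefix}${var.application_name}" : null
}
```

A dynamic "env" block in the container definition could then iterate over local.agent_host_env_vars, following the same pattern the module uses for host_aliases.
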
Before:

variable "node_pool" {
  default = "standard4" # Company-specific default
}

After:

variable "node_pool" {
  default = null # No default affinity
}

Benefits:
- Works with any Kubernetes cluster
- No assumptions about node pool names
- Affinity only applied when explicitly configured
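
The opt-in affinity could be expressed with a dynamic block, sketched here as a fragment that belongs inside the pod spec of the kubernetes_deployment resource. It assumes node_pool is a single string, and the label key is an assumption (it is the label GKE puts on node pool nodes); other clusters would need a different key.

```hcl
# Fragment: place inside spec.template.spec of the deployment.
dynamic "affinity" {
  # Zero iterations when node_pool is null, so no affinity is rendered.
  for_each = var.node_pool != null ? [var.node_pool] : []

  content {
    node_affinity {
      required_during_scheduling_ignored_during_execution {
        node_selector_term {
          match_expressions {
            key      = "cloud.google.com/gke-nodepool" # assumed label key
            operator = "In"
            values   = [affinity.value]
          }
        }
      }
    }
  }
}
```
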
Before:

- Required variables: project, gke_cluster_name, service_account_name
- Always created GCP service accounts and IAM bindings

After:

variable "project" {
  default = null # Optional
}

variable "gke_cluster_name" {
  default = null # Optional
}

locals {
  enable_workload_identity = var.project != null && var.gke_cluster_name != null
}

Benefits:
- Works with any Kubernetes cluster (EKS, AKS, on-prem, etc.)
- GCP features only activate when GCP variables are provided
- Falls back to basic Kubernetes service account when not using GKE
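
From the caller's side, the two modes might look like the sketch below. The module source path and all values are placeholders, and any other inputs the module requires are omitted.

```hcl
# Any cluster (EKS, AKS, on-prem): GCP variables are left unset,
# so only a basic Kubernetes service account is created.
module "api_generic" {
  source           = "../.." # hypothetical path to this module
  application_name = "api"
}

# GKE: setting both variables turns on workload identity.
module "api_gke" {
  source           = "../.." # hypothetical path to this module
  application_name = "api"
  project          = "my-gcp-project" # placeholder
  gke_cluster_name = "my-cluster"     # placeholder
}
```

Because enable_workload_identity combines the two checks with &&, setting only one of project or gke_cluster_name leaves workload identity disabled.
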
Before:

variable "roles" {
  default = ["roles/secretmanager.secretAccessor"] # GCP-specific default
}

After:

variable "roles" {
  default = [] # No default roles
}

Benefits:
- No assumptions about required permissions
- Users explicitly declare needed roles
- Cleaner for non-GCP users
Before:

annotations = {
  "cloud.google.com/neg" = "{\"ingress\": true}" # Always applied
}

After:

variable "enable_neg_annotation" {
  default = false
}

annotations = var.enable_neg_annotation ? {
  "cloud.google.com/neg" = "{\"ingress\": true}"
} : {}

Benefits:
- Only applied when explicitly enabled
- Doesn't break non-GKE deployments
- Clear opt-in behavior
Before (Bug):

match_labels = {
  "app.kubernetes.io/instance" = "var.application_name" # String literal, not variable
}

After (Fixed):

match_labels = {
  app = var.application_name # Actual variable reference
}

Before (Bug):

host_aliases {
  hostnames = var.host_alias.hostnames # Would fail if var.host_alias is null
  ip        = var.host_alias.ip
}

After (Fixed):

dynamic "host_aliases" {
  for_each = var.host_alias != null ? [var.host_alias] : []
  content {
    hostnames = host_aliases.value.hostnames
    ip        = host_aliases.value.ip
  }
}

Before:

variable "service_account_name" {
  type = string # Required
}

After:

variable "service_account_name" {
  type    = string
  default = null
}

locals {
  service_account_name = var.service_account_name != null ? var.service_account_name : "${var.application_name}-sa"
}

Benefits:
- One less required variable
- Sensible computed default
- Still overridable when needed
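
The same fallback can also be written with Terraform's built-in coalesce function. This is an alternative sketch, not what the module does, and it behaves slightly differently, as noted after the block.

```hcl
variable "application_name" {
  type = string
}

variable "service_account_name" {
  type    = string
  default = null
}

locals {
  # coalesce returns the first argument that is neither null nor "".
  service_account_name = coalesce(var.service_account_name, "${var.application_name}-sa")
}
```

Unlike the explicit null check, coalesce also skips empty strings, so passing service_account_name = "" would fall back to the computed name instead of producing an empty service account name.
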
Added conditional creation logic throughout:

module "deployment_workload_identity" {
  count = local.enable_workload_identity ? 1 : 0
  # ...
}

resource "kubernetes_service_account" "basic" {
  count = local.enable_workload_identity ? 0 : 1
  # ...
}

Benefits:
- Resources only created when needed
- Cleaner state files
- No unnecessary API calls
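
Outputs have to cope with whichever of the two resources has a count of zero. A hedged sketch of how outputs.tf might do that is shown below; the output names and the workload identity module's k8s_service_account_name and gcp_service_account_email outputs are assumptions, not confirmed by this summary.

```hcl
output "service_account_name" {
  description = "Name of the Kubernetes service account in use."

  # Exactly one of the two splat lists has an element, so the
  # concatenated list has length one and one() unwraps it.
  value = one(concat(
    module.deployment_workload_identity[*].k8s_service_account_name, # assumed output name
    kubernetes_service_account.basic[*].metadata[0].name,
  ))
}

output "gcp_service_account_email" {
  description = "GCP service account email, or null without workload identity."

  # one() returns null for an empty list.
  value = one(module.deployment_workload_identity[*].gcp_service_account_email) # assumed output name
}
```

Using one() rather than indexing with [0] also fails loudly if both resources were ever created at the same time.
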
Applied all the same improvements to modules/cronjobs:
- Provider version updates
- Optional observability configuration
- Optional workload identity
- Computed service account name
- Conditional resource creation
Added new GCP-specific optimization variables:
Existing GCP Service Account Support:

use_existing_gcp_sa   = true
existing_gcp_sa_email = "my-sa@project.iam.gserviceaccount.com"

Benefits:
- Reuse existing service accounts with pre-configured permissions
- Useful for shared service accounts across multiple workloads
- Reduces service account sprawl
Enhanced Security:

automount_service_account_token = false

Benefits:
- Prevents automatic mounting of service account tokens
- Enhanced security posture when using workload identity
- Follows principle of least privilege
Regional/Zonal Configuration:

gke_location = "us-central1" # or "us-central1-a" for zonal

Benefits:
- Properly configure workload identity for regional/zonal clusters
- Better alignment with GKE cluster configuration
- Required for some GCP APIs

- main.tf - Provider versions, computed locals
- variables.tf - New optional variables, removed defaults
- kubernetes_deployment.tf - Dynamic observability, optional node affinity, fixed bugs
- kubernetes_service.tf - Conditional NEG annotation
- kubernetes_service_account.tf - Conditional workload identity
- outputs.tf - Updated for conditional resources
- README.md - Comprehensive new documentation
- MIGRATION.md - Detailed migration guide

- modules/cronjobs/main.tf - Provider versions, computed locals
- modules/cronjobs/variables.tf - New optional variables
- modules/cronjobs/kubernetes-deployment.tf - Dynamic observability
- modules/cronjobs/kubernetes-service-account.tf - Conditional workload identity
- modules/cronjobs/outputs.tf - Updated for conditional resources

- observability_config - Must be explicitly set to enable Datadog or another APM tool
- node_pool - Now defaults to null instead of "standard4"
- roles - Now defaults to [] instead of ["roles/secretmanager.secretAccessor"]
- project/gke_cluster_name - Now optional (null by default)
- enable_neg_annotation - New variable, defaults to false
- node_pool - Type changes from string to list, which will allow the use of multiple node pools

To maintain v3.x behavior, users should add these to their module calls:
node_pool = "standard4"
roles = ["roles/secretmanager.secretAccessor"]
observability_config = {
  agent_host_env_vars = ["DD_AGENT_HOST", "DD_TRACE_AGENT_HOSTNAME"]
  service_name_prefix = "ddm-platform-"
  service_env_var     = "DD_SERVICE"
  version_env_var     = "DD_VERSION"
}
enable_neg_annotation = true

- Less Rigid: No longer assumes specific node pools or naming conventions
- More Reusable: Can be used across different projects and environments
- Better Defaults: Sensible defaults that don't assume company-specific infrastructure
- Cleaner Code: Fixed bugs and removed hardcoded values
- Universal: Works with any Kubernetes cluster, not just GKE
- Tool Agnostic: Supports any APM/observability tool
- Cloud Agnostic: GCP features are optional, not required
- Best Practices: Follows Terraform module best practices
- Latest Providers: Using current provider versions with latest features
- Bug Fixes: Fixed topology spread and host aliases bugs
- Better Documentation: Comprehensive README and migration guide
- Type Safety: Better use of optional types and computed values
- Test in Non-Production First: Validate changes in dev/staging
- Review terraform plan: Check for unexpected changes
- Test Different Configurations:
  - Basic Kubernetes (no GCP)
  - With Workload Identity
  - With observability
  - With node affinity
- Validate Outputs: Ensure outputs match expected values
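
The "Basic Kubernetes (no GCP)" case above could be checked with Terraform's native test framework, which needs Terraform 1.6 or newer (higher than the >= 1.3 the module itself requires). The sketch below is an assumption about how such a test might look: the file name and variable values are hypothetical, any other inputs the module requires would have to be added, and the plan may still need a configured Kubernetes provider.

```hcl
# tests/basic.tftest.hcl (hypothetical file name)

run "basic_kubernetes_without_gcp" {
  # Plan only, so nothing is created in a real cluster.
  command = plan

  variables {
    application_name = "demo" # placeholder value
    # project and gke_cluster_name are left unset (null).
  }

  assert {
    condition     = length(kubernetes_service_account.basic) == 1
    error_message = "Expected the basic Kubernetes service account when GCP variables are unset."
  }

  assert {
    condition     = length(module.deployment_workload_identity) == 0
    error_message = "Workload identity module should not be instantiated without GCP variables."
  }
}
```

A second run block that sets project and gke_cluster_name and asserts the opposite counts would cover the workload identity case, though planning it likely needs Google credentials or a mocked provider.
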
- Update any downstream modules or repos that use this module
- Update CI/CD pipelines if needed
- Update team documentation
- Consider creating example configurations for common use cases
- Plan rollout strategy for existing deployments
See MIGRATION.md for detailed migration instructions, or open an issue in the repository.