Remote executor on Azure AKS and ACA#37

Open
max-datahub wants to merge 2 commits into main from remote-executor-azure
Conversation


@max-datahub max-datahub commented Dec 11, 2025

Add Azure Remote Ingestion Executor Terraform Modules

Summary

This PR adds two new Terraform modules for deploying the DataHub Remote Ingestion Executor on Azure:

  • remote-ingestion-executor-aks - Deploy on Azure Kubernetes Service (AKS) using Helm
  • remote-ingestion-executor-aca - Deploy on Azure Container Apps (serverless)

These modules complement the existing AWS ECS module (remote-ingestion-executor), enabling customers to run remote executors in their Azure environments.

Changes

New Module: remote-ingestion-executor-aks

Deploys the Remote Executor on AKS using the bundled Helm chart.

Features:

  • Helm-based deployment with full configuration support
  • Azure Container Registry (ACR) integration
  • Azure Key Vault for secrets management
  • Private AKS cluster support for locked-down environments
  • Configurable resource limits, replicas, and worker counts
  • Pod Disruption Budget for high availability
  • Proxy and custom CA certificate support

New Module: remote-ingestion-executor-aca

Deploys the Remote Executor on Azure Container Apps as a serverless alternative.

Features:

  • Native ACA deployment (no Helm dependency)
  • Workload profile support for higher resources (4 CPU / 8Gi)
  • Managed Identity integration for ACR and Key Vault
  • VNet integration for locked-down environments
  • Simplified operations compared to AKS
  • Comprehensive troubleshooting documentation

Comparison: AKS vs ACA

| Aspect | AKS | ACA |
|---|---|---|
| Complexity | Higher (requires K8s cluster) | Lower (serverless) |
| Cost | Pay for cluster nodes | Pay per usage |
| Max Resources | Unlimited (node-dependent) | 4 CPU / 8Gi per container |
| Helm Support | ✅ Native | ❌ Not supported |
| Scaling | HPA/KEDA | Built-in KEDA |
| Best For | Production workloads, existing K8s | Simpler deployments, cost optimization |

Test Plan

  • AKS module: terraform plan validates successfully
  • AKS module: Helm chart templates render correctly
  • ACA module: terraform apply deploys successfully
  • ACA module: Container runs with 4 CPU / 8Gi on workload profile
  • ACA module: Executor connects to DataHub and processes ingestion tasks
  • Documentation reviewed for completeness

Files Changed

remote-ingestion-executor-aks/
├── .gitignore
├── README.md
├── main.tf
├── variables.tf
├── outputs.tf
├── versions.tf
├── helm-values.yaml.tpl
├── terraform.tfvars.example
└── datahub-executor-helm/          # Bundled Helm chart
    └── charts/datahub-executor-worker/

remote-ingestion-executor-aca/
├── .gitignore
├── README.md
├── main.tf
├── variables.tf
├── outputs.tf
├── versions.tf
├── locals.tf
└── example/main.tf

Usage Examples

AKS Deployment

```hcl
module "datahub_executor_aks" {
  source = "./remote-ingestion-executor-aks"

  resource_group_name = "my-rg"
  aks_cluster_name    = "my-aks"
  datahub_gms_url     = "https://company.acryl.io/gms"
  datahub_gms_token   = var.gms_token
  executor_pool_id    = "azure-executor-pool"
}
```

ACA Deployment

```hcl
module "datahub_executor_aca" {
  source = "./remote-ingestion-executor-aca"

  resource_group_name = "my-rg"
  location            = "eastus"
  datahub_gms_url     = "https://company.acryl.io/gms"
  datahub_gms_token   = var.gms_token
  executor_pool_id    = "azure-executor-pool"

  # For 4 CPU / 8Gi resources
  workload_profile_name = "dedicated"
  cpu                   = 4.0
  memory                = "8Gi"
}
```

Related Documentation



Note

Adds Terraform modules for deploying the DataHub Remote Ingestion Executor on Azure AKS (Helm-based) and ACA (serverless) with managed identity/ACR, Key Vault, VNet/proxy support, and usage docs.

  • New Terraform modules
    • remote-ingestion-executor-aks/: AKS deployment via bundled Helm chart
      • Azure AD Workload Identity to ACR (azurerm_user_assigned_identity, federated creds, AcrPull role)
      • Kubernetes secrets for GMS token and optional registry creds; configurable resources/replicas/workers
      • Optional custom transformers via ConfigMap; proxy env support; templated Helm values; extensive outputs/debug cmds
    • remote-ingestion-executor-aca/: Azure Container Apps deployment (no Helm)
      • Creates/uses Container App Environment, optional Log Analytics, workload profiles (up to 4 CPU/8Gi)
      • User-assigned managed identity; ACR pull role; optional Key Vault secret reference
      • VNet integration and internal load balancer; proxy/CA certs via Azure Files; health probes, scaling, extra envs
  • Documentation & examples: Comprehensive READMEs, configuration references, troubleshooting, and example main.tf/terraform.tfvars.
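The AKS identity wiring summarized above (user-assigned identity, federated credential, AcrPull role) can be sketched in Terraform roughly as follows. This is an illustrative sketch, not the module's actual code: the resource names and the `var.oidc_issuer_url`, `var.acr_id`, and service-account subject values are assumptions.

```hcl
# Illustrative sketch of Azure AD Workload Identity for the executor.
resource "azurerm_user_assigned_identity" "executor" {
  name                = "datahub-executor-identity"
  location            = var.location
  resource_group_name = var.resource_group_name
}

# Federate the identity with the AKS OIDC issuer so the executor's
# Kubernetes service account can exchange its token for Azure credentials.
resource "azurerm_federated_identity_credential" "executor" {
  name                = "datahub-executor-federated"
  resource_group_name = var.resource_group_name
  parent_id           = azurerm_user_assigned_identity.executor.id
  issuer              = var.oidc_issuer_url
  subject             = "system:serviceaccount:datahub:datahub-executor"
  audience            = ["api://AzureADTokenExchange"]
}

# Grant the identity pull access on the container registry.
resource "azurerm_role_assignment" "acr_pull" {
  scope                = var.acr_id
  role_definition_name = "AcrPull"
  principal_id         = azurerm_user_assigned_identity.executor.principal_id
}
```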

Written by Cursor Bugbot for commit e2a2930.

Add comprehensive Terraform module for deploying DataHub Remote Executor
on Azure Kubernetes Service (AKS) with support for locked-down environments.

Key Features:
- Azure AD Workload Identity support for keyless ACR authentication
- Image pull secrets fallback for standard authentication
- HTTP/HTTPS proxy configuration for restricted networks
- Complete Helm chart included (no external registry required)
- Comprehensive documentation with 3 network connectivity options
- Production-ready with security best practices

Module Components:
- main.tf: Azure provider, AKS integration, Kubernetes resources
- variables.tf: 30+ configurable variables with validation
- outputs.tf: Debugging commands and network requirements
- helm-values.yaml.tpl: Templated Helm values with proxy support
- README.md: 720 lines of detailed documentation
- datahub-executor-helm/: Local Helm chart (v0.0.26)

Network Connectivity Options:
1. VPN/Azure ExpressRoute to AWS (private connectivity)
2. AWS PrivateLink for SQS access (private endpoints)
3. HTTPS proxy/API gateway (controlled egress)

Security Features:
- No long-lived AWS credentials on executor
- Secrets never transit through SQS
- All communication outbound (no inbound ports)
- GMS handles AWS STS credential operations
- Support for Azure Key Vault integration

Tested and verified on:
- AKS cluster: maxtest-aks (Kubernetes 1.32.9)
- DataHub: test-environment.acryl.io
- Image: datahubacr.azurecr.io/datahub-executor:v0.3.15.3-acryl
- Executor pool: aks-executor-pool
- Status: Successfully deployed and operational

Documentation includes:
- Architecture overview with security model
- Prerequisites and requirements
- Network connectivity options for locked-down environments
- ACR setup with private endpoints
- Azure AD Workload Identity configuration
- Comprehensive troubleshooting guide
- Example configurations for different scenarios

Fixes credential refresh architecture clarification:
- Executors do NOT need direct AWS STS access
- GMS handles all sts:AssumeRole operations
- Executors only need access to DataHub GMS and AWS SQS

Adds a new Terraform module for deploying the DataHub Remote Ingestion
Executor on Azure Container Apps as an alternative to AKS.

Key features:
- Native ACA deployment without Helm dependency
- Workload profile support for 4 CPU / 8Gi resources
- Managed Identity integration for ACR and Key Vault
- VNet integration for locked-down environments
- Proxy and custom CA certificate support
- Comprehensive documentation with troubleshooting guide

@cursor cursor bot left a comment


This PR is being reviewed by Cursor Bugbot


for_each = var.datahub_access_token == "" && var.key_vault_id != "" ? [1] : []
content {
name = "datahub-gms-token"
key_vault_secret_id = "${var.key_vault_id}/secrets/${var.key_vault_secret_name}"

Bug: Key Vault secret ID uses wrong URL format

The key_vault_secret_id attribute is constructed by appending /secrets/ to var.key_vault_id, which is an ARM resource ID (like /subscriptions/.../Microsoft.KeyVault/vaults/vaultname). However, Azure Container Apps expects a Key Vault secret URI format like https://vaultname.vault.azure.net/secrets/secretname. This will cause deployments using Key Vault integration to fail with an invalid secret reference error.

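One possible shape of the fix, building the secret URI that Container Apps expects instead of appending to the ARM ID. This is a sketch only: the data-source name and the index-based parsing of `var.key_vault_id` are assumptions, not the module's actual code.

```hcl
# Sketch: resolve the vault's URI from its ARM resource ID, then build
# the https://<vault>.vault.azure.net/secrets/<name> form that ACA expects.
data "azurerm_key_vault" "this" {
  # An ARM ID looks like:
  # /subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.KeyVault/vaults/<name>
  name                = element(split("/", var.key_vault_id), 8)
  resource_group_name = element(split("/", var.key_vault_id), 4)
}

locals {
  # vault_uri already ends with a trailing slash.
  gms_token_secret_uri = "${data.azurerm_key_vault.this.vault_uri}secrets/${var.key_vault_secret_name}"
}
```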

content {
name = "ca-certs-volume"
storage_type = "AzureFile"
storage_name = var.azure_files_share_name

Bug: Volume storage_name references wrong value for Azure Files

The storage_name attribute in the volume block is set to var.azure_files_share_name (the Azure Files share name), but it should reference the name of the azurerm_container_app_environment_storage resource, which is "ca-certs-storage". Container Apps expects storage_name to match the name of a storage configured in the environment, not the underlying Azure Files share name. This will cause volume mount failures when using custom CA certificates.

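A sketch of the corrected wiring: the volume's `storage_name` should reference the environment storage resource, not the Azure Files share. Resource names here (`ca_certs`, the storage variables) are illustrative assumptions.

```hcl
# Sketch: register the Azure Files share as environment storage, then
# point the volume at that storage resource's name.
resource "azurerm_container_app_environment_storage" "ca_certs" {
  name                         = "ca-certs-storage"
  container_app_environment_id = azurerm_container_app_environment.this.id
  account_name                 = var.storage_account_name
  share_name                   = var.azure_files_share_name
  access_key                   = var.storage_account_key
  access_mode                  = "ReadOnly"
}

# In the container app's template, the volume should then use:
#   volume {
#     name         = "ca-certs-volume"
#     storage_type = "AzureFile"
#     storage_name = azurerm_container_app_environment_storage.ca_certs.name
#   }
```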

{{- end }}
{{- end }}
{{- if .Values.extraVolumes }}
{{ toYaml .Values.extraVolumes | nindent 8 }}

Bug: Helm template indentation mismatch causes invalid YAML

The nindent values for extraVolumes, extraInitContainers, and extraVolumeMounts are inconsistent with the indentation of hardcoded list items. For example, hardcoded volumes at line 66 use 10 spaces, but extraVolumes | nindent 8 outputs at 8 spaces. When both are configured, this creates YAML with inconsistent list item indentation, causing template rendering failures or malformed Kubernetes manifests.

Additional Locations (2)

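Assuming the hardcoded list items sit at 10 spaces as the report states, the fix is to make the `nindent` value match that indentation, for example:

```yaml
# Helm template sketch: align templated extras with the hardcoded
# 10-space list items (nindent 10 instead of nindent 8).
{{- if .Values.extraVolumes }}
{{ toYaml .Values.extraVolumes | nindent 10 }}
{{- end }}
```

The same adjustment would apply at the `extraInitContainers` and `extraVolumeMounts` sites flagged as additional locations.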
