Add comprehensive Terraform module for deploying DataHub Remote Executor on Azure Kubernetes Service (AKS) with support for locked-down environments.

Key Features:
- Azure AD Workload Identity support for keyless ACR authentication
- Image pull secrets fallback for standard authentication
- HTTP/HTTPS proxy configuration for restricted networks
- Complete Helm chart included (no external registry required)
- Comprehensive documentation with 3 network connectivity options
- Production-ready with security best practices

Module Components:
- main.tf: Azure provider, AKS integration, Kubernetes resources
- variables.tf: 30+ configurable variables with validation
- outputs.tf: Debugging commands and network requirements
- helm-values.yaml.tpl: Templated Helm values with proxy support
- README.md: 720 lines of detailed documentation
- datahub-executor-helm/: Local Helm chart (v0.0.26)

Network Connectivity Options:
1. VPN/Azure ExpressRoute to AWS (private connectivity)
2. AWS PrivateLink for SQS access (private endpoints)
3. HTTPS proxy/API gateway (controlled egress)

Security Features:
- No long-lived AWS credentials on executor
- Secrets never transit through SQS
- All communication outbound (no inbound ports)
- GMS handles AWS STS credential operations
- Support for Azure Key Vault integration

Tested and verified on:
- AKS cluster: maxtest-aks (Kubernetes 1.32.9)
- DataHub: test-environment.acryl.io
- Image: datahubacr.azurecr.io/datahub-executor:v0.3.15.3-acryl
- Executor pool: aks-executor-pool
- Status: Successfully deployed and operational

Documentation includes:
- Architecture overview with security model
- Prerequisites and requirements
- Network connectivity options for locked-down environments
- ACR setup with private endpoints
- Azure AD Workload Identity configuration
- Comprehensive troubleshooting guide
- Example configurations for different scenarios

Fixes credential refresh architecture clarification:
- Executors do NOT need direct AWS STS access
- GMS handles all sts:AssumeRole operations
- Executors only need access to DataHub GMS and AWS SQS
Adds a new Terraform module for deploying the DataHub Remote Ingestion Executor on Azure Container Apps as an alternative to AKS.

Key features:
- Native ACA deployment without Helm dependency
- Workload profile support for 4 CPU / 8Gi resources
- Managed Identity integration for ACR and Key Vault
- VNet integration for locked-down environments
- Proxy and custom CA certificate support
- Comprehensive documentation with troubleshooting guide
This PR is being reviewed by Cursor Bugbot
    for_each = var.datahub_access_token == "" && var.key_vault_id != "" ? [1] : []
    content {
      name = "datahub-gms-token"
      key_vault_secret_id = "${var.key_vault_id}/secrets/${var.key_vault_secret_name}"
Bug: Key Vault secret ID uses wrong URL format
The key_vault_secret_id attribute is constructed by appending /secrets/ to var.key_vault_id, which is an ARM resource ID (like /subscriptions/.../Microsoft.KeyVault/vaults/vaultname). However, Azure Container Apps expects a Key Vault secret URI format like https://vaultname.vault.azure.net/secrets/secretname. This will cause deployments using Key Vault integration to fail with an invalid secret reference error.
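One possible fix, as a minimal sketch rather than the module's actual code: look the secret up through the `azurerm_key_vault_secret` data source, which accepts the vault's ARM resource ID and exports a versionless secret URI. The data source label `gms_token` and the local `gms_token_secret_uri` are hypothetical names introduced here; the variables are the ones in the reviewed snippet. This assumes the identity running Terraform is allowed to read the secret, and the secret value will be recorded in state as a sensitive attribute.

```hcl
# Hypothetical label "gms_token"; var.* names are taken from the reviewed snippet.
data "azurerm_key_vault_secret" "gms_token" {
  # Same condition as the dynamic secret block, so the lookup only runs
  # when the Key Vault path is selected.
  count        = var.datahub_access_token == "" && var.key_vault_id != "" ? 1 : 0
  name         = var.key_vault_secret_name
  key_vault_id = var.key_vault_id # this argument expects the ARM resource ID
}

locals {
  # Versionless secret URI of the form
  # https://<vault>.vault.azure.net/secrets/<name>, or null when unused.
  gms_token_secret_uri = one(data.azurerm_key_vault_secret.gms_token[*].versionless_id)
}

# Inside the secret block's content, the attribute would then read:
#   key_vault_secret_id = local.gms_token_secret_uri
```

If reading the secret at plan time is not acceptable, an alternative is to have callers pass the vault URI (or the full secret URI) as a separate input instead of deriving it from the ARM ID.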
    content {
      name = "ca-certs-volume"
      storage_type = "AzureFile"
      storage_name = var.azure_files_share_name
Bug: Volume storage_name references wrong value for Azure Files
The storage_name attribute in the volume block is set to var.azure_files_share_name (the Azure Files share name), but it should reference the name of the azurerm_container_app_environment_storage resource, which is "ca-certs-storage". Container Apps expects storage_name to match the name of a storage configured in the environment, not the underlying Azure Files share name. This will cause volume mount failures when using custom CA certificates.
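A sketch of the suggested change, under assumptions: the resource labels `ca_certs` and `this`, and the variables `storage_account_name` and `storage_account_access_key`, are hypothetical stand-ins for whatever the module actually declares. Only the storage name "ca-certs-storage" and `var.azure_files_share_name` come from the review comment.

```hcl
# Environment-level storage definition. The volume must refer to this
# resource's name, not to the Azure Files share behind it.
resource "azurerm_container_app_environment_storage" "ca_certs" {
  name                         = "ca-certs-storage"
  container_app_environment_id = azurerm_container_app_environment.this.id # hypothetical label
  account_name                 = var.storage_account_name                  # hypothetical variable
  share_name                   = var.azure_files_share_name
  access_key                   = var.storage_account_access_key            # hypothetical variable
  access_mode                  = "ReadOnly"
}

# In the container app's template, the volume content would then become:
#   content {
#     name         = "ca-certs-volume"
#     storage_type = "AzureFile"
#     storage_name = azurerm_container_app_environment_storage.ca_certs.name
#   }
```

Referencing the resource attribute rather than repeating the literal string also gives Terraform an implicit dependency, so the storage is created before the container app that mounts it.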
    {{- end }}
    {{- end }}
    {{- if .Values.extraVolumes }}
    {{ toYaml .Values.extraVolumes | nindent 8 }}
Bug: Helm template indentation mismatch causes invalid YAML
The nindent values for extraVolumes, extraInitContainers, and extraVolumeMounts are inconsistent with the indentation of hardcoded list items. For example, hardcoded volumes at line 66 use 10 spaces, but extraVolumes | nindent 8 outputs at 8 spaces. When both are configured, this creates YAML with inconsistent list item indentation, causing template rendering failures or malformed Kubernetes manifests.
Add Azure Remote Ingestion Executor Terraform Modules
Summary
This PR adds two new Terraform modules for deploying the DataHub Remote Ingestion Executor on Azure:
- remote-ingestion-executor-aks - Deploy on Azure Kubernetes Service (AKS) using Helm
- remote-ingestion-executor-aca - Deploy on Azure Container Apps (serverless)

These modules complement the existing AWS ECS module (remote-ingestion-executor), enabling customers to run remote executors in their Azure environments.

Changes
New Module: remote-ingestion-executor-aks
Deploys the Remote Executor on AKS using the bundled Helm chart.
Features:
New Module: remote-ingestion-executor-aca
Deploys the Remote Executor on Azure Container Apps as a serverless alternative.
Features:
Comparison: AKS vs ACA
Test Plan
- terraform plan validates successfully
- terraform apply deploys successfully

Files Changed
Usage Examples
AKS Deployment
ACA Deployment
Related Documentation
Note
Adds Terraform modules for deploying the DataHub Remote Ingestion Executor on Azure AKS (Helm-based) and ACA (serverless) with managed identity/ACR, Key Vault, VNet/proxy support, and usage docs.
- remote-ingestion-executor-aks/: AKS deployment via bundled Helm chart
  - azurerm_user_assigned_identity, federated creds, AcrPull role
- remote-ingestion-executor-aca/: Azure Container Apps deployment (no Helm)
  - main.tf/terraform.tfvars.

Written by Cursor Bugbot for commit e2a2930.