This Terraform configuration deploys an Amazon EKS cluster optimized for StarRocks and ClickHouse workloads in the AWS China regions (cn-north-1 or cn-northwest-1). The deployment includes Karpenter for auto-scaling, a Prometheus/Grafana monitoring stack, the StarRocks operator, and the ClickHouse operator.
- EKS Cluster: Kubernetes 1.33 with managed node groups (m6i.2xlarge x4)
- Karpenter: Auto-scaling with Graviton compute node pools
- StarRocks Operator: Custom operator for managing StarRocks clusters
- ClickHouse Operator: Altinity ClickHouse operator for managing ClickHouse clusters
- Monitoring: Kube-Prometheus stack with Grafana dashboards
- Storage: EBS CSI driver with GP3 encrypted storage
- Load Balancing: AWS Load Balancer Controller
- Region: Configured for cn-north-1/cn-northwest-1
- ECR Images: Uses public ECR images accessible from China
- Registry IDs: Correct account IDs for China regions
- aws-cloudwatch-metrics
- cluster-proportional-autoscaler
- cluster-autoscaler-aws-cluster-autoscaler
- aws-for-fluent-bit
- kubecost
- eks-pod-identity-agent
- Karpenter: Auto-scaling with Graviton compute node pools
- Metrics Server: Custom image from public.ecr.aws/bitnami/metrics-server:0.8.0
- EBS CSI Driver: For persistent storage
- AWS Load Balancer Controller: For ingress and load balancing
- StarRocks Operator: v1.10.2 from public.ecr.aws/dong-registry with embedded CRD
- ClickHouse Operator: v0.25.2 from public.ecr.aws/altinity with full CRD support
- Instance Type: m6i.2xlarge (8 vCPU, 32 GiB RAM)
- Node Count: 4 nodes (min: 4, max: 8, desired: 4)
- Storage: 100GB GP3 root volumes
- Graviton Compute: ARM64 instances (c6g, c7g, m6g, m7g, r6g, r7g families)
- Auto-scaling: Spot and On-Demand instances
- Instance Store: RAID0 configuration for high performance
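For reference, the pool described above corresponds roughly to a Karpenter NodePool like the following. This is a hand-written sketch using the Karpenter v1 `NodePool` API; the authoritative definition is generated by the Helm chart in `addons.tf`, and the resource names here are assumptions:

```yaml
# Sketch only -- the real NodePool is rendered by the karpenter-resources Helm config
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: graviton-compute
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: graviton-compute   # assumed EC2NodeClass name
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64"]
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["c6g", "c7g", "m6g", "m7g", "r6g", "r7g"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
  limits:
    cpu: "256"   # illustrative limit
```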
- AWS CLI: Configured with China region credentials
- Terraform: Version >= 1.3.2
- kubectl: For cluster management
- Permissions: EKS, VPC, IAM, and ECR permissions
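Before deploying, it is worth confirming the prerequisites above, for example:

```shell
# Verify credentials resolve to the expected China-region account
aws sts get-caller-identity

# Verify tool versions
terraform version          # should report >= 1.3.2
kubectl version --client
```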
```bash
cp terraform.tfvars.example terraform.tfvars
```

Edit `terraform.tfvars` with your specific configuration:

```hcl
# Basic Configuration
name   = "starrocks-eks-karpenter"
region = "cn-north-1" # or cn-northwest-1

# EKS Configuration
eks_cluster_version = "1.33"

# StarRocks Configuration
starrocks_namespace      = "starrocks"
starrocks_operator_image = "public.ecr.aws/dong-registry/starrocks-operator:v1.10.2"

# ClickHouse Configuration
enable_clickhouse_operator        = true
clickhouse_namespace              = "clickhouse"
clickhouse_operator_image         = "public.ecr.aws/altinity/clickhouse-operator:0.25.2"
clickhouse_metrics_exporter_image = "public.ecr.aws/altinity/metrics-exporter:0.25.2"

# Tags
tags = {
  Environment = "dev"
  Project     = "starrocks-eks"
  Owner       = "platform-team"
}
```

```bash
# Initialize Terraform
terraform init

# Plan the deployment
terraform plan

# Apply the configuration
terraform apply
```

```bash
# Update kubeconfig (use the output from terraform apply)
aws eks --region cn-north-1 update-kubeconfig --name starrocks-eks-karpenter
```

```bash
# Check cluster status
kubectl get nodes

# Check StarRocks operator
kubectl get deployment kube-starrocks-operator -n starrocks

# Check ClickHouse operator
kubectl get deployment clickhouse-operator -n kube-system

# Check Karpenter
kubectl get deployment karpenter -n karpenter

# Check monitoring stack
kubectl get pods -n monitoring

# Check all namespaces
kubectl get pods --all-namespaces
```

After the infrastructure is deployed, you can deploy StarRocks clusters using the operator:
```bash
# Create a StarRocks cluster (example)
cat <<EOF | kubectl apply -f -
apiVersion: starrocks.com/v1
kind: StarRocksCluster
metadata:
  name: starrocks-cluster
  namespace: starrocks
spec:
  starRocksFeSpec:
    image: starrocks/fe-ubuntu:3.2-latest
    replicas: 1
    requests:
      cpu: "1"
      memory: "2Gi"
    limits:
      cpu: "2"
      memory: "4Gi"
  starRocksBeSpec:
    image: starrocks/be-ubuntu:3.2-latest
    replicas: 3
    requests:
      cpu: "2"
      memory: "4Gi"
    limits:
      cpu: "4"
      memory: "8Gi"
    storageVolumes:
      - name: be-storage
        storageClassName: gp3
        storageSize: 100Gi
EOF
```

After the infrastructure is deployed, you can deploy ClickHouse clusters using the operator with Karpenter node pools:
```bash
# Create a ClickHouse cluster using Karpenter Graviton nodes
cat <<EOF | kubectl apply -f -
apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
  name: clickhouse-cluster
  namespace: clickhouse
spec:
  useTemplates:
    - name: karpenter-graviton-pod-template
    - name: karpenter-storage-template-10Gi
  configuration:
    clusters:
      - name: "cluster"
        layout:
          shardsCount: 1
          replicasCount: 1
    users:
      admin/password: admin123
      admin/networks/ip:
        - "0.0.0.0/0"
  templates:
    podTemplates:
      - name: clickhouse-karpenter-pod-template
        spec:
          nodeSelector:
            type: karpenter
            provisioner: graviton-compute
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:23.8
              resources:
                requests:
                  memory: "2Gi"
                  cpu: "1000m"
                limits:
                  memory: "8Gi"
                  cpu: "4000m"
    volumeClaimTemplates:
      - name: data-volume-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 20Gi
          storageClassName: gp3
  defaults:
    templates:
      podTemplate: clickhouse-karpenter-pod-template
      dataVolumeClaimTemplate: data-volume-template
EOF
```

You can also use the pre-built Karpenter templates:
```bash
# Create a ClickHouse cluster using pre-built Karpenter templates
cat <<EOF | kubectl apply -f -
apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
  name: clickhouse-cluster-simple
  namespace: clickhouse
spec:
  useTemplates:
    - name: karpenter-graviton-pod-template
    - name: karpenter-storage-template-10Gi
  configuration:
    clusters:
      - name: "cluster"
        layout:
          shardsCount: 1
          replicasCount: 1
    users:
      admin/password: admin123
      admin/networks/ip:
        - "0.0.0.0/0"
EOF
```

```
terraform-starrocks-eks/
├── main.tf                  # Main Terraform configuration
├── variables.tf             # Variable definitions
├── versions.tf              # Provider versions
├── vpc.tf                   # VPC configuration
├── eks.tf                   # EKS cluster configuration
├── addons.tf                # Kubernetes addons and StarRocks operator
├── clickhouse-operator.tf   # ClickHouse operator configuration
├── outputs.tf               # Output values
├── terraform.tfvars.example # Example variable values
└── README.md                # This file
```
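Once the example clusters above are running, a quick connectivity check helps confirm they are healthy. The service and pod names below follow the operators' usual naming conventions (`<cluster>-fe-service` for StarRocks, `chi-<name>-<cluster>-<shard>-<replica>-0` for Altinity pods); they are assumptions, so verify them with `kubectl get svc,pods` first:

```shell
# StarRocks: port-forward the FE MySQL port (9030) and run a query
kubectl port-forward -n starrocks svc/starrocks-cluster-fe-service 9030:9030 &
mysql -h 127.0.0.1 -P 9030 -u root -e "SHOW BACKENDS;"

# ClickHouse: run clickhouse-client inside the first replica pod
kubectl exec -n clickhouse chi-clickhouse-cluster-simple-cluster-0-0-0 -- \
  clickhouse-client --user admin --password admin123 -q "SELECT version()"
```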
The Graviton compute node pool can be customized in `addons.tf`:

```hcl
# Modify instance families, sizes, or limits
karpenter_resources_helm_config = {
  graviton-compute = {
    values = [
      <<-EOT
      # Customize instance types, limits, etc.
      EOT
    ]
  }
}
```

Modify the operator deployment in `addons.tf`:
```hcl
# Update image, resources, or security settings
resource "kubernetes_deployment" "starrocks_operator" {
  # Configuration here
}
```

The monitoring stack (kube-prometheus-stack) is currently disabled. If monitoring is needed, enable it by setting `enable_kube_prometheus_stack = true` in the EKS Blueprints addons configuration.
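If you do enable the stack, Grafana can be reached with a port-forward. The service name below assumes the kube-prometheus-stack chart defaults in the `monitoring` namespace:

```shell
# Forward Grafana's HTTP port to localhost:3000
kubectl port-forward -n monitoring svc/kube-prometheus-stack-grafana 3000:80

# Then open http://localhost:3000; the Grafana admin password is stored
# in AWS Secrets Manager (see the security notes below)
```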
```bash
# Test ECR public access
docker pull public.ecr.aws/dong-registry/starrocks-operator:v1.10.2

# Check Karpenter logs
kubectl logs -n karpenter deployment/karpenter

# Check node pools
kubectl get nodepools
kubectl get ec2nodeclasses

# Check StarRocks operator logs
kubectl logs -n starrocks deployment/kube-starrocks-operator

# Check RBAC permissions
kubectl auth can-i create starrocksclusters --as=system:serviceaccount:starrocks:starrocks

# Check ClickHouse operator logs
kubectl logs -n kube-system deployment/clickhouse-operator

# Check ClickHouse installations
kubectl get chi -A

# Check ClickHouse operator status
kubectl get pods -n kube-system -l app=clickhouse-operator

# Check Prometheus operator
kubectl get pods -n monitoring | grep prometheus-operator

# Check monitoring CRDs
kubectl get crd | grep monitoring
```

```bash
# Cluster status
kubectl get nodes -o wide
kubectl top nodes

# StarRocks status
kubectl get starrocksclusters -n starrocks
kubectl get pods -n starrocks

# ClickHouse status
kubectl get chi -n clickhouse
kubectl get pods -n clickhouse

# Karpenter status
kubectl get nodepools
kubectl describe nodepool graviton-compute

# Monitoring status
kubectl get pods -n monitoring
kubectl get svc -n monitoring
```

To remove all resources:
```bash
# Delete StarRocks clusters first
kubectl delete starrocksclusters --all -n starrocks

# Delete ClickHouse clusters first
kubectl delete chi --all -n clickhouse

# Wait for cleanup
kubectl get pods -n starrocks --watch
kubectl get pods -n clickhouse --watch

# Destroy Terraform resources
terraform destroy
```

- Node Security: All containers run as non-root with read-only filesystems
- RBAC: Minimal required permissions for each component
- Network: Private subnets with NAT gateway for outbound access
- Storage: Encrypted EBS volumes with GP3 performance
- Secrets: Grafana password stored in AWS Secrets Manager
- Instance Types: m6i.2xlarge for consistent performance
- Storage: GP3 with encryption for optimal I/O
- Networking: Secondary CIDR blocks for pod networking
- Auto-scaling: Karpenter with Graviton instances for cost optimization
For issues related to:
- Terraform: Check `terraform plan` output and state files
- EKS: Verify cluster health and node group status
- StarRocks: Check operator logs and cluster status
- Monitoring: Verify the Prometheus operator and its CRDs
- Terraform: >= 1.3.2
- EKS: 1.33
- Kubernetes: 1.33
- StarRocks Operator: v1.10.2
- Karpenter: 1.2.1
- Metrics Server: 0.8.0