This directory contains the Terraform infrastructure code for the BidderGod C2C auction platform deployed on AWS ECS Fargate with a complete microservices architecture.
Internet → Kong API Gateway (ECS) → Microservices (ECS + Service Discovery)
→ Infrastructure Services (Kafka, DBs on ECS + EFS)
- VPC: Custom VPC with public subnets in 1 availability zone (cost-optimized)
- ECS Fargate Spot: Serverless containers with ~70% cost savings
- Kong API Gateway: Single entry point with load balancing, rate limiting, and authentication
- AWS Cloud Map: Private DNS service discovery (biddergod-dev.local)
- EFS: Persistent storage for stateful services (databases, Kafka)
- ECR: Private container registries for 17 microservices
- CloudWatch: Centralized logging (3-day retention)
- No ALB: Direct public IP access via Kong (saves $16/month)
- No NAT Gateway: Public subnets only (saves ~$32/month)
| Service | Port | Technology | Purpose |
|---|---|---|---|
| user-service | 8080 | Spring Boot | User management, authentication |
| auction-service | 4000 | Node.js/Express | Auction lifecycle management |
| bid-command | 8082 | Go/Gin | Write operations for bidding (CQRS) |
| bid-query | 8083 | Go/Gin | Read operations for bid history (CQRS) |
| auction-projector | 8084 | Go | Kafka consumer for auction events |
| bid-projector | 8085 | Go | Kafka consumer for bid events |
| payment-service | 3000 | NestJS | Stripe payment processing |
| sse-stream-service | 8086 | Node.js | Real-time event streaming to frontend |
| Service | Port | Technology | Purpose |
|---|---|---|---|
| kafka | 9092 | Confluent Kafka 7.9.4 | Event streaming (KRaft mode) |
| postgres | 5432 | PostgreSQL 18 | User, auction, and bidding data |
| mysql | 3306 | MySQL 8.0 | Payment service data |
| mongo | 27017 | MongoDB | Bid query read model (CQRS) |
| redis | 6379 | Redis 8 | Auction metadata cache |
| Service | Port | Technology | Purpose |
|---|---|---|---|
| kong | 8000 | Kong Gateway | API gateway with LB and rate limiting |
| prometheus | 9090 | Prometheus | Metrics collection |
| grafana | 3001 | Grafana | Metrics visualization |
| kafka-ui | 8080 | Provectus | Kafka management UI |
External Traffic:
Internet → Kong (public IP) → Backend services (via Cloud Map)
Internal Traffic:
Services use AWS Cloud Map DNS:
- kafka.biddergod-dev.local:9092
- postgres.biddergod-dev.local:5432
- redis.biddergod-dev.local:6379
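The naming pattern above (service name, then the namespace, then the port) can be captured in a small helper for scripts that need these addresses. This is a minimal sketch: the function name `service_endpoint` and the `NAMESPACE` variable are illustrative and not part of the repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build a Cloud Map endpoint from a service name and port.
# NAMESPACE defaults to the private DNS namespace listed above.
NAMESPACE="${NAMESPACE:-biddergod-dev.local}"

service_endpoint() {
  local service="$1" port="$2"
  printf '%s.%s:%s\n' "$service" "$NAMESPACE" "$port"
}

# Example: derive the Kafka bootstrap address for a script.
# KAFKA_BOOTSTRAP="$(service_endpoint kafka 9092)"
```

These names only resolve from inside the VPC, so the helper is useful for task environment variables and in-VPC scripts, not for access from a laptop.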
Data Persistence:
- Databases and Kafka use EFS volumes
- Data survives task restarts
- Automatic EFS backups via AWS Backup (optional)
- AWS CLI with configured credentials
  aws configure
- Terraform >= 1.10 (required for S3 native state locking)
  terraform --version
- Docker for building images
- S3 bucket for Terraform state (see setup below)
- Stripe API key (for payment service)
  export TF_VAR_stripe_secret_key="sk_test_..."
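Before starting, it may help to confirm that the required command-line tools are installed. The sketch below is one way to do that; `missing_tools` is a hypothetical helper name, not something shipped with this project.

```bash
#!/usr/bin/env bash
# Print every tool from the argument list that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example: abort early if a prerequisite is missing.
# missing="$(missing_tools aws terraform docker)"
# [ -z "$missing" ] || { echo "Missing tools: $missing" >&2; exit 1; }
```

The check only verifies that the binaries exist; it does not validate AWS credentials or tool versions.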
# Create S3 bucket for Terraform state
aws s3api create-bucket \
--bucket biddergod-terraform-state \
--region ap-southeast-1 \
--create-bucket-configuration LocationConstraint=ap-southeast-1
# Enable versioning
aws s3api put-bucket-versioning \
--bucket biddergod-terraform-state \
--versioning-configuration Status=Enabled
# Enable encryption
aws s3api put-bucket-encryption \
--bucket biddergod-terraform-state \
--server-side-encryption-configuration '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'
# Block public access
aws s3api put-public-access-block \
--bucket biddergod-terraform-state \
--public-access-block-configuration "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true"
cd infrastructure
# Initialize Terraform
terraform init
# Review configuration
terraform plan
# Deploy (creates 17 ECS services, VPC, EFS, etc.)
terraform apply
# Get outputs
terraform output
Note: Initial deployment will fail because Docker images don't exist yet. ECS tasks will stay in PENDING state until you push images in step 3.
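The state bucket created during setup is only used if the S3 backend points at it. The sketch below shows one way to pass the backend settings at init time, assuming the Terraform configuration declares a `backend "s3"` block that leaves these values unset and that Terraform 1.10 or later is installed (the `use_lockfile` setting is not available in earlier versions). The object key `infrastructure/terraform.tfstate` is a hypothetical placeholder, and `tf_init_s3` is an illustrative function name.

```bash
#!/usr/bin/env bash
# Sketch: initialise Terraform against the state bucket with S3 native locking.
# STATE_KEY is a hypothetical object key; adjust it to the real state layout.
STATE_BUCKET="biddergod-terraform-state"
STATE_KEY="infrastructure/terraform.tfstate"
STATE_REGION="ap-southeast-1"

tf_init_s3() {
  terraform init \
    -backend-config="bucket=${STATE_BUCKET}" \
    -backend-config="key=${STATE_KEY}" \
    -backend-config="region=${STATE_REGION}" \
    -backend-config="encrypt=true" \
    -backend-config="use_lockfile=true"
}

# Example (run from the infrastructure directory):
# tf_init_s3
```

If the backend block in the repository already hard-codes these values, a plain `terraform init` is sufficient and this helper is unnecessary.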
You need to build and push 17 images (excluding frontend which uses AWS Amplify):
# Get AWS account ID
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export AWS_REGION="ap-southeast-1"
# Authenticate Docker to ECR
aws ecr get-login-password --region $AWS_REGION | \
docker login --username AWS --password-stdin ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com
# Get ECR URLs from Terraform outputs
cd infrastructure
export USER_ECR=$(terraform output -raw user_service_ecr_url)
export AUCTION_ECR=$(terraform output -raw auction_service_ecr_url)
export BID_CMD_ECR=$(terraform output -raw bid_command_ecr_url)
export BID_QRY_ECR=$(terraform output -raw bid_query_ecr_url)
export AUC_PROJ_ECR=$(terraform output -raw auction_projector_ecr_url)
export BID_PROJ_ECR=$(terraform output -raw bid_projector_ecr_url)
export PAYMENT_ECR=$(terraform output -raw payment_service_ecr_url)
export SSE_ECR=$(terraform output -raw sse_stream_service_ecr_url)
export KONG_ECR=$(terraform output -raw kong_ecr_url)
cd ..
# Build and push each service
# 1. User Service
cd user-service
docker build -t user-service .
docker tag user-service:latest $USER_ECR:latest
docker push $USER_ECR:latest
cd ..
# 2. Auction Service
cd auction-service/src
docker build -t auction-service .
docker tag auction-service:latest $AUCTION_ECR:latest
docker push $AUCTION_ECR:latest
cd ../..
# 3-6. Bidding Services (4 services from one codebase)
cd bidding-service
docker build -f services/bid-command/Dockerfile -t bid-command .
docker tag bid-command:latest $BID_CMD_ECR:latest
docker push $BID_CMD_ECR:latest
docker build -f services/bid-query/Dockerfile -t bid-query .
docker tag bid-query:latest $BID_QRY_ECR:latest
docker push $BID_QRY_ECR:latest
docker build -f services/auction-projector/Dockerfile -t auction-projector .
docker tag auction-projector:latest $AUC_PROJ_ECR:latest
docker push $AUC_PROJ_ECR:latest
docker build -f services/bid-projector/Dockerfile -t bid-projector .
docker tag bid-projector:latest $BID_PROJ_ECR:latest
docker push $BID_PROJ_ECR:latest
cd ..
# 7. Payment Service
cd payment-service
docker build -t payment-service .
docker tag payment-service:latest $PAYMENT_ECR:latest
docker push $PAYMENT_ECR:latest
cd ..
# 8. SSE Stream Service
cd sse-stream-service
docker build -t sse-stream-service .
docker tag sse-stream-service:latest $SSE_ECR:latest
docker push $SSE_ECR:latest
cd ..
# 9. Kong API Gateway
cd docker-compose/config # Assuming Kong config is here
docker build -t kong .
docker tag kong:latest $KONG_ECR:latest
docker push $KONG_ECR:latest
cd ../..
# Check ECS service status
aws ecs list-services --cluster biddergod-dev-cluster --region ap-southeast-1
# Check running tasks
aws ecs describe-services \
--cluster biddergod-dev-cluster \
--services biddergod-dev-user-service \
--region ap-southeast-1 \
--query 'services[0].{Running:runningCount,Desired:desiredCount,Status:status}'
# View logs
aws logs tail /ecs/biddergod-dev-user-service --follow --region ap-southeast-1
# Get Kong public IP
aws ecs describe-tasks \
--cluster biddergod-dev-cluster \
--tasks $(aws ecs list-tasks --cluster biddergod-dev-cluster --service-name biddergod-dev-kong --query 'taskArns[0]' --output text) \
--query 'tasks[0].attachments[0].details[?name==`networkInterfaceId`].value' \
--output text | xargs -I {} aws ec2 describe-network-interfaces --network-interface-ids {} --query 'NetworkInterfaces[0].Association.PublicIp' --output text
# Test via Kong
curl http://<kong-public-ip>:8000/health
ECS Fargate Spot Costs:
- 17 services @ 256 CPU, 512-1024 MB each
- Kafka (512 CPU, 1024 MB) - highest resource
- Estimated: ~$0.30-0.40/hour = $216-288/month
Additional AWS Costs:
- EFS storage (5-10 GB): $1.50-3/month
- ECR storage (10 GB): $1/month
- CloudWatch Logs (3-day retention): $5-10/month
- Data transfer: $5-10/month
Total Estimated Cost: $228-312/month
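The figures above follow from simple arithmetic: the hourly rate times 24 hours times 30 days, plus the fixed monthly items. The sketch below reproduces that calculation; the function names are illustrative, and the inputs are the estimates quoted above rather than measured billing data.

```bash
#!/usr/bin/env bash
# Convert an hourly rate (USD) into a 30-day monthly figure.
monthly_cost() {
  LC_ALL=C awk -v rate="$1" 'BEGIN { printf "%.0f\n", rate * 24 * 30 }'
}

# Add up any number of monthly line items (USD).
sum_costs() {
  LC_ALL=C awk 'BEGIN { s = 0; for (i = 1; i < ARGC; i++) s += ARGV[i]; printf "%.1f\n", s }' "$@"
}

# Low end:  compute + EFS + ECR + logs + data transfer
# sum_costs "$(monthly_cost 0.30)" 1.50 1 5 5
# High end:
# sum_costs "$(monthly_cost 0.40)" 3 1 10 10
```

The low end works out to 228.5 and the high end to 312.0, which matches the rounded $228-312/month range.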
Cost Savings vs ALB Architecture:
- No ALB: Saves $16/month
- No NAT Gateway: Saves $32/month
- Fargate Spot: Saves ~70% vs on-demand
- Single AZ: Reduces cross-AZ data transfer
To Minimize Costs:
# Stop all services when not in use
aws ecs update-service --cluster biddergod-dev-cluster --service biddergod-dev-<service-name> --desired-count 0 --region ap-southeast-1
# Or destroy everything
terraform destroy
VPC Configuration:
- CIDR: 10.0.0.0/16
- Single AZ: ap-southeast-1a
- Public subnets only (no private subnets)
- Internet Gateway for ECR/internet access
- NO NAT Gateway (cost optimization)
Security Groups:
- ECS Tasks SG: Allows self-referencing (all ports) + specific ports from internet (80, 443, 8000 for Kong)
- Kong: Exposed to internet on ports 80, 443, 8000, 8001
- Other services: Only accessible internally via Cloud Map DNS
AWS Cloud Map provides DNS-based service discovery:
- Namespace: biddergod-dev.local
- Example: kafka.biddergod-dev.local, postgres.biddergod-dev.local
- TTL: 10 seconds
- Health checks: ECS task health status
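Because the DNS names only resolve inside the VPC, it can be useful to query Cloud Map through the AWS API to see which task IPs are currently registered. The sketch below uses `aws servicediscovery discover-instances`; the function name `discover_ips` is illustrative, and it assumes the Cloud Map service names match the DNS prefixes shown above (for example `kafka`) and that instances expose the `AWS_INSTANCE_IPV4` attribute.

```bash
#!/usr/bin/env bash
# Sketch: list the IPv4 addresses registered in Cloud Map for one service.
# Assumes the Cloud Map service name equals the DNS prefix (e.g. "kafka").
discover_ips() {
  local service="$1"
  aws servicediscovery discover-instances \
    --namespace-name "${NAMESPACE:-biddergod-dev.local}" \
    --service-name "$service" \
    --region "${AWS_REGION:-ap-southeast-1}" \
    --query 'Instances[].Attributes.AWS_INSTANCE_IPV4' \
    --output text
}

# Example:
# discover_ips kafka
```

An empty result suggests that no healthy task is registered for that service, which is worth checking before debugging DNS inside a container.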
EFS File System:
- Performance mode: General Purpose
- Throughput mode: Bursting
- Encryption at rest: Yes
- Mount targets in public subnet
EFS Access Points (one per stateful service):
- /postgres-data: PostgreSQL data directory
- /mysql-data: MySQL data directory
- /mongo-data: MongoDB data directory
- /redis-data: Redis AOF persistence
- /kafka-data: Kafka logs
- /prometheus-data: Prometheus time series
- /grafana-data: Grafana dashboards
Each microservice supports auto-scaling:
- Target tracking on CPU (default: 70%)
- Target tracking on memory (default: 80%)
- Min: 1 task, Max: 5 tasks (configurable)
- Currently disabled (enable_auto_scaling = false)
To enable auto-scaling:
# In main.tf
module "user_service" {
  # ...
  enable_auto_scaling        = true
  auto_scaling_min_capacity  = 1
  auto_scaling_max_capacity  = 5
  auto_scaling_cpu_target    = 70
  auto_scaling_memory_target = 80
}
View logs for any service:
aws logs tail /ecs/biddergod-dev-<service-name> --follow --region ap-southeast-1
Access Grafana dashboard:
- Get Grafana public IP (similar to Kong IP retrieval)
- Open http://<grafana-public-ip>:3001
- Login: admin/admin
- Datasource: Pre-configured Prometheus
Available Metrics:
- Go service metrics (bid-command, bid-query, projectors)
- Custom business metrics (bid counts, auction states)
- System metrics (CPU, memory, network)
terraform output # View all outputs
Key Outputs:
- ecs_cluster_name: ECS cluster name
- ecs_cluster_arn: ECS cluster ARN
- vpc_id: VPC ID
- efs_file_system_id: EFS file system ID
- service_discovery_namespace: Cloud Map namespace
- *_ecr_url: ECR repository URLs for each service
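The `*_ecr_url` outputs follow a predictable pattern, so a script can look one up from the service name instead of exporting each variable by hand. This is a sketch under the assumption that every output is named after the service with hyphens replaced by underscores, as in `user_service_ecr_url` and `bid_command_ecr_url`; the function name `ecr_url` is illustrative.

```bash
#!/usr/bin/env bash
# Sketch: resolve the ECR repository URL for a service from Terraform outputs.
# Assumes outputs are named <service_with_underscores>_ecr_url.
# Must be run from the directory that holds the Terraform state configuration.
ecr_url() {
  local output_name
  output_name="$(printf '%s' "$1" | tr '-' '_')_ecr_url"
  terraform output -raw "$output_name"
}

# Example:
# USER_ECR="$(ecr_url user-service)"
# docker push "${USER_ECR}:latest"
```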
# Rebuild and push new image
cd user-service
docker build -t user-service .
docker tag user-service:latest $USER_ECR:latest
docker push $USER_ECR:latest
# Force new deployment
aws ecs update-service \
--cluster biddergod-dev-cluster \
--service biddergod-dev-user-service \
--force-new-deployment \
--region ap-southeast-1
# Scale up
aws ecs update-service \
--cluster biddergod-dev-cluster \
--service biddergod-dev-bid-command \
--desired-count 3 \
--region ap-southeast-1
aws ecs describe-services \
--cluster biddergod-dev-cluster \
--services biddergod-dev-user-service \
--region ap-southeast-1 \
--query 'services[0].events[0:5]'
# Get database task private IP
aws ecs describe-tasks \
--cluster biddergod-dev-cluster \
--tasks $(aws ecs list-tasks --cluster biddergod-dev-cluster --service-name biddergod-dev-postgres --query 'taskArns[0]' --output text) \
--query 'tasks[0].attachments[0].details[?name==`privateIPv4Address`].value' \
--output text
# Connect via bastion or VPN
psql -h <postgres-private-ip> -U postgres -d postgres
Check task stopped reason:
aws ecs describe-tasks \
--cluster biddergod-dev-cluster \
--tasks <task-arn> \
--region ap-southeast-1 \
--query 'tasks[0].stoppedReason'
Common issues:
- Image not in ECR: Build and push image first
- Health check failing: Check application health endpoint
- Missing environment variables: Check task definition
- EFS mount failing: Check security groups allow NFS (port 2049)
- Resource limits: Increase CPU/memory in Terraform
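The first item on this list is the most common cause of tasks stuck in PENDING after the initial deployment, and it can be checked directly against ECR. The sketch below wraps `aws ecr describe-images`; the function name `image_exists` is illustrative, and the repository name is passed in because the exact ECR repository names are not listed here.

```bash
#!/usr/bin/env bash
# Sketch: succeed if the given tag exists in the given ECR repository.
# The repository name is an input; it can be derived from an ECR URL by
# stripping the registry host, e.g. "${USER_ECR#*/}".
image_exists() {
  local repo="$1" tag="${2:-latest}"
  aws ecr describe-images \
    --repository-name "$repo" \
    --image-ids "imageTag=$tag" \
    --region "${AWS_REGION:-ap-southeast-1}" >/dev/null 2>&1
}

# Example:
# image_exists "${USER_ECR#*/}" latest || echo "user-service image not pushed yet"
```

The function discards the command output and relies only on the exit status, so a permissions error will also be reported as a missing image.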
Check EFS mount targets:
aws efs describe-mount-targets \
--file-system-id $(terraform output -raw efs_file_system_id) \
--region ap-southeast-1
Check security group:
# ECS tasks SG must allow inbound NFS from itself
aws ec2 describe-security-groups \
--group-ids $(terraform output -raw ecs_security_group_id) \
--region ap-southeast-1
Test DNS resolution from a running task:
# Get a task ARN
TASK_ARN=$(aws ecs list-tasks --cluster biddergod-dev-cluster --service-name biddergod-dev-user-service --query 'taskArns[0]' --output text)
# Execute command in task
aws ecs execute-command \
--cluster biddergod-dev-cluster \
--task $TASK_ARN \
--container user-service \
--command "nslookup kafka.biddergod-dev.local" \
--interactive \
--region ap-southeast-1
# Stop all services (doesn't delete infrastructure)
for service in user-service auction-service bid-command bid-query auction-projector bid-projector payment-service sse-stream-service kong postgres mysql mongo redis kafka prometheus grafana; do
aws ecs update-service \
--cluster biddergod-dev-cluster \
--service biddergod-dev-$service \
--desired-count 0 \
--region ap-southeast-1
done
cd infrastructure
# Preview what will be destroyed
terraform plan -destroy
# Destroy everything
terraform destroy
# Note: EFS data will be deleted. Back up first if needed!
Warning: This deletes:
- All ECS services and tasks
- EFS file system (all database data!)
- ECR repositories (if empty)
- VPC, subnets, security groups
- CloudWatch log groups
- Service discovery namespace
This infrastructure is optimized for development/demo purposes. For production:
- High Availability:
  - Deploy across 2-3 AZs
  - Increase desired count for services
  - Use RDS Multi-AZ instead of PostgreSQL/MySQL on ECS
  - Use AWS MSK instead of self-hosted Kafka
  - Use ElastiCache instead of Redis on ECS
  - Use DocumentDB instead of MongoDB on ECS
- Security:
  - Enable private subnets with NAT Gateway
  - Use AWS Secrets Manager for sensitive env vars
  - Enable VPC Flow Logs
  - Configure WAF rules for Kong
  - Use TLS/SSL certificates
  - Restrict security groups further
- Monitoring:
  - Enable Container Insights
  - Set up CloudWatch alarms
  - Configure log aggregation (e.g., ELK stack)
  - Enable X-Ray tracing
  - Increase log retention
- Scalability:
  - Enable auto-scaling for all services
  - Configure appropriate CPU/memory targets
  - Use Application Auto Scaling for databases
  - Implement caching strategies
  - Consider API rate limiting at Kong
- Cost Optimization:
  - Use Savings Plans or Reserved Instances
  - Right-size resources after load testing
  - Implement lifecycle policies for logs/images
  - Use Fargate on-demand for critical services
  - Enable EFS Infrequent Access tier
- AWS ECS Best Practices
- Kong Gateway Documentation
- AWS Cloud Map Service Discovery
- EFS Performance
- Terraform AWS Provider
- Project Main README
- Docker Compose Setup
For issues:
- Check CloudWatch logs first
- Review ECS service events
- Verify security groups and network connectivity
- Check service discovery DNS resolution
- Validate environment variables
Common commands reference:
# List all services
aws ecs list-services --cluster biddergod-dev-cluster --region ap-southeast-1
# Describe service
aws ecs describe-services --cluster biddergod-dev-cluster --services biddergod-dev-<service-name> --region ap-southeast-1
# List tasks
aws ecs list-tasks --cluster biddergod-dev-cluster --service-name biddergod-dev-<service-name> --region ap-southeast-1
# Describe task
aws ecs describe-tasks --cluster biddergod-dev-cluster --tasks <task-arn> --region ap-southeast-1
# Tail logs
aws logs tail /ecs/biddergod-dev-<service-name> --follow --region ap-southeast-1
# View Terraform state
terraform show
# View specific output
terraform output <output-name>