Tech Stack:
- Python 3.11 + Flask
- Docker
- Google Kubernetes Engine (GKE)
- Google Container Registry (GCR)
- Cloud Build (CI)
- Horizontal Pod Autoscaler (HPA)
- GCE Ingress (Load Balancer)
```
microservices-gke/
│
├── user-service/ # Flask code + Dockerfile
│ ├── app.py
│ ├── requirements.txt
│ └── Dockerfile
│
├── order-service/ # Flask code + Dockerfile
│ ├── app.py
│ ├── requirements.txt
│ └── Dockerfile
│
├── k8s/ # Kubernetes manifests
│ ├── user-deployment.yaml
│ ├── order-deployment.yaml
│ ├── user-service.yaml
│ ├── order-service.yaml
│ ├── ingress.yaml
│ └── hpa.yaml
│
├── cloudbuild.yaml # CI pipeline for building & deploying
└── README.md
```
- Two Flask Microservices
  - `/users` → returns `["alice", "bob", "carol"]`
  - `/orders` → returns a static list of order objects
  - Both services listen on port 8080 inside the container. A minimal sketch of `app.py` follows.
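  A minimal sketch of `user-service/app.py`, assuming only the static data and port shown above (handler names are assumptions; the real file may differ):

  ```python
  # user-service/app.py — sketch only.
  from flask import Flask, jsonify

  app = Flask(__name__)

  @app.route("/users")
  def list_users():
      # Static data matching the /users response described above.
      return jsonify(["alice", "bob", "carol"])

  if __name__ == "__main__":
      # Bind to 0.0.0.0:8080 so the container port matches the manifests.
      app.run(host="0.0.0.0", port=8080)
  ```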
- Containerization & Registry
  - Dockerfiles build each service into images:
    - `gcr.io/$PROJECT_ID/user-service:v1`
    - `gcr.io/$PROJECT_ID/order-service:v1`
  - Replace `$PROJECT_ID` with your actual GCP project ID (e.g. `my-gke-microservices`).
  - Images are pushed to Google Container Registry (GCR). A minimal Dockerfile sketch follows.
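  A plausible Dockerfile for either service (base image and entrypoint are assumptions; the repo's actual Dockerfiles may differ):

  ```dockerfile
  # Sketch only — matches the Python 3.11 + port 8080 setup described above.
  FROM python:3.11-slim
  WORKDIR /app
  COPY requirements.txt .
  RUN pip install --no-cache-dir -r requirements.txt
  COPY app.py .
  EXPOSE 8080
  CMD ["python", "app.py"]
  ```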
- Google Kubernetes Engine
  - Created a 2-node GKE cluster (`microservices-cluster` in `us-central1-a`).
  - Deployed services via Kubernetes Deployments & Services.
  - CPU requests (100 millicores) in each Deployment so the HPA can calculate CPU utilization:

    ```yaml
    resources:
      requests:
        cpu: 100m
    ```

  - Health checks (liveness & readiness probes) configured against `/users` and `/orders`. A sketch of the manifests follows this list.
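  A sketch of `k8s/user-deployment.yaml` and `k8s/user-service.yaml`, assuming label names and probe details not given above (the order-service manifests would mirror these):

  ```yaml
  # k8s/user-deployment.yaml — sketch; substitute your real project ID in the image.
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: user-deployment
  spec:
    replicas: 2
    selector:
      matchLabels:
        app: user-service
    template:
      metadata:
        labels:
          app: user-service
      spec:
        containers:
          - name: user-service
            image: gcr.io/PROJECT_ID/user-service:v1
            ports:
              - containerPort: 8080
            resources:
              requests:
                cpu: 100m          # required for the HPA's CPU% calculation
            readinessProbe:
              httpGet:
                path: /users
                port: 8080
            livenessProbe:
              httpGet:
                path: /users
                port: 8080
  ---
  # k8s/user-service.yaml — NodePort so the GCE Ingress can route to the pods.
  apiVersion: v1
  kind: Service
  metadata:
    name: user-service
  spec:
    type: NodePort
    selector:
      app: user-service
    ports:
      - port: 80          # Service port (matches the architecture diagram)
        targetPort: 8080  # container port
  ```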
- Ingress & Load Balancing
  - A single GCE Ingress routes:
    - `/users` → `user-service`
    - `/orders` → `order-service`
  - The external IP is automatically provisioned by GKE. A sketch of `k8s/ingress.yaml` follows.
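  A sketch of `k8s/ingress.yaml` (the ingress-class annotation is one common way to request the GCE load balancer; the real manifest may differ):

  ```yaml
  # k8s/ingress.yaml — the name matches the kubectl commands later in this README.
  apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    name: microservices-ingress
    annotations:
      kubernetes.io/ingress.class: "gce"  # external GCE HTTP(S) load balancer
  spec:
    rules:
      - http:
          paths:
            - path: /users
              pathType: Prefix
              backend:
                service:
                  name: user-service
                  port:
                    number: 80
            - path: /orders
              pathType: Prefix
              backend:
                service:
                  name: order-service
                  port:
                    number: 80
  ```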
- Autoscaling
  - Two HPAs scale `user-deployment` and `order-deployment` from 2 to 5 replicas based on CPU ≥ 50%. A sketch of `k8s/hpa.yaml` follows.
  - Tip: If `kubectl get hpa` still shows `cpu: <unknown>/50%`, install Metrics Server: `kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml`
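  A sketch of `k8s/hpa.yaml`, assuming the `autoscaling/v2` API (the names match the `kubectl get hpa` output shown later):

  ```yaml
  # k8s/hpa.yaml — sketch; order-hpa mirrors user-hpa apart from the names.
  apiVersion: autoscaling/v2
  kind: HorizontalPodAutoscaler
  metadata:
    name: user-hpa
  spec:
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: user-deployment
    minReplicas: 2
    maxReplicas: 5
    metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 50  # scale out when average CPU ≥ 50% of requests
  ---
  apiVersion: autoscaling/v2
  kind: HorizontalPodAutoscaler
  metadata:
    name: order-hpa
  spec:
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: order-deployment
    minReplicas: 2
    maxReplicas: 5
    metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 50
  ```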
- CI/CD with Cloud Build
  - `cloudbuild.yaml` builds Docker images on every commit, pushes them to GCR, and runs `kubectl apply` to update the Deployments with a unique image tag.
  - Local builds tag with `v1`, but Cloud Build tags with `$BUILD_ID` (or `$SHORT_SHA` when Git-triggered) so each commit yields a new image. A sketch of `cloudbuild.yaml` follows.
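  A sketch of what `cloudbuild.yaml` might look like. The update step is shown with `kubectl set image` for brevity; the pipeline described above uses `kubectl apply` with freshly tagged manifests, so treat this shape as an assumption:

  ```yaml
  # cloudbuild.yaml — sketch; substitutions default to the values used in this README.
  steps:
    # Build and push both images, tagged with the unique build ID.
    - name: gcr.io/cloud-builders/docker
      args: ['build', '-t', 'gcr.io/$PROJECT_ID/user-service:$BUILD_ID', './user-service']
    - name: gcr.io/cloud-builders/docker
      args: ['push', 'gcr.io/$PROJECT_ID/user-service:$BUILD_ID']
    - name: gcr.io/cloud-builders/docker
      args: ['build', '-t', 'gcr.io/$PROJECT_ID/order-service:$BUILD_ID', './order-service']
    - name: gcr.io/cloud-builders/docker
      args: ['push', 'gcr.io/$PROJECT_ID/order-service:$BUILD_ID']
    # Point the running Deployments at the new tags.
    - name: gcr.io/cloud-builders/kubectl
      args: ['set', 'image', 'deployment/user-deployment',
             'user-service=gcr.io/$PROJECT_ID/user-service:$BUILD_ID']
      env:
        - 'CLOUDSDK_COMPUTE_ZONE=${_ZONE}'
        - 'CLOUDSDK_CONTAINER_CLUSTER=${_CLUSTER_NAME}'
    - name: gcr.io/cloud-builders/kubectl
      args: ['set', 'image', 'deployment/order-deployment',
             'order-service=gcr.io/$PROJECT_ID/order-service:$BUILD_ID']
      env:
        - 'CLOUDSDK_COMPUTE_ZONE=${_ZONE}'
        - 'CLOUDSDK_CONTAINER_CLUSTER=${_CLUSTER_NAME}'
  substitutions:
    _CLUSTER_NAME: microservices-cluster
    _ZONE: us-central1-a
  ```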
- A GCP project (`YOUR_PROJECT_ID`) with billing enabled. Replace `YOUR_PROJECT_ID` below with your actual project ID (e.g. `my-gke-microservices`).
- Enabled APIs:
  - Container Registry (`containerregistry.googleapis.com`)
  - Kubernetes Engine (`container.googleapis.com`)
  - Cloud Build (`cloudbuild.googleapis.com`)
  - IAM (`iam.googleapis.com`)
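  If any of these are still disabled, they can all be enabled at once:

  ```
  gcloud services enable \
    containerregistry.googleapis.com \
    container.googleapis.com \
    cloudbuild.googleapis.com \
    iam.googleapis.com
  ```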
- Installed and authenticated gcloud CLI:

  ```
  gcloud auth login
  gcloud config set project YOUR_PROJECT_ID
  ```
- Authenticate Docker to GCR:

  ```
  gcloud auth configure-docker
  ```
- Build & push `user-service` (tagged `v1` locally):

  ```
  docker build -t gcr.io/$PROJECT_ID/user-service:v1 ./user-service
  docker push gcr.io/$PROJECT_ID/user-service:v1
  ```
- Build & push `order-service` (tagged `v1` locally):

  ```
  docker build -t gcr.io/$PROJECT_ID/order-service:v1 ./order-service
  docker push gcr.io/$PROJECT_ID/order-service:v1
  ```
```
export PROJECT_ID=YOUR_PROJECT_ID
export CLUSTER_NAME=microservices-cluster
export ZONE=us-central1-a

gcloud config set project $PROJECT_ID
gcloud container clusters create $CLUSTER_NAME \
  --zone=$ZONE \
  --num-nodes=2 \
  --machine-type=n1-standard-1
```
```
gcloud container clusters get-credentials $CLUSTER_NAME --zone $ZONE
```

Apply the manifests:

```
kubectl apply -f k8s/user-deployment.yaml
kubectl apply -f k8s/user-service.yaml
kubectl apply -f k8s/order-deployment.yaml
kubectl apply -f k8s/order-service.yaml
kubectl apply -f k8s/ingress.yaml
kubectl apply -f k8s/hpa.yaml
```

Wait for the Ingress to get an external IP:
```
kubectl get ingress microservices-ingress -w
```

When you see an ADDRESS assigned (e.g. 34.123.45.67), test:

```
curl http://34.123.45.67/users
curl http://34.123.45.67/orders
```

Verify the HPA can see CPU percentages:
```
kubectl get hpa
```

You should see something like:

```
NAME        REFERENCE                     TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
user-hpa    Deployment/user-deployment    cpu: 2%/50%   2         5         2          10m
order-hpa   Deployment/order-deployment   cpu: 1%/50%   2         5         2          10m
```
To drive scaling, run a heavier load (for example, 2,000 requests with 50 concurrent clients):
```
hey -n 2000 -c 50 http://34.123.45.67/users
```

Watch the HPA:

```
kubectl get hpa -w
```

You'll see REPLICAS increase (2 → 3 → … up to 5) as CPU usage exceeds 50%, then scale back down to 2 when the load subsides.
- Grant Cloud Build Service Account Permissions
  In IAM & Admin → IAM, grant `PROJECT_NUMBER@cloudbuild.gserviceaccount.com` (the default Cloud Build service account) these roles:
  - Storage Admin (`roles/storage.admin`)
  - Kubernetes Engine Developer (`roles/container.developer`)
  - Kubernetes Engine Cluster Viewer (`roles/container.clusterViewer`)
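  Equivalently, from the CLI (a sketch; `$PROJECT_ID` as set earlier):

  ```
  PROJECT_NUMBER=$(gcloud projects describe $PROJECT_ID --format='value(projectNumber)')
  SA="serviceAccount:${PROJECT_NUMBER}@cloudbuild.gserviceaccount.com"

  gcloud projects add-iam-policy-binding $PROJECT_ID --member=$SA --role=roles/storage.admin
  gcloud projects add-iam-policy-binding $PROJECT_ID --member=$SA --role=roles/container.developer
  gcloud projects add-iam-policy-binding $PROJECT_ID --member=$SA --role=roles/container.clusterViewer
  ```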
- Test with a Manual Build (tags images with `$BUILD_ID`):

  ```
  gcloud builds submit --config=cloudbuild.yaml \
    --substitutions=_CLUSTER_NAME="microservices-cluster",_ZONE="us-central1-a"
  ```

  This builds and pushes two images tagged `gcr.io/$PROJECT_ID/user-service:$BUILD_ID` and `.../order-service:$BUILD_ID`, then updates the deployments.
- Automate via Git Trigger
  In the Cloud Build console → Triggers → "Create Trigger":
  - Connect to your GitHub repo.
  - Trigger on push to `main`.
  - Select `cloudbuild.yaml`.

  Now every push to `main` runs the same steps (using `$SHORT_SHA` as the tag when Git-triggered).
┌─────────────────────────────┐
│ External Clients │
│ [curl, browser, Postman] │
└────────────┬────────────────┘
│
┌────────────▼─────────────┐
│   GKE Ingress (GCE LB)   │
│ routes /users → user-svc │
│       /orders → order-svc│
└──────┬─────────────┬─────┘
       │             │
┌──────▼─────────┐ ┌─▼──────────────┐
│ user-service   │ │ order-service  │
│ (port 80)      │ │ (port 80)      │
│ Replicas: 2 →  │ │ Replicas: 2 →  │
│ HPA (2–5 pods) │ │ HPA (2–5 pods) │
└──────┬─────────┘ └─┬──────────────┘
       │             │
┌──────▼─────────────▼──────┐
│ Google Kubernetes Engine  │
│ (2 × n1-standard-1 nodes) │
└───────────────────────────┘