KubeHalo is a Kubernetes autoscaling prototype written in Go. It introduces a ScalePolicy custom resource and a small control plane around it:
- a controller that watches `ScalePolicy` objects
- a Prometheus-backed scaling decision flow
- a validation webhook for admission checks
- a lightweight HTTP API for listing policies
The codebase is now aligned around one ScalePolicy contract, has runnable entrypoints under cmd/, and includes automated tests for the critical scaling and validation paths.
Each ScalePolicy points at a workload, defines the metric query to evaluate, and describes how aggressively KubeHalo may scale up or down.
Current behavior:
- watches `ScalePolicy` resources through a dynamic informer
- queries Prometheus for the configured metric
- reads current Deployment replicas from the cluster
- computes desired replicas from metric threshold and step sizes
- applies optional behavior rules such as stabilization windows and rate caps
- updates the target Deployment when a scale action is needed
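The replica computation described above can be sketched roughly as follows. This is a minimal illustration, not KubeHalo's actual code: the `Policy` struct and `desiredReplicas` function are hypothetical names, and the sketch assumes that the controller steps up when the metric exceeds the threshold, steps down otherwise, treats a rate cap of 0 as "no cap", and clamps the result to the replica bounds. Stabilization windows are left out.

```go
package main

import "fmt"

// Policy holds the subset of ScalePolicy fields this sketch needs.
// Field names mirror the sample manifest; the struct itself is hypothetical.
type Policy struct {
	Threshold        float64
	ScaleUpStep      int
	ScaleDownStep    int
	MinReplicas      int
	MaxReplicas      int
	MaxScaleUpRate   int // assumption: 0 means "no cap"
	MaxScaleDownRate int // assumption: 0 means "no cap"
}

// desiredReplicas steps up when the metric exceeds the threshold and steps
// down otherwise, then applies the rate caps and the min/max bounds.
func desiredReplicas(p Policy, current int, metric float64) int {
	delta := -p.ScaleDownStep
	if metric > p.Threshold {
		delta = p.ScaleUpStep
	}
	// Rate caps limit how many replicas a single decision may add or remove.
	if delta > 0 && p.MaxScaleUpRate > 0 && delta > p.MaxScaleUpRate {
		delta = p.MaxScaleUpRate
	}
	if delta < 0 && p.MaxScaleDownRate > 0 && -delta > p.MaxScaleDownRate {
		delta = -p.MaxScaleDownRate
	}
	want := current + delta
	if want < p.MinReplicas {
		want = p.MinReplicas
	}
	if want > p.MaxReplicas {
		want = p.MaxReplicas
	}
	return want
}

func main() {
	p := Policy{
		Threshold: 0.8, ScaleUpStep: 2, ScaleDownStep: 1,
		MinReplicas: 1, MaxReplicas: 10,
		MaxScaleUpRate: 2, MaxScaleDownRate: 1,
	}
	fmt.Println(desiredReplicas(p, 3, 0.95))  // above threshold: 3 + 2 = 5
	fmt.Println(desiredReplicas(p, 3, 0.20))  // below threshold: 3 - 1 = 2
	fmt.Println(desiredReplicas(p, 10, 0.95)) // clamped to MaxReplicas: 10
}
```

A scale action would only be needed when the returned value differs from the current replica count.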
.
├── api/kubehalo/v1 # Typed ScalePolicy API definitions
├── cmd/api # HTTP API entrypoint
├── cmd/controller # Controller entrypoint
├── cmd/webhook # Admission webhook entrypoint
├── config/crd # CRD manifests
├── config/webhook # Webhook registration manifest
├── controllers/scalepolicy # Informer, handler, parsing, scaling logic
├── internal/config # Environment-based runtime configuration
├── internal/kube # Kubernetes client construction
├── internal/metrics # Prometheus client wrapper
├── internal/scaling # Deployment scaling engine
└── manifests # Example Kubernetes manifests
Example `ScalePolicy`:

    apiVersion: kubehalo.sh/v1
    kind: ScalePolicy
    metadata:
      name: demo-policy
      namespace: default
    spec:
      targetRef:
        kind: Deployment
        name: my-deployment
        namespace: default
      metric:
        name: cpu
        query: rate(container_cpu_usage_seconds_total[1m])
        threshold: 0.8
      scaleUp:
        step: 2
        cooldownSeconds: 60
      scaleDown:
        step: 1
        cooldownSeconds: 120
      minReplicas: 1
      maxReplicas: 10
      schedules:
        - name: weekday-business-hours
          days: ["Mon", "Tue", "Wed", "Thu", "Fri"]
          startTime: "09:00"
          endTime: "18:00"
          minReplicas: 3
          maxReplicas: 10
      behavior:
        stabilizationWindowSeconds: 60
        maxScaleUpRate: 2
        maxScaleDownRate: 1
        policy: absolute

Sample manifest: `manifests/sample-policy.yaml`

Requirements:
- Go 1.24+
- access to a Kubernetes cluster or local cluster such as Minikube
- a reachable Prometheus instance
- a valid `KUBECONFIG` when running outside the cluster
Install the CRD:

    kubectl apply -f config/crd/scale_policy.yaml

Run the controller:

    export KUBEHALO_PROMETHEUS_ADDR=http://localhost:9090
    go run ./cmd/controller

Apply the sample policy:

    kubectl apply -f manifests/sample-policy.yaml

HTTP API:

    go run ./cmd/api

Webhook:

    go run ./cmd/webhook

KubeHalo reads runtime configuration from environment variables.
| Variable | Default | Used By |
|---|---|---|
| `KUBEHALO_PROMETHEUS_ADDR` | `http://localhost:9090` | controller, webhook |
| `KUBEHALO_API_ADDR` | `:8080` | API server |
| `KUBEHALO_WEBHOOK_ADDR` | `:8443` | webhook server |
| `KUBEHALO_WEBHOOK_CERT_FILE` | `/tls/tls.crt` | webhook server |
| `KUBEHALO_WEBHOOK_KEY_FILE` | `/tls/tls.key` | webhook server |
Useful commands:

    make fmt
    make lint
    make test
    make run-controller
    make run-api
    make run-webhook

The Makefile keeps Go build caches inside `.cache/`, which makes local iteration cleaner and avoids polluting global caches.
The repository includes tests for:
- controller helper construction
- `ScalePolicy` parsing and validation
- handler-driven scaling decisions
- Deployment scaling engine behavior
- webhook admission validation
The webhook currently validates:
- required fields and logical replica bounds
- non-negative metric thresholds
- Prometheus query validity, checked through a dry-run query
- overlapping time-based schedules
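Three of the checks listed above (replica bounds, non-negative threshold, overlapping schedules) can be illustrated with a small standalone validator. The `Spec` and `Window` types and the `validate` function are hypothetical, the exact rules the webhook applies may differ, and the Prometheus dry-run check is omitted because it needs a live server. Two windows are treated as overlapping when they share a day and their half-open time ranges intersect.

```go
package main

import (
	"errors"
	"fmt"
)

// Window is a simplified schedule entry (hypothetical type).
type Window struct {
	Name  string
	Days  []string
	Start string // zero-padded "HH:MM"
	End   string // zero-padded "HH:MM"
}

// Spec is a simplified view of the fields being validated (hypothetical type).
type Spec struct {
	MinReplicas int
	MaxReplicas int
	Threshold   float64
	Schedules   []Window
}

// overlaps reports whether two windows share at least one day and their
// half-open ranges [Start, End) intersect.
func overlaps(a, b Window) bool {
	sharedDay := false
	for _, da := range a.Days {
		for _, db := range b.Days {
			if da == db {
				sharedDay = true
			}
		}
	}
	return sharedDay && a.Start < b.End && b.Start < a.End
}

// validate returns the first problem found, or nil when the spec passes.
func validate(s Spec) error {
	if s.MinReplicas < 0 || s.MaxReplicas < s.MinReplicas {
		return errors.New("replica bounds must satisfy 0 <= minReplicas <= maxReplicas")
	}
	if s.Threshold < 0 {
		return errors.New("metric threshold must be non-negative")
	}
	for i := range s.Schedules {
		for j := i + 1; j < len(s.Schedules); j++ {
			if overlaps(s.Schedules[i], s.Schedules[j]) {
				return fmt.Errorf("schedules %q and %q overlap",
					s.Schedules[i].Name, s.Schedules[j].Name)
			}
		}
	}
	return nil
}

func main() {
	spec := Spec{
		MinReplicas: 1, MaxReplicas: 10, Threshold: 0.8,
		Schedules: []Window{
			{Name: "day", Days: []string{"Mon"}, Start: "09:00", End: "18:00"},
			{Name: "evening", Days: []string{"Mon"}, Start: "17:00", End: "20:00"},
		},
	}
	// The two windows share Monday 17:00-18:00, so validation fails.
	fmt.Println(validate(spec))
}
```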
Run everything with:

    make test

Current limitations:
- The current scaling engine updates `Deployment` targets. The API type already allows `StatefulSet`, but reconciliation for that target kind is not implemented yet.
- `cooldownSeconds` and `evaluationIntervalSeconds` are modeled in the API but are not yet enforced by the controller.
