This guide covers installing Workload-Variant-Autoscaler (WVA) on your Kubernetes cluster.
Prerequisites:
- Kubernetes v1.32.0 or later
- Helm 3.x
- kubectl configured to access your cluster
- Cluster admin privileges
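The prerequisites above can be checked from a terminal before installing. The following is a minimal sketch, not part of WVA: the function names `version_ge` and `check_prereqs` are illustrative, and it assumes `jq` is installed and that `sort` supports `-V` (GNU coreutils and recent BSD/macOS do).

```shell
#!/usr/bin/env bash
# Sketch of a prerequisite check; function names are illustrative, not part of WVA.

# Succeeds when version $1 >= version $2. Strips a leading "v" and any
# "-suffix" or "+build" part (e.g. v1.32.1-gke.100 -> 1.32.1).
# Assumes a sort(1) that supports -V.
version_ge() {
  local have="${1#v}" want="${2#v}"
  have="${have%%[-+]*}"
  want="${want%%[-+]*}"
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Runs the four prerequisite checks against the current kubectl context.
check_prereqs() {
  command -v kubectl >/dev/null || { echo "kubectl not found" >&2; return 1; }
  command -v helm >/dev/null || { echo "helm not found" >&2; return 1; }

  # Server version as reported by the API server, e.g. "v1.32.3" (needs jq).
  local server
  server="$(kubectl version -o json | jq -r '.serverVersion.gitVersion')" || return 1
  version_ge "$server" "v1.32.0" \
    || { echo "Kubernetes $server is older than v1.32.0" >&2; return 1; }

  # With --short, helm prints something like "v3.14.0+g3fc9f4b".
  case "$(helm version --short)" in
    v3.*) ;;
    *) echo "Helm 3.x is required" >&2; return 1 ;;
  esac

  # "yes" means the current user may perform any verb on any resource.
  [ "$(kubectl auth can-i '*' '*' --all-namespaces)" = "yes" ] \
    || { echo "cluster admin privileges not detected" >&2; return 1; }
}
```

After sourcing the file, running `check_prereqs && echo "ready to install"` exercises all four checks. The version comparison is kept in its own function so it can be reused without cluster access.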
See the Helm Installation guide for detailed instructions.
Verify the installation:
kubectl get pods -n workload-variant-autoscaler-system
Using kustomize for more control:
# Install CRDs
make install
# Deploy the controller
make deploy IMG=quay.io/llm-d/llm-d-workload-variant-autoscaler:latest
See the Kind Emulator for detailed instructions.
Key configuration options:
# custom-values.yaml
image:
  repository: quay.io/llm-d/llm-d-workload-variant-autoscaler
  tag: latest
  pullPolicy: IfNotPresent
resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 128Mi
# Enable Prometheus monitoring
prometheus:
  enabled: true
  servicemonitor:
    enabled: true
# Optional: Multi-controller isolation
# Set a unique identifier for this controller instance
# Useful for parallel testing or multi-tenant environments
# See docs/user-guide/multi-controller-isolation.md
wva:
  controllerInstance: "" # Leave empty for single controller
WVA uses ConfigMaps for cluster configuration:
- Service Classes: SLO definitions for different service tiers
See Configuration Guide for details.
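As an illustration of the optional multi-controller setting in the values file, a small override file can be layered on top of `custom-values.yaml`. This is a sketch under assumptions: the helper `instance_values` and the instance name `team-a` are made up for the example, and `CHART_REF` is a placeholder for wherever your copy of the WVA chart lives, not a real chart location. Only the `wva.controllerInstance` key and the release and namespace names come from this guide.

```shell
#!/usr/bin/env bash
# Sketch: generate a per-instance values override.
# instance_values is an illustrative helper, not part of WVA.

# Prints a values override that sets wva.controllerInstance to $1.
instance_values() {
  local instance="$1"
  printf 'wva:\n  controllerInstance: "%s"\n' "$instance"
}

# Example usage (not executed here). CHART_REF is a placeholder you must set
# yourself. Later -f files override earlier ones.
#   instance_values team-a > team-a-values.yaml
#   helm upgrade --install workload-variant-autoscaler "$CHART_REF" \
#     -n workload-variant-autoscaler-system --create-namespace \
#     -f custom-values.yaml -f team-a-values.yaml
```

Keeping the instance identifier in its own file leaves the shared `custom-values.yaml` unchanged across controller instances.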
WVA can work with existing autoscalers:
- For HPA integration, see the HPA Integration Guide
- For KEDA integration, see the KEDA Integration Guide
- Check that the controller is running:
  kubectl get deployment -n workload-variant-autoscaler-system
- Verify that the CRDs are installed:
  kubectl get crd variantautoscalings.llmd.ai
- Check the controller logs:
  kubectl logs -n workload-variant-autoscaler-system \
    deployment/workload-variant-autoscaler-controller-manager
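The three verification steps can also be run as one function that stops at the first failure, which is convenient in CI. This is a minimal sketch: `verify_wva` is an illustrative name, the timeouts are arbitrary, and the resource names are the ones used in the commands above.

```shell
#!/usr/bin/env bash
# Sketch: the verification steps as one fail-fast function.
# Resource names come from this guide; timeouts are arbitrary choices.

NS="workload-variant-autoscaler-system"
DEPLOY="deployment/workload-variant-autoscaler-controller-manager"

verify_wva() {
  # 1. Wait for the controller Deployment to finish rolling out.
  kubectl rollout status "$DEPLOY" -n "$NS" --timeout=120s || return 1
  # 2. Wait until the API server reports the CRD as Established.
  kubectl wait --for=condition=Established \
    crd/variantautoscalings.llmd.ai --timeout=60s || return 1
  # 3. Print the most recent controller log lines for a quick visual check.
  kubectl logs -n "$NS" "$DEPLOY" --tail=20
}
```

Unlike `kubectl get`, both `kubectl rollout status` and `kubectl wait` block until the condition holds or the timeout expires, so a non-zero exit status from `verify_wva` indicates that the controller or CRD is not ready.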
Helm:
helm uninstall workload-variant-autoscaler -n workload-variant-autoscaler-system
Kustomize:
make undeploy
make uninstall # Remove CRDs
Controller not starting:
- Check if CRDs are installed:
  kubectl get crd
- Verify RBAC permissions
- Check controller logs for errors
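The checks for a controller that does not start can be gathered in one pass. The sketch below is illustrative (`diagnose_controller` is not a WVA command). It assumes the WVA CRDs are in the `llmd.ai` group, as the CRD name in this guide suggests, and that RBAC denials show up in the logs as "forbidden", which is how the Kubernetes API server words them.

```shell
#!/usr/bin/env bash
# Sketch: collect first-line diagnostics for a controller that will not start.

NS="workload-variant-autoscaler-system"
DEPLOY="deployment/workload-variant-autoscaler-controller-manager"

diagnose_controller() {
  # Are any llmd.ai CRDs registered? grep exits non-zero when nothing matches.
  kubectl get crd -o name | grep 'llmd\.ai' \
    || echo "no llmd.ai CRDs found; install the CRDs first"
  # Pod status and recent events often reveal image pull or scheduling failures.
  kubectl get pods -n "$NS" -o wide
  kubectl get events -n "$NS" --sort-by=.lastTimestamp
  # Log lines that mention errors or RBAC denials.
  kubectl logs -n "$NS" "$DEPLOY" --tail=200 | grep -iE 'error|forbidden' \
    || echo "no error or forbidden lines in the last 200 log lines"
}
```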
Metrics not appearing:
- Ensure Prometheus ServiceMonitor is created
- Verify Prometheus has proper RBAC to scrape metrics
- Check network policies aren't blocking metrics endpoint
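The three metrics checks map onto standard kubectl commands. This is a hedged sketch: `diagnose_metrics` is an illustrative name, and `PROM_NS` and `PROM_SA` are assumptions about where Prometheus runs and which service account it uses. The defaults shown match a typical kube-prometheus install and may not match yours. Whether `endpoints` is the right resource to test also depends on how your Prometheus discovers targets.

```shell
#!/usr/bin/env bash
# Sketch: check the usual causes of missing metrics.
# PROM_NS and PROM_SA are assumptions; override them for your cluster.

NS="workload-variant-autoscaler-system"
PROM_NS="${PROM_NS:-monitoring}"
PROM_SA="${PROM_SA:-prometheus-k8s}"

diagnose_metrics() {
  # ServiceMonitor is a Prometheus Operator resource; the command fails
  # if that CRD is not installed in the cluster.
  kubectl get servicemonitors.monitoring.coreos.com -n "$NS" \
    || echo "ServiceMonitor CRD not available; is the Prometheus Operator installed?"
  # NetworkPolicies in the controller namespace that might block scrapes.
  kubectl get networkpolicy -n "$NS"
  # Prints "yes" or "no": may the Prometheus service account list endpoints here?
  kubectl auth can-i list endpoints -n "$NS" \
    --as="system:serviceaccount:${PROM_NS}:${PROM_SA}"
}
```

For example, `PROM_NS=observability PROM_SA=prometheus diagnose_metrics` runs the same checks against a differently named Prometheus service account.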
See Also:
- Configuration Guide
- Troubleshooting Guide (coming soon)
- Developer Guide