Guide for developers contributing to Workload-Variant-Autoscaler.
- Go 1.25.0+
- Docker 17.03+
- kubectl 1.32.0+
- Kind (for local testing)
- Make
- Clone the repository:
  git clone https://github.com/llm-d/llm-d-workload-variant-autoscaler.git
  cd llm-d-workload-variant-autoscaler
- Install dependencies:
  go mod download
- Install development tools:
  make setup-envtest
  make controller-gen
  make kustomize
workload-variant-autoscaler/
├── api/v1alpha1/ # CRD definitions
├── cmd/ # Main application entry points
├── config/ # Kubernetes manifests
│ ├── crd/ # CRD manifests
│ ├── rbac/ # RBAC configurations
│ ├── manager/ # Controller deployment
│ └── samples/ # Example resources
├── deploy/ # Deployment scripts
│ ├── kubernetes/ # K8s deployment
│ ├── openshift/ # OpenShift deployment
│ └── kind-emulator/ # Local Kind cluster with GPU emulation
├── docs/ # Documentation
├── internal/ # Private application code
│ ├── actuator/ # Metric emission & scaling
│ ├── collector/ # Metrics collection
│ ├── config/ # Internal configuration
│ ├── constants/ # Application constants
│ ├── controller/ # Controller implementation
│ ├── datastore/ # Data storage abstractions
│ ├── discovery/ # Resource discovery
│ ├── engines/ # Scaling engines (saturation, scale-from-zero)
│ ├── indexers/ # Kubernetes indexers
│ ├── interfaces/ # Interface definitions
│ ├── logging/ # Logging utilities
│ ├── metrics/ # Metrics definitions
│ ├── modelanalyzer/ # Model analysis
│ ├── saturation/ # Saturation detection logic
│ └── utils/ # Utility functions
├── pkg/ # Public libraries
│ ├── analyzer/ # Queue theory models
│ ├── solver/ # Optimization algorithms
│ ├── core/ # Core domain models
│ ├── config/ # Configuration structures
│ └── manager/ # Manager utilities
├── test/ # Tests
│ ├── e2e/ # E2E tests (consolidated suite: Kind, OpenShift)
│ └── utils/ # Test utilities
└── charts/ # Helm charts
└── workload-variant-autoscaler/

# Run the controller on your machine (connects to configured cluster)
make run

The recommended approach is the one-shot command that creates the cluster and deploys everything in a single step, avoiding a CRD timing race (see Known Setup Issues):
# Recommended: create cluster + deploy WVA + llm-d infrastructure in one step
CREATE_CLUSTER=true make deploy-wva-emulated-on-kind

Alternatively, as two separate steps:
# Step 1: Create a Kind cluster with emulated GPUs
make create-kind-cluster
# Or deploy with the full llm-d infrastructure
make deploy-wva-emulated-on-kind

Note: If the two-step approach fails with no matches for kind "InferencePool", see Known Setup Issues.
- Create a feature branch:
  git checkout -b feature/my-feature
- Make your changes
- Generate code if needed:
  # After modifying CRDs
  make manifests generate
- Run unit tests:
  make test
- Run linter:
  make lint
make build

The binary will be in bin/manager.
make docker-build IMG=<your-registry>/wva-controller:tag
make docker-push IMG=<your-registry>/wva-controller:tag
PLATFORMS=linux/arm64,linux/amd64 make docker-buildx IMG=<your-registry>/wva-controller:tag

# Run all unit tests
make test
# Run specific package tests
go test ./internal/controller/...
# With coverage
go test -cover ./...

WVA has a single consolidated E2E suite (test/e2e/) that runs on Kind (emulated) or OpenShift/Kubernetes. Deploy infrastructure in infra-only mode first, then run tests.
Location: test/e2e/
# Smoke tests (Kind, ~5-10 min)
make test-e2e-smoke
# Full suite (Kind)
make test-e2e-full
# OpenShift: set KUBECONFIG and ENVIRONMENT=openshift first
export ENVIRONMENT=openshift
make test-e2e-smoke
# or make test-e2e-full
# Run specific tests
FOCUS="Basic VA lifecycle" make test-e2e-smoke

See Testing Guide and E2E Test Suite README for infra-only setup and configuration. For OpenShift, set ENVIRONMENT=openshift and use the same targets.
- Deploy to Kind cluster:
  make deploy-wva-emulated-on-kind IMG=<your-image>
- Create test resources:
  kubectl apply -f config/samples/
- Monitor controller logs:
  kubectl logs -n workload-variant-autoscaler-system \
    deployment/workload-variant-autoscaler-controller-manager -f
# Generate deepcopy, CRD manifests, and RBAC
make manifests generate

make crd-docs

Output will be in docs/user-guide/crd-reference.md.
Create .vscode/launch.json:
{
"version": "0.2.0",
"configurations": [
{
"name": "Debug Controller",
"type": "go",
"request": "launch",
"mode": "auto",
"program": "${workspaceFolder}/cmd/main.go",
"env": {
"KUBECONFIG": "${env:HOME}/.kube/config"
},
"args": []
}
]
}

# Build debug image
go build -gcflags="all=-N -l" -o bin/manager cmd/main.go
# Deploy and attach debugger (e.g., Delve)

kubectl logs -n workload-variant-autoscaler-system \
  -l control-plane=controller-manager --tail=100 -f

- Modify api/v1alpha1/variantautoscaling_types.go
- Run make manifests generate
- Update tests
- Run make crd-docs
- Update user documentation
- Define metric in internal/metrics/metrics.go
- Emit metric from appropriate controller location
- Update Prometheus integration docs
- Add to Grafana dashboards (if applicable)
- Update code in pkg/solver/ or pkg/analyzer/
- Add/update unit tests
- Run make test
- Update design documentation if algorithm changes
After code changes, update relevant docs in:
- docs/user-guide/ - User-facing changes
- docs/design/ - Architecture/design changes
- docs/integrations/ - Integration guide updates
Note: Documentation updates are partially automated via the Update Docs workflow. The workflow analyzes code changes and creates draft PRs with documentation updates.
Verify all commands and examples in documentation work:
# Test installation steps
# Test configuration examples
# Test all code snippets

The repository uses AI-powered workflows to automate documentation updates, workflow creation, and debugging. These workflows are powered by the gh-aw CLI extension.
Key workflows:
- Update Docs: Automatically updates documentation on every push to main
- Create Agentic Workflow: Interactive workflow designer
- Debug Agentic Workflow: Workflow debugging assistant
See Agentic Workflows Guide for detailed information on working with these automation tools.
See the Release Process guide for how to cut a release. It covers:
- Pre-release checklist (changelog, optional version bumps, upstream pins)
- Creating the tag and GitHub Release (which triggers image build and Helm chart publish)
- What runs automatically: Docker image push, Helm chart version bump and publish to GHCR, and commit-back of chart files
- Post-release (required): update the llm-d workload-autoscaling guide to the new WVA version
- Enabling other team members to perform releases (permissions, secrets, documentation)
- Check CONTRIBUTING.md
- Review existing code and tests
- Ask in GitHub Discussions
- Attend community meetings
# Format code
make fmt
# Vet code
make vet
# Run linter
make lint
# Fix linting issues
make lint-fix
# Clean build artifacts
rm -rf bin/ dist/
# Reset Kind cluster
make destroy-kind-cluster
make create-kind-cluster

- Review Code Style Guidelines
- Check out Good First Issues
Symptom:
Error: no matches for kind "InferencePool" in version "inference.networking.x-k8s.io/v1alpha2"
ensure CRDs are installed first
Cause: When running make create-kind-cluster and make deploy-wva-emulated-on-kind as two
separate commands, there can be a timing race: the Gateway API Inference Extension CRDs are
applied by the deploy script but the Kubernetes API server hasn't finished registering them
before the helmfile tries to deploy the InferencePool resource.
Fix (Option 1 — preferred): Use the one-shot command, which gives the API server enough time to register the CRDs during cluster startup:
CREATE_CLUSTER=true make deploy-wva-emulated-on-kind

Fix (Option 2): If you already have a running cluster and hit this error, install the CRDs manually and re-run the deploy (it is idempotent):
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v1.0.1/manifests.yaml
kubectl wait --for=condition=Established crd/inferencepools.inference.networking.x-k8s.io --timeout=30s
kubectl wait --for=condition=Established crd/inferencepools.inference.networking.k8s.io --timeout=30s
make deploy-wva-emulated-on-kind