Welcome! We're excited that you're interested in contributing to the Workload-Variant-Autoscaler (WVA) project.
For general contribution guidelines including code of conduct, commit message format, PR process, and community standards, please see the llm-d Contributing Guide.
This document covers WVA-specific development setup and workflows.
- Go 1.25.0+
- Docker 17.03+
- kubectl 1.31.0+ (or oc for OpenShift)
- Kind (for local development)
- Basic understanding of Kubernetes controllers and operators
- Fork and clone the repository:
  git clone https://github.com/<your-username>/workload-variant-autoscaler.git
  cd workload-variant-autoscaler
- Add upstream remote:
  git remote add upstream https://github.com/llm-d/llm-d-workload-variant-autoscaler.git
- Install dependencies:
  go mod download
- Set up a local Kind cluster with emulated GPUs:
  make deploy-llm-d-wva-emulated-on-kind
- Run tests to verify setup:
  make test
See the Developer Guide for detailed setup instructions.
The repository uses AI-powered workflows to automate repetitive tasks:
- Documentation Updates: Automatically syncs docs with code changes
- Workflow Creation: Interactive designer for new workflows
- Workflow Debugging: Assists with troubleshooting
Learn more in the Agentic Workflows Guide.
workload-variant-autoscaler/
├── api/v1alpha1/ # CRD definitions and types
├── cmd/ # Main application entry point
├── config/ # Kubernetes manifests
│ ├── crd/ # CRD base manifests
│ ├── rbac/ # RBAC configurations
│ ├── manager/ # Controller deployment configs
│ └── samples/ # Example VariantAutoscaling CRs
├── deploy/ # Deployment scripts
│ ├── kubernetes/ # Standard K8s deployment
│ ├── openshift/ # OpenShift-specific deployment
│ └── kind-emulator/ # Local development with Kind
├── docs/ # Documentation
│ ├── user-guide/ # User-facing documentation
│ ├── developer-guide/ # Development and testing guides
│ ├── integrations/ # Integration guides (HPA, KEDA, Prometheus)
│ ├── tutorials/ # Step-by-step tutorials
│ └── design/ # Architecture and design docs
├── internal/ # Private application code
│ ├── controller/ # Main reconciliation logic
│ ├── collector/ # Metrics collection
│ ├── optimizer/ # Optimization engine
│ ├── actuator/ # Metric emission & actuation
│ ├── modelanalyzer/ # Model performance analysis
│ ├── metrics/ # Metrics definitions
│ └── utils/ # Utility functions
├── pkg/ # Public libraries (inferno optimizer)
│ ├── analyzer/ # Queue theory models
│ ├── solver/ # Optimization algorithms
│ ├── core/ # Core domain models
│ ├── config/ # Configuration structures
│ └── manager/ # Optimization manager
├── test/ # Tests
│ ├── e2e/ # End-to-end tests
│ └── utils/ # Test utilities
├── hack/ # Dev scripts (e.g. hack/burst_load_generator.sh for manual load)
└── charts/ # Helm charts
    └── workload-variant-autoscaler/
Run unit tests:
make test
Run E2E tests (Kind or OpenShift):
# Smoke tests (Kind)
make test-e2e-smoke
# Full suite (Kind)
make test-e2e-full
# OpenShift: set KUBECONFIG and ENVIRONMENT, then run
export ENVIRONMENT=openshift
make test-e2e-smoke
# or make test-e2e-full
# Run specific tests
FOCUS="Basic VA lifecycle" make test-e2e-smoke
Run linter:
make lint
# Auto-fix linting issues
make lint-fix
If you modify the VariantAutoscaling CRD in api/v1alpha1/:
- Generate updated manifests and code:
  make manifests generate
- Update CRD documentation:
  make crd-docs
- Verify CRD changes:
  kubectl explain variantautoscaling.spec
Build the controller binary:
make build
Run controller locally (connects to configured cluster):
make run
Build Docker image:
make docker-build IMG=<your-registry>/wva-controller:tag
Deploy to cluster:
make deploy IMG=<your-registry>/wva-controller:tag
Deploy with llm-d for testing:
make deploy-llm-d-wva-emulated-on-kind IMG=<your-registry>/wva-controller:tag
When making code changes, update relevant documentation in:
- docs/user-guide/ - User-facing changes (CRD changes, new features)
- docs/developer-guide/ - Development workflow changes
- docs/integrations/ - Integration guide updates
- docs/design/ - Architecture or design changes
- README.md - High-level feature changes
Verify all commands and examples work:
# Test installation steps from docs
# Test configuration examples
# Verify all code snippets are correct
- Use the logger package from internal/logger
- Always use backoff retries for Kubernetes API calls (see internal/utils)
- Update Kubernetes conditions for status visibility
- Emit metrics for observability
When modifying queue models in pkg/analyzer/:
- Ensure mathematical correctness
- Add comprehensive unit tests
- Validate against real workload data when possible
- Document assumptions and limitations
When modifying solvers in pkg/solver/:
- Consider computational complexity
- Test edge cases (zero load, overload, etc.)
- Ensure feasibility checking
- Document algorithm choices
- Check Developer Guide
- Review existing code and tests
- Search GitHub Issues
- Ask in GitHub Discussions
- Attend llm-d community meetings
# Option 1: Run outside cluster (connects to KUBECONFIG cluster)
make run
# Option 2: Deploy to Kind cluster
make deploy-llm-d-wva-emulated-on-kind
# View controller logs
kubectl logs -n workload-variant-autoscaler-system \
-l control-plane=controller-manager --tail=100 -f
# Check VariantAutoscaling status
kubectl describe variantautoscaling <name> -n <namespace>
# Check emitted metrics
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace>/wva_desired_replicas" | jq
# Destroy Kind cluster
make destroy-kind-cluster
# Undeploy from cluster
make undeploy
# Uninstall CRDs
make uninstall
Before submitting your PR, ensure:
- make test passes
- make lint passes (fix issues with make lint-fix)
- make test-e2e-smoke passes (if controller logic changed; use make test-e2e-full for the full suite)
- Documentation updated (if user-facing changes)
- CRD docs regenerated (if CRD changed): make crd-docs
- Commit messages follow conventional commits
- PR description clearly explains the change
By contributing, you agree that your contributions will be licensed under the Apache License 2.0.
Thank you for contributing to Workload-Variant-Autoscaler!