|
| 1 | +# Vertical Pod Autoscaler (VPA) |
| 2 | + |
| 3 | +VPA monitors actual CPU/memory usage and recommends optimal resource requests for pods. |
| 4 | + |
| 5 | +## How It Works |
| 6 | + |
| 7 | +VPA is deployed in **Off mode** — it generates recommendations but does not apply them. A Kyverno ClusterPolicy (`vpa-auto-create`) automatically creates a VPA resource for every Deployment and StatefulSet in the cluster (excluding system namespaces). |
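For illustration, a VPA object in Off mode for a single Deployment might look like the manifest printed below. This is a minimal sketch: the helper name `vpa_off_manifest` and the names `my-app` / `my-namespace` are hypothetical, and the actual template used by `vpa-auto-create` is not shown in this document and may differ.

```bash
# Print a VPA manifest in Off mode for a Deployment.
# The function name and its arguments are placeholders for illustration.
vpa_off_manifest() {
  local name="$1" namespace="$2"
  cat <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: ${name}
  namespace: ${namespace}
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ${name}
  updatePolicy:
    # Quoted so YAML parsers do not read Off as a boolean.
    updateMode: "Off"
EOF
}

# Example (validates locally without changing the cluster):
#   vpa_off_manifest my-app my-namespace | kubectl apply --dry-run=client -f -
```

The function only prints the manifest, so it can be inspected or piped into `kubectl` as needed.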
| 8 | + |
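| 9 | +When you're ready to let VPA auto-tune, change the `updateMode` to `InPlaceOrRecreate`. This mode relies on in-place pod resize (a K8s 1.35 GA feature): VPA first tries to resize running pods in place and falls back to evicting and recreating them when an in-place resize is not possible. |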
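One way to switch a single VPA to a different mode is a merge patch. The sketch below is hedged: the helper names `vpa_mode_patch` and `set_vpa_mode` are hypothetical, and if the Kyverno policy keeps generated resources in sync, a manual patch may be reverted, in which case the change would belong in the policy itself.

```bash
# Build the JSON merge patch for a given update mode.
vpa_mode_patch() {
  printf '{"spec":{"updatePolicy":{"updateMode":"%s"}}}' "$1"
}

# Apply the patch to one VPA. Name and namespace are supplied by the caller.
set_vpa_mode() {
  local name="$1" namespace="$2" mode="$3"
  kubectl patch vpa "$name" -n "$namespace" --type merge \
    -p "$(vpa_mode_patch "$mode")"
}

# Example:
#   set_vpa_mode my-app my-namespace InPlaceOrRecreate
```

A merge patch is used because VPA is a custom resource, and custom resources do not support strategic merge patches.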
| 10 | + |
| 11 | +## Reading Recommendations |
| 12 | + |
| 13 | +```bash |
| 14 | +# Quick summary of all VPA recommendations (first container of each VPA only) |
| 15 | +kubectl get vpa -A -o custom-columns=\ |
| 16 | +NAMESPACE:.metadata.namespace,\ |
| 17 | +NAME:.metadata.name,\ |
| 18 | +CPU:.status.recommendation.containerRecommendations[0].target.cpu,\ |
| 19 | +MEM:.status.recommendation.containerRecommendations[0].target.memory |
| 20 | + |
| 21 | +# Full detail for a specific app |
| 22 | +kubectl describe vpa <name> -n <namespace> |
| 23 | +``` |
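The summary command above reads only `containerRecommendations[0]`, so multi-container pods show just their first container. A possible alternative, assuming `jq` is installed, lists the target for every container. The helper name `vpa_targets_tsv` is hypothetical.

```bash
# Read `kubectl get vpa -A -o json` output on stdin and print one
# tab-separated line per container: namespace, VPA name, container, CPU, memory.
vpa_targets_tsv() {
  jq -r '
    .items[]
    | .metadata as $m
    | (.status.recommendation.containerRecommendations // [])[]
    | [$m.namespace, $m.name, .containerName, .target.cpu, .target.memory]
    | @tsv'
}

# Example:
#   kubectl get vpa -A -o json | vpa_targets_tsv
```

VPAs that have no recommendation yet produce no output lines rather than an error.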
| 24 | + |
| 25 | +Recommendations include four values per container: |
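| 26 | +- **target**: the request VPA recommends setting |
| 27 | +- **lowerBound**: the lowest request VPA considers sufficient; below it the container is likely under-provisioned |
| 28 | +- **upperBound**: the highest request VPA would recommend; above it resources are likely wasted |
| 29 | +- **uncappedTarget**: the target VPA would recommend if any `minAllowed`/`maxAllowed` constraints in the resource policy were ignored |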
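As a rough aid for reading these values, the sketch below classifies a current request against the lower and upper bounds. It is a simplification for manual review, not VPA's own decision logic; the helper name `classify_request` is hypothetical, and all three arguments are assumed to be plain integers in the same unit (for example millicores or MiB).

```bash
# Compare a current request with a recommendation's bounds.
# Arguments: current lowerBound upperBound (integers, same unit).
classify_request() {
  local current="$1" lower="$2" upper="$3"
  if [ "$current" -lt "$lower" ]; then
    echo "below-lowerBound"
  elif [ "$current" -gt "$upper" ]; then
    echo "above-upperBound"
  else
    echo "within-bounds"
  fi
}

# Example: a 50m CPU request against bounds of 100m and 800m
#   classify_request 50 100 800
```

Requests that fall outside the bounds are the ones most worth reviewing first.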
| 30 | + |
| 31 | +## Components |
| 32 | + |
| 33 | +| Component | Purpose | |
| 34 | +|-----------|---------| |
| 35 | +| **Recommender** | Analyzes metrics, generates recommendations | |
| 36 | +| **Updater** | Applies changes to running pods when mode is not `Off` or `Initial` (evicts or resizes in place) | |
| 37 | +| **Admission Controller** | Sets resources on new pods when mode is not Off | |
| 38 | + |
| 39 | +## Dependencies |
| 40 | + |
| 41 | +- **metrics-server** (`infrastructure/controllers/metrics-server/`) — provides the `metrics.k8s.io` API that VPA reads from |
| 42 | +- **Kyverno** — auto-generates VPA resources via `vpa-auto-create` ClusterPolicy |
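A quick way to confirm these dependencies are present might look like the sketch below. The helper name `check_vpa_dependencies` is hypothetical; the resource names are the standard ones registered by metrics-server and VPA, plus the policy name given above.

```bash
# Return non-zero if any dependency listed above is missing.
check_vpa_dependencies() {
  # metrics-server registers this APIService for the metrics.k8s.io API.
  kubectl get apiservice v1beta1.metrics.k8s.io || return 1
  # The VPA custom resource definition must be installed.
  kubectl get crd verticalpodautoscalers.autoscaling.k8s.io || return 1
  # The Kyverno policy that generates VPA resources.
  kubectl get clusterpolicy vpa-auto-create || return 1
}

# Example:
#   check_vpa_dependencies && echo "VPA dependencies present"
```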
| 43 | + |
| 44 | +## Notes |
| 45 | + |
| 46 | +- VPA only tracks CPU and memory — GPU (`nvidia.com/gpu`) and ephemeral-storage are not managed |
| 47 | +- Recommendations need a few hours of pod runtime to stabilize |
| 48 | +- Upper bounds will be very wide initially and tighten over days |
| 49 | +- GPU workloads will show low CPU/memory recommendations because most of their compute runs on the GPU and their working data sits in VRAM, neither of which VPA observes |