Commit c1154aa

VPA: add performance benchmarking (#9192)
* ci(perf): add VPA performance benchmarking

  This commit adds VPA benchmarking to the project. It adds code for a benchmark binary which measures the internal latency of the VPA updater loop. This allows users to test changes and how it will affect the performance of the VPA. The code and documentation lives in the vertical-pod-autoscaler/benchmark directory.

  Signed-off-by: Max Cao <macao@redhat.com>

* ci(perf): add vpa-benchmark github action

  Adds a new GitHub action which runs the vpa-benchmark e2e.

  Signed-off-by: Max Cao <macao@redhat.com>

---------

Signed-off-by: Max Cao <macao@redhat.com>
1 parent f1dc9e3 commit c1154aa

8 files changed: +1425 -0 lines changed

Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
name: Vertical Pod Autoscaler

on:
  push:
    paths:
      - 'vertical-pod-autoscaler/pkg/**'
      - 'vertical-pod-autoscaler/benchmark/**'
      - 'vertical-pod-autoscaler/go.mod'
  pull_request:
    paths:
      - 'vertical-pod-autoscaler/pkg/**'
      - 'vertical-pod-autoscaler/benchmark/**'
      - 'vertical-pod-autoscaler/go.mod'

permissions:
  contents: read
  checks: write

jobs:
  benchmark:
    name: benchmark
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6

      - name: Set up Go
        uses: actions/setup-go@v6
        with:
          go-version: '1.25'
          cache-dependency-path: |
            vertical-pod-autoscaler/go.sum
            vertical-pod-autoscaler/e2e/go.sum
            vertical-pod-autoscaler/benchmark/go.sum

      - name: Generate benchmark Kind config
        run: |
          cp .github/kind-config.yaml /tmp/benchmark-kind-config.yaml
          cat >> /tmp/benchmark-kind-config.yaml <<'EOF'
          kubeadmConfigPatches:
          - |
            kind: ClusterConfiguration
            apiServer:
              extraArgs:
                max-requests-inflight: "2000"
                max-mutating-requests-inflight: "1000"
            controllerManager:
              extraArgs:
                concurrent-replicaset-syncs: "500"
                kube-api-qps: "500"
                kube-api-burst: "1000"
          containerdConfigPatches:
          - |-
            [plugins."io.containerd.grpc.v1.cri".registry]
              config_path = "/etc/containerd/certs.d"
          EOF

      - name: Set up KinD cluster
        uses: helm/kind-action@v1.13.0
        with:
          cluster_name: 'kind'
          config: /tmp/benchmark-kind-config.yaml
          registry: 'true'
          registry_name: 'kind-registry'
          registry_port: '5001'

      - name: Install KWOK
        run: ./vertical-pod-autoscaler/benchmark/hack/install-kwok.sh

      - name: Install and configure VPA
        run: |
          ./vertical-pod-autoscaler/hack/deploy-for-e2e-locally.sh full-vpa
          ./vertical-pod-autoscaler/benchmark/hack/configure-vpa.sh

      - name: Build and run benchmark
        run: |
          go build -C vertical-pod-autoscaler/benchmark -o vpa-benchmark .
          ./vertical-pod-autoscaler/benchmark/vpa-benchmark --profile=medium,xlarge,xxlarge --output=results.csv
Lines changed: 175 additions & 0 deletions
@@ -0,0 +1,175 @@
# VPA Performance Benchmark

Measures VPA component latencies using KWOK (Kubernetes WithOut Kubelet) to simulate pods without real resource consumption.

> **Note:** Currently only updater metrics are collected. Recommender metrics are planned for the future.

<!-- toc -->
- [Prerequisites](#prerequisites)
- [Quick Start (Local)](#quick-start-local)
- [Manual Setup](#manual-setup)
- [What It Does](#what-it-does)
- [Profiles](#profiles)
- [Flags](#flags)
- [Metrics Collected](#metrics-collected)
  - [Updater Metrics](#updater-metrics)
- [Scripts](#scripts)
- [Cleanup](#cleanup)
- [Notes](#notes)
  - [Performance Optimizations](#performance-optimizations)
  - [Caveats](#caveats)
<!-- /toc -->

## Prerequisites

- Go 1.21+
- kubectl
- Kind
- yq

## Quick Start (Local)

The `full-benchmark.sh` script handles everything end-to-end: it creates a Kind cluster, deploys VPA, installs KWOK, configures VPA for benchmarking, and builds and runs the benchmark.

```bash
cd vertical-pod-autoscaler
./benchmark/hack/full-benchmark.sh --profile=small --output=results.csv
```

Any benchmark flag can be passed straight through to the script:

```bash
./benchmark/hack/full-benchmark.sh --profile=small,medium,large --runs=3 --output=results.csv
```

## Manual Setup

If you prefer to run each step individually (or if the cluster already exists):

```bash
# Run these from the vertical-pod-autoscaler directory; .github/ lives at the repository root.

# 1. Create a Kind cluster (the full-benchmark.sh script appends benchmark-specific
#    kubeadmConfigPatches to .github/kind-config.yaml automatically; for manual setup,
#    create a cluster with the base config or your own)
kind create cluster --config=../.github/kind-config.yaml

# 2. Deploy VPA
./hack/deploy-for-e2e-locally.sh full-vpa

# 3. Install KWOK and create fake node
./benchmark/hack/install-kwok.sh

# 4. Configure VPA deployments for benchmark (QPS/burst, updater interval)
./benchmark/hack/configure-vpa.sh

# 5. Build and run
go build -C benchmark -o ../bin/vpa-benchmark .
./bin/vpa-benchmark --profile=small --output=results.csv
```

## What It Does

The benchmark program (`main.go`) assumes the cluster is already set up with VPA, KWOK, and the fake node. It then:

1. For each profile run:
   - Scales down VPA components
   - Cleans up previous benchmark resources
   - Creates ReplicaSets with fake pods assigned directly to the KWOK node (bypassing the scheduler)
   - Creates VPAs targeting those ReplicaSets
   - Scales up the recommender and waits for recommendations
   - Scales up the updater and waits for its loop to complete
   - Scrapes the `vpa_updater_execution_latency_seconds_sum` metrics
2. Prints results to stdout and, if `--output` is set, writes them to a CSV file

Example output from `bin/vpa-benchmark --profile=small,large,xxlarge`:

```text
========== Results ==========
┌───────────────┬───────────────┬────────────────┬───────────────────┐
│ STEP │ SMALL ( 25 ) │ LARGE ( 250 ) │ XXLARGE ( 1000 ) │
├───────────────┼───────────────┼────────────────┼───────────────────┤
│ AdmissionInit │ 0.0000s │ 0.0001s │ 0.0004s │
│ EvictPods │ 2.4239s │ 24.5535s │ 98.6963s │
│ FilterPods │ 0.0002s │ 0.0020s │ 0.0925s │
│ ListPods │ 0.0001s │ 0.0006s │ 0.0025s │
│ ListVPAs │ 0.0024s │ 0.0030s │ 0.0027s │
│ total │ 2.4267s │ 24.5592s │ 98.7945s │
└───────────────┴───────────────┴────────────────┴───────────────────┘
```

The results of a code change can then be compared with the results from the main branch.
For a fair comparison, run both on the same machine (or a similar one) with the same benchmark settings (profiles and runs).

## Profiles

| Profile | VPAs | ReplicaSets | Pods |
| ------- | ---- | ----------- | ---- |
| small   | 25   | 25          | 50   |
| medium  | 100  | 100         | 200  |
| large   | 250  | 250         | 500  |
| xlarge  | 500  | 500         | 1000 |
| xxlarge | 1000 | 1000        | 2000 |

## Flags

| Flag | Default | Description |
| ---- | ------- | ----------- |
| `--profile` | small | Comma-separated profiles to run. You can run multiple profiles at once (e.g., `--profile=small,medium`). |
| `--runs` | 1 | Iterations per profile. This is used for averaging multiple runs. |
| `--output` | "" | Path to output file for results table (CSV format). Output will always be printed to stdout. |
| `--kubeconfig` | "" | Path to kubeconfig. Required if not using the `KUBECONFIG` env var or `~/.kube/config`. |

## Metrics Collected

### Updater Metrics

| Metric | Description |
| ------ | ----------- |
| `ListVPAs` | List VPA objects |
| `ListPods` | List pods matching VPA targets |
| `FilterPods` | Filter evictable pods |
| `AdmissionInit` | Verify admission controller status |
| `EvictPods` | Evict pods needing updates |
| `total` | Total loop time |

## Scripts

| Script | Purpose |
| ------ | ------- |
| `hack/full-benchmark.sh` | Full local workflow (Kind + VPA + KWOK + configure + benchmark) |
| `hack/install-kwok.sh` | Install KWOK controller and create fake node |
| `hack/configure-vpa.sh` | Configure VPA deployments with benchmark-specific settings |

Environment variables accepted by the scripts:

| Variable | Default | Used by |
| -------- | ------- | ------- |
| `KWOK_VERSION` | `v0.7.0` | `install-kwok.sh` |
| `KWOK_NAMESPACE` | `kube-system` | `install-kwok.sh` |
| `KWOK_NODE_NAME` | `kwok-node` | `install-kwok.sh` |
| `VPA_NAMESPACE` | `kube-system` | `configure-vpa.sh` |
| `KIND_CLUSTER_NAME` | `kind` | `full-benchmark.sh` |

## Cleanup

```bash
kind delete cluster
```

## Notes

### Performance Optimizations

The benchmark includes several performance optimizations:

- `configure-vpa.sh` modifies VPA deployments using `yq`:
  - Sets `--kube-api-qps=100` and `--kube-api-burst=200` on all three components
  - Sets `--updater-interval=2m` on the updater (default is 60s)
- Pods are assigned directly to the KWOK node via `nodeName`, bypassing the scheduler for faster creation
- The `full-benchmark.sh` script appends `kubeadmConfigPatches` to the base `.github/kind-config.yaml` to raise the API server limits (`max-requests-inflight`, `max-mutating-requests-inflight`) and the kube-controller-manager client QPS, so the cluster can handle the large number of API calls
- ReplicaSets are used instead of Deployments to skip the Deployment controller layer and speed up pod creation, while still giving each VPA a `targetRef`

### Caveats

- The updater uses `time.Tick`, which waits the full interval before the first tick, so the benchmark sleeps 2 minutes before polling for metrics.
- The benchmark uses Recreate update mode. In-place scaling is not supported on KWOK pods.
- The benchmark scales down all VPA components at the start of each run, so that caching does not skew the results.
Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
module k8s.io/autoscaler/vertical-pod-autoscaler/benchmark

go 1.25.3

require (
	github.com/olekukonko/tablewriter v1.1.3
	golang.org/x/sync v0.19.0
	k8s.io/api v0.35.1
	k8s.io/apimachinery v0.35.1
	k8s.io/autoscaler/vertical-pod-autoscaler v1.5.1
	k8s.io/client-go v0.35.1
	k8s.io/klog/v2 v2.130.1
)

require (
	github.com/clipperhouse/displaywidth v0.6.2 // indirect
	github.com/clipperhouse/stringish v0.1.1 // indirect
	github.com/clipperhouse/uax29/v2 v2.3.0 // indirect
	github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
	github.com/emicklei/go-restful/v3 v3.13.0 // indirect
	github.com/fatih/color v1.18.0 // indirect
	github.com/fxamacker/cbor/v2 v2.9.0 // indirect
	github.com/go-logr/logr v1.4.3 // indirect
	github.com/go-openapi/jsonpointer v0.22.4 // indirect
	github.com/go-openapi/jsonreference v0.21.4 // indirect
	github.com/go-openapi/swag v0.25.4 // indirect
	github.com/go-openapi/swag/cmdutils v0.25.4 // indirect
	github.com/go-openapi/swag/conv v0.25.4 // indirect
	github.com/go-openapi/swag/fileutils v0.25.4 // indirect
	github.com/go-openapi/swag/jsonname v0.25.4 // indirect
	github.com/go-openapi/swag/jsonutils v0.25.4 // indirect
	github.com/go-openapi/swag/loading v0.25.4 // indirect
	github.com/go-openapi/swag/mangling v0.25.4 // indirect
	github.com/go-openapi/swag/netutils v0.25.4 // indirect
	github.com/go-openapi/swag/stringutils v0.25.4 // indirect
	github.com/go-openapi/swag/typeutils v0.25.4 // indirect
	github.com/go-openapi/swag/yamlutils v0.25.4 // indirect
	github.com/google/gnostic-models v0.7.1 // indirect
	github.com/google/uuid v1.6.0 // indirect
	github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674 // indirect
	github.com/json-iterator/go v1.1.12 // indirect
	github.com/kr/text v0.2.0 // indirect
	github.com/mattn/go-colorable v0.1.14 // indirect
	github.com/mattn/go-isatty v0.0.20 // indirect
	github.com/mattn/go-runewidth v0.0.19 // indirect
	github.com/moby/spdystream v0.5.0 // indirect
	github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
	github.com/modern-go/reflect2 v1.0.3-0.20250322232337-35a7c28c31ee // indirect
	github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
	github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f // indirect
	github.com/olekukonko/cat v0.0.0-20250911104152-50322a0618f6 // indirect
	github.com/olekukonko/errors v1.1.0 // indirect
	github.com/olekukonko/ll v0.1.4-0.20260115111900-9e59c2286df0 // indirect
	github.com/spf13/pflag v1.0.10 // indirect
	github.com/x448/float16 v0.8.4 // indirect
	go.yaml.in/yaml/v2 v2.4.3 // indirect
	go.yaml.in/yaml/v3 v3.0.4 // indirect
	golang.org/x/net v0.50.0 // indirect
	golang.org/x/oauth2 v0.35.0 // indirect
	golang.org/x/sys v0.41.0 // indirect
	golang.org/x/term v0.40.0 // indirect
	golang.org/x/text v0.34.0 // indirect
	golang.org/x/time v0.14.0 // indirect
	google.golang.org/protobuf v1.36.11 // indirect
	gopkg.in/evanphx/json-patch.v4 v4.13.0 // indirect
	gopkg.in/inf.v0 v0.9.1 // indirect
	k8s.io/kube-openapi v0.0.0-20260127142750-a19766b6e2d4 // indirect
	k8s.io/utils v0.0.0-20260210185600-b8788abfbbc2 // indirect
	sigs.k8s.io/json v0.0.0-20250730193827-2d320260d730 // indirect
	sigs.k8s.io/randfill v1.0.0 // indirect
	sigs.k8s.io/structured-merge-diff/v6 v6.3.2 // indirect
	sigs.k8s.io/yaml v1.6.0 // indirect
)

replace k8s.io/autoscaler/vertical-pod-autoscaler => ../
