VPA: add performance benchmarking #9192
Conversation
Here's how I tested this locally:
# in vertical-pod-autoscaler/
$ KIND_VERSION="kindest/node:v1.35.0@sha256:452d707d4862f52530247495d180205e029056831160e22870e37e3f6c1ac31f"
$ kind delete cluster --name kind
$ kind create cluster --image=${KIND_VERSION} --config=benchmark/kind-config.yaml
$ ./hack/deploy-for-e2e-locally.sh full-vpa
$ go build -C benchmark -o ../bin/vpa-benchmark .
$ bin/vpa-benchmark --profile=xxlarge
... results

It is possible that my local setup is more or less powerful than anyone else's trying this, which might fudge the timings of the sleeps and such, so I'd love some feedback on how to make this better if it doesn't work for everyone. And yes, loops can actually take longer than the allotted updater-interval or recommender-interval times if there is too much going on in the cluster, such as evictions. I had to manually increase the intervals so that wouldn't happen, but I'm assuming that on someone else's setup the loop could still take too long and not finish, or could be very fast and not need all that time, since Docker shares local machine resources and all that.
FWIW: I did try this out on Omer's PR, and I'm not sure there was a notable difference. The bulk of the time is spent on:
# before
========== Results ==========
┌───────────────┬───────────────────┐
│ STEP │ XXLARGE ( 1000 ) │
├───────────────┼───────────────────┤
│ AdmissionInit │ 0.0004s │
│ EvictPods │ 98.6963s │
│ FilterPods │ 0.0925s │
│ ListPods │ 0.0025s │
│ ListVPAs │ 0.0027s │
│ total │ 98.7945s │
└───────────────┴───────────────────┘
# after
========== Results ==========
┌───────────────┬───────────────────┐
│ STEP │ XXLARGE ( 1000 ) │
├───────────────┼───────────────────┤
│ AdmissionInit │ 0.0003s │
│ EvictPods │ 99.1713s │
│ FilterPods │ 0.0922s │
│ ListPods │ 0.0020s │
│ ListVPAs │ 0.0026s │
│ total │ 99.2683s │
└───────────────┴───────────────────┘
omerap12
left a comment
I’m planning to go through the code in depth, but overall I like it. Especially the use of KWOK.
There’s one bug you can reproduce by running:
$ ./bin/vpa-benchmark
=== VPA Benchmark with KWOK ===
Profiles: [small], Runs per profile: 1
Installing KWOK...
KWOK already installed
Creating KWOK fake node...
KWOK node already exists
Configuring VPA deployments...
Configured VPA deployments with increased QPS/burst and updater interval
========== Profile: small (25 VPAs) ==========
Scaling down VPA components...
Waiting for 6 VPA pods to terminate...
Waiting for 2 VPA pods to terminate...
VPA components scaled down
Deleting all VPA checkpoints...
Deleted 0 VPA checkpoints
Cleaning up existing benchmark resources...
Creating 25 ReplicaSets (2 pods each, 50 total)...
Creating 25 VPAs...
Waiting for 50 KWOK pods to be running...
Pods: 0/50
Pods: 0/50
...
W0206 13:21:42.996430 69510 main.go:174] Run 1 failed: timeout waiting for pods: context deadline exceeded
No successful runs for profile small!
Final cleanup...
Benchmark completed successfully.
I guess this is because I didn't enter a profile: https://github.com/kubernetes/autoscaler/pull/9192/files#diff-95216a5af7dfd121672dd3633edd84664820b5e5390eff6902f95c2bb52ede51R69
Moreover, should we also use this https://kwok.sigs.k8s.io/docs/user/resource-usage-configuration/ ?
stagesURL := fmt.Sprintf("https://github.com/kubernetes-sigs/kwok/releases/download/%s/stage-fast.yaml", kwokVersion)
cmd = exec.CommandContext(ctx, "kubectl", "apply", "-f", stagesURL)
if output, err := cmd.CombinedOutput(); err != nil {
	return fmt.Errorf("kubectl apply stage-fast.yaml failed: %v\n%s", err, output)
I’m debating whether we want the Go code to handle the infrastructure setup as well.
We’ll probably build a bash script on top of it, so I’m wondering if the script should install all required tools (e.g., KWOK, yq, etc.), while the Go code focuses solely on benchmarking.
Yeah, I was debating this as well.
I think having a bash script outside makes sense to keep the binary focused on benchmarking. I was mostly thinking about keeping all the code contained in one program, but that makes more sense to me. Thanks for the feedback 👍
I'll look into the bug, thanks!
EDIT: Hmm, I cannot recreate the bug; there might be something going on in your environment. For example, is the kwok-controller not marking those pods as ready?
Hm... I'm wondering how we could benefit from controlling the resource usage, and what other situations we could performance-test using this CRD. I guess it would help to be explicit about what resource usage the recommendations should be generated from, so that we can always trigger updates/evictions from the updater. I will take a look to see if I can add ClusterResourceUsage to simulate usage outside of the initial resource requests of each pod. For now, I think the recommender is just hitting minCPU and minMemory and recommending those, which will trigger evictions anyway.
iamzili
left a comment
Good work Max! The PR seems reasonable to me; I've just left a few comments.
Furthermore, based on the code, the main goal of this binary is to check the duration of each phase in the updater within RunOnce. Are there any plans to do the same for the recommender, or even for the admission controller?
// configureVPADeployments modifies VPA deployments with increased QPS/burst and updater interval.
// Uses kubectl get/yq/kubectl apply.
func configureVPADeployments(ctx context.Context) error {
just a side note related to this function: I think it would be great to make the ./hack/deploy-for-e2e-locally.sh script more versatile so that it can deploy the VPA components with custom flags, which would make this function obsolete. I proposed such an improvement in the PR whose goal is to incorporate the helm chart into ./hack/deploy-for-e2e-locally.sh:
Ah, I wasn't aware of that PR. I'll keep it in mind, thanks! I want to get rid of the setup in the Go code anyway, so I'll wait for that to resolve.
Original:

for i := 0; i < count; i++ {
	i := i

Suggested change:

for i := range count {
	return err
}
fmt.Println(" Waiting for recommender to be ready...")
if err := waitForVPAPodReady(ctx, kubeClient, "vpa-recommender"); err != nil {
I think the waitForVPAPodReady function can also be used to wait until the admission controller is ready
// Step 10: Wait for updater's first loop
// The updater uses time.Tick which waits the full interval before the first tick
// We set --updater-interval=2m, so wait 2 minutes for the first loop to start
fmt.Println("Waiting 2 minutes for updater's first loop...")
time.Sleep(2 * time.Minute)
Let me ask: what is the reason for this sleep? I'm asking because we could rely entirely on the mechanism implemented in the waitForAndScrapeMetrics function (i.e. polling until the vpa_updater_execution_latency_seconds_sum{step="total"} metric is present).
The reason is that we know for sure the first tick of the updater and recommender happens at the interval time, not at t=0, so I'm sleeping to avoid wasting any execution on polling. I could in theory just add more time to the timeout of waitForAndScrapeMetrics, but again, it's guaranteed we will have to wait that long anyway.
Yeah, it's just something I came across. I'm not sure whether we should use it yet, but it might be worth considering as the benchmark evolves.
This commit adds VPA benchmarking to the project. It adds code for a benchmark binary which measures the internal latency of the VPA updater loop. This allows users to test changes and see how they will affect the performance of the VPA. The code and documentation live in the vertical-pod-autoscaler/benchmark directory. Signed-off-by: Max Cao <macao@redhat.com>
Adds a new GitHub Action which runs the vpa-benchmark e2e. Signed-off-by: Max Cao <macao@redhat.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: maxcao13
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing /approve in a comment.
Yes! Hopefully if this is approved, then a followup to add scraping the recommender and displaying metrics would come.
Sounds good, I don't think we need it yet. AFAICT it can just simulate metrics, which would be good for benchmarking the recommender, but not for the updater.
I've added a commit here to add the benchmark as a GitHub Action, and it's running right now. I can split it into a new PR if we like once all this is approved, but I put it here to show it off, and also to see if the GitHub infra agrees with what I've done :-)
EDIT: https://github.com/kubernetes/autoscaler/actions/runs/22002078972/job/63576978582?pr=9192
What type of PR is this?
/kind cleanup
/kind documentation
/area vertical-pod-autoscaler
What this PR does / why we need it:
With many changes coming to the VPA project, it would be beneficial to know whether any of them could cause a performance regression, or whether a change actually improves performance. For example, a large change like adding the InPlace update mode, or a performance enhancement such as the work done in #8807. This PR proposes a new benchmarking tool for the VPA which can measure the latency of VPA components under load in a real cluster (kind), not a simulated one. We can then use the results on a pull request to compare against the main branch.
Ideally, this would run as a GitHub Action or a Prow job in CI, giving reproducible and consistent environments for each change to the VPA, which we could then compare to main/master easily. For now, this is simply a tool to be run locally; CI integration would ideally come in a follow-up PR.
Details about the benchmark itself, internals, quick-start, etc., are in the README.md in the vertical-pod-autoscaler/benchmark directory.
Which issue(s) this PR fixes:
Partially fixes #8866
Special notes for your reviewer:
This PR was done in tandem with Cursor Agent mode using the Cursor IDE.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: