Bump prometheus-operator version to 0.88.0 #3789

ronaldngounou · 2026-01-20T23:31:48Z

Context

The prometheus-operator manifest is out of date: it was last updated about 10 months ago. This PR upgrades prometheus-operator to 0.88.0, the latest version at the time this PR was raised.

Summary

To bump prometheus-operator, I reconciled every file in the manifests folder against the latest upstream manifests using a 3-way merge.
I created a new presubmit, in the same folder as xref, that runs at 100 nodes. As a consequence, the 100-node scale tests become mandatory presubmits before the 5000-node scale tests are run. This also lets the kops job capture logs from prometheus-operator. Done in test-infra#36280 (Add presubmit pull-perf-tests-ec2-master-scale-performance-100).
Updated Prometheus to v3.9.1, the latest version at this time according to the release history.

Impact

This PR fixes the APIResponsivenessPrometheus regression at the source in order to:

address any Kubernetes release blocking by vendors.
unblock the Kubernetes 1.36 release by the 1.36 Release Team.

Logs

At the moment, the job (https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/ci-kubernetes-e2e-kops-aws-scale-amazonvpc-using-cl2/2013507147345694720) has been failing, as observed in the build logs below as well as in the scalability test failures.

From the logs:

{ Failure :0
[measurement call APIResponsivenessPrometheus - APIResponsivenessPrometheus error: top latency metric: there should be no high-latency requests, but: [got: &{Resource:events Subresource: Verb:DELETE Scope:namespace Latency:perc50: 59.925s, perc90: 1m0s, perc99: 1m0s Count:66 SlowCount:66}; expected perc99 <= 30s]]
:0}
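
To make the failure above concrete, here is a minimal Python sketch of the check the message describes: it extracts the perc99 latency from the logged line and compares it with the 30s limit quoted in the message. This is only an illustration, not clusterloader2's actual implementation. The names `parse_go_duration`, `perc99_seconds`, and `PERC99_THRESHOLD_SECONDS` are hypothetical.

```python
import re

# Threshold copied from the failure message ("expected perc99 <= 30s").
PERC99_THRESHOLD_SECONDS = 30.0

# Seconds per unit for the duration units handled by this sketch.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3}
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")


def parse_go_duration(text: str) -> float:
    """Convert a Go-style duration such as '1m0s' or '59.925s' to seconds."""
    parts = _DURATION_PART.findall(text)
    # Reject input that is not fully covered by number+unit pairs.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return sum(float(num) * _UNIT_SECONDS[unit] for num, unit in parts)


def perc99_seconds(log_line: str) -> float:
    """Extract the measured perc99 latency (in seconds) from a failure line."""
    match = re.search(r"perc99: ([0-9a-z.]+)", log_line)
    if match is None:
        raise ValueError("no perc99 value found")
    return parse_go_duration(match.group(1))


if __name__ == "__main__":
    line = (
        "Resource:events Subresource: Verb:DELETE Scope:namespace "
        "Latency:perc50: 59.925s, perc90: 1m0s, perc99: 1m0s "
        "Count:66 SlowCount:66}; expected perc99 <= 30s"
    )
    measured = perc99_seconds(line)
    print(f"perc99={measured}s, within limit: {measured <= PERC99_THRESHOLD_SECONDS}")
```

For the logged DELETE events call, the measured perc99 of 1m0s is twice the 30s limit, which is why the measurement is reported as a failure.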

From https://storage.googleapis.com/kubernetes-ci-logs/logs/ci-kubernetes-e2e-kops-aws-scale-amazonvpc-using-cl2/2013507147345694720/build-log.txt, the ec2-master-scale test is failing due to:

Failure: no endpoints available for service "prometheus-k8s" (ServiceUnavailable)
W0120 07:27:54.118590   22340 util.go:72] error while calling prometheus api: the server is currently unable to handle the request (get services http:prometheus-k8s:9090), response: k8s v1 Status

References

https://kubernetes.slack.com/archives/C09QZTRH7/p1767977126452879

Related PRs:

[kubernetes/test-infra] Add presubmit pull-perf-tests-ec2-master-scale-performance-100 (test-infra#36280): adds an optional presubmit that mimics the master scale test at 100 nodes.

cc: @upodroid @dims @mengqiy

ronaldngounou · 2026-01-21T01:44:44Z

/retest

ronaldngounou · 2026-01-21T21:53:09Z

/retest

upodroid · 2026-01-22T07:01:44Z

/test pull-perf-tests-ec2-master-scale-performance-100

ronaldngounou · 2026-01-22T08:15:23Z

/test pull-perf-tests-ec2-master-scale-performance-100

To help address the ec2-master-scale-performance job failures in https://testgrid.k8s.io/amazon-ec2-release#ec2-master-scale-performance, this PR upgrades prometheus-operator to 0.88.0. This should help with the job (https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/ci-kubernetes-e2e-kops-aws-scale-amazonvpc-using-cl2/2013507147345694720), which is failing with the Prometheus "no endpoints available for service" ServiceUnavailable error observed in the build logs.

Signed-off-by: Ronald Ngounou <rngounou@amazon.com>

k8s-ci-robot · 2026-01-22T09:33:55Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ronaldngounou
Once this PR has been reviewed and has the lgtm label, please assign marseel for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

clusterloader2/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ronaldngounou · 2026-01-22T09:34:44Z

/test pull-perf-tests-ec2-master-scale-performance-100

ronaldngounou · 2026-01-22T18:30:19Z

/test pull-perf-tests-ec2-master-scale-performance-100

ronaldngounou · 2026-01-22T18:30:37Z

/test pull-perf-tests-gce-master-scale-performance-100

ronaldngounou · 2026-01-22T23:09:48Z

/test pull-perf-tests-gce-master-scale-performance-100

ronaldngounou · 2026-01-22T23:10:22Z

/test pull-perf-tests-ec2-master-scale-performance-100

ronaldngounou · 2026-01-24T01:15:55Z

clusterloader2/pkg/prometheus/manifests/prometheus-prometheus.yaml

  podMonitorSelector: {}
  priorityClassName: system-node-critical
-  version: v2.40.0
+  version: v3.9.1


Latest prometheus release:
https://github.com/prometheus/prometheus/releases

ronaldngounou · 2026-01-24T01:18:11Z

clusterloader2/pkg/prometheus/manifests/prometheus-prometheus.yaml

  logLevel: debug
  enableAdminAPI: true
-  baseImage: gcr.io/k8s-testimages/quay.io/prometheus/prometheus
+  image: quay.io/prometheus/prometheus


Addresses

msg="field \"spec.baseImage\" is deprecated, field \"spec.image\" should be used instead" component=prometheus-controller key=monitoring/k8s

in https://storage.googleapis.com/kubernetes-ci-logs/pr-logs/pull/perf-tests/3789/pull-perf-tests-ec2-master-scale-performance-100/2014250539839131648/artifacts/cluster-info/monitoring/prometheus-operator-85dc565557-87snd/prometheus-operator.log

k8s-ci-robot requested review from marseel and wojtek-t January 20, 2026 23:32

k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jan 20, 2026

ronaldngounou force-pushed the bump-prometheus-operator-to088 branch from 3c19ec2 to 6a310e0 Compare January 21, 2026 00:49

k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 21, 2026

ronaldngounou force-pushed the bump-prometheus-operator-to088 branch 3 times, most recently from bf0702e to 6871432 Compare January 21, 2026 00:55

ronaldngounou force-pushed the bump-prometheus-operator-to088 branch from 93d75ee to 93e0e4d Compare January 21, 2026 18:20

ronaldngounou mentioned this pull request Jan 21, 2026

Add presubmit pull-perf-tests-ec2-master-scale-performance-100 kubernetes/test-infra#36280

Merged

ronaldngounou force-pushed the bump-prometheus-operator-to088 branch from f2798eb to 3a979ac Compare January 22, 2026 08:08

ronaldngounou force-pushed the bump-prometheus-operator-to088 branch from 3a979ac to 68c64be Compare January 22, 2026 09:27

ronaldngounou force-pushed the bump-prometheus-operator-to088 branch from 68c64be to 51068fa Compare January 22, 2026 09:33

ronaldngounou changed the title ~~Bump prometheus-operator version to 0.88.0 to resolve ec2-master-scale-performance failures~~ Resolve ec2-master-scale-performance regressions by updating prometheus-operator version to 0.88.0 Jan 22, 2026

ronaldngounou changed the title ~~Resolve ec2-master-scale-performance regressions by updating prometheus-operator version to 0.88.0~~ Bunmp prometheus-operator version to 0.88.0 to address ec2-master-scale-performance slowness Jan 24, 2026

ronaldngounou changed the title ~~Bunmp prometheus-operator version to 0.88.0 to address ec2-master-scale-performance slowness~~ Bump prometheus-operator version to 0.88.0 to mitigate ec2-master-scale-performance slowness failures Jan 24, 2026

ronaldngounou changed the title ~~Bump prometheus-operator version to 0.88.0 to mitigate ec2-master-scale-performance slowness failures~~ Bump prometheus-operator version to 0.88.0 Jan 24, 2026
