feat: add karpenter_pods_drained_total metric to track pod draining by reason #2044
[APPROVALNOTIFIER] This PR is NOT APPROVED.
This pull-request has been approved by: omerap12. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
…y reason Signed-off-by: Omer Aplatony <[email protected]>
c5cfc20 to e1588e7
Pull Request Test Coverage Report for Build 13618713849
💛 - Coveralls
/assign @engedaam
crmetrics.Registry,
prometheus.CounterOpts{
    Namespace: metrics.Namespace,
    Subsystem: metrics.NodeSubsystem,
I think this needs to be in the metrics.PodsSubsystem. As it stands right now this is called karpenter_nodes_pods_drained_total, which doesn't sound right.
Sure
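The naming concern raised above can be sketched without the Prometheus client. This is a stdlib-only illustration, assuming the usual rule that a fully qualified metric name is the non-empty namespace, subsystem, and name joined with underscores; `fqName` is a hypothetical helper, and the string values ("karpenter", "nodes", "pods", "pods_drained_total", "drained_total") are assumptions inferred from the metric names quoted in this thread, not taken from the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// fqName is a hypothetical helper that joins the non-empty parts of a
// metric name with underscores. It returns "" when name is empty.
func fqName(namespace, subsystem, name string) string {
	if name == "" {
		return ""
	}
	parts := make([]string, 0, 3)
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	// Assumed values: a "nodes" subsystem combined with a name that
	// already contains "pods" yields the awkward name from the review.
	fmt.Println(fqName("karpenter", "nodes", "pods_drained_total"))
	// Assumed values: moving to a "pods" subsystem yields the name in
	// the PR title.
	fmt.Println(fqName("karpenter", "pods", "drained_total"))
}
```

Under these assumptions, the subsystem is what decides whether the exported name reads as a node metric or a pod metric, which is why the reviewer asks for the change there.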
I also think we need a rebase with one of the latest changes that was made to this section of code
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
It might also be nice to add a test to make sure that we are properly emitting the metric -- I think you should be able to find similar metric tests for node eviction requests total
Fixes #2021
Description
Implements a new Prometheus metric to count pods drained during node termination, labeled by the reason for draining. This provides visibility into the number of pods affected by different termination scenarios.
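To make "labeled by the reason for draining" concrete, here is a minimal stdlib-only stand-in for a counter with one label. It is not the Prometheus client and not the code in this PR: `reasonCounter` and its methods are hypothetical, and the reason strings are placeholders rather than the values Karpenter emits.

```go
package main

import (
	"fmt"
	"sync"
)

// reasonCounter is a hypothetical stand-in for a counter with a single
// "reason" label. Each distinct reason is tracked as its own series.
type reasonCounter struct {
	mu     sync.Mutex
	counts map[string]float64
}

func newReasonCounter() *reasonCounter {
	return &reasonCounter{counts: make(map[string]float64)}
}

// Inc adds one to the series identified by reason.
func (c *reasonCounter) Inc(reason string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[reason]++
}

// Value returns the current count for reason (zero if never incremented).
func (c *reasonCounter) Value(reason string) float64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[reason]
}

func main() {
	podsDrained := newReasonCounter()
	// Placeholder reasons: one increment per drained pod.
	for _, reason := range []string{"reason-a", "reason-a", "reason-b"} {
		podsDrained.Inc(reason)
	}
	fmt.Println(podsDrained.Value("reason-a"), podsDrained.Value("reason-b"))
}
```

A test for the real metric would follow the same shape: trigger a drain, then assert that the series for the expected reason increased by the number of pods drained.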
How was this change tested?
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.