Description
What happened:
The metric kube_pod_status_reason shows 0 for all reasons, even when some reasons should have a value of 1.
What you expected to happen:
We use Karpenter in our clusters and expect to be able to see when pods change status as a result of actions Karpenter takes. In particular, we expect the Evicted, NodeLost, and Shutdown reasons to show a value of 1 in clusters where consolidation is happening all the time (consolidateAfter is 5m0s). Our Karpenter metrics show that at any given time some pod is being moved, so that pod should show up with a kube_pod_status_reason of Evicted and a value of 1.
How to reproduce it (as minimally and precisely as possible):
This Prometheus query: sum(kube_pod_status_reason) by (reason) shows 0 for every reason, and when charted, those values remain the same over any time interval.
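To rule out a Prometheus scraping or relabeling problem, the same aggregation can be run directly against the raw text that KSM serves on its /metrics endpoint. The following is a minimal sketch, assuming the standard Prometheus text exposition format with no timestamps and no "}" inside label values. The function name sum_by_reason and the SAMPLE text are hypothetical illustrations, not output captured from our cluster.

```python
import re
from collections import defaultdict

# Extracts the value of the `reason` label from a sample line's label set.
REASON_RE = re.compile(r'reason="([^"]*)"')


def sum_by_reason(metrics_text, metric="kube_pod_status_reason"):
    """Sum the samples of `metric` grouped by their `reason` label.

    Mirrors `sum(kube_pod_status_reason) by (reason)` for one scrape.
    """
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip comments (# HELP / # TYPE) and every other metric.
        if not line.startswith(metric + "{"):
            continue
        # Everything before the last "}" is the label set; the value follows.
        labels, _, rest = line.rpartition("}")
        match = REASON_RE.search(labels)
        fields = rest.split()
        if match is None or not fields:
            continue
        totals[match.group(1)] += float(fields[0])
    return dict(totals)


# Hypothetical sample shaped like KSM output; not real cluster data.
SAMPLE = """\
# HELP kube_pod_status_reason The pod status reasons
# TYPE kube_pod_status_reason gauge
kube_pod_status_reason{namespace="default",pod="web-0",uid="a1",reason="Evicted"} 0
kube_pod_status_reason{namespace="default",pod="web-0",uid="a1",reason="NodeLost"} 0
kube_pod_status_reason{namespace="default",pod="web-0",uid="a1",reason="Shutdown"} 0
kube_pod_status_reason{namespace="default",pod="web-1",uid="b2",reason="Evicted"} 0
kube_pod_status_reason{namespace="default",pod="web-1",uid="b2",reason="NodeLost"} 0
kube_pod_status_reason{namespace="default",pod="web-1",uid="b2",reason="Shutdown"} 0
kube_pod_status_phase{namespace="default",pod="web-0",uid="a1",phase="Running"} 1
"""

if __name__ == "__main__":
    for reason, total in sorted(sum_by_reason(SAMPLE).items()):
        print(reason, total)
```

If the totals computed from the raw endpoint text are also 0 for every reason, the zeros originate in KSM itself rather than in the Prometheus pipeline.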
Anything else we need to know?:
The kube_pod_status_phase metric does not give us the information we need (specific reasons for status), and no other metric claims to provide this.
Environment:
Running KSM v2.13 managed via Helm chart
EKS v1.32.2
Karpenter v1.2.0