Commit baa360b

jmmcorreia, ChrsMark, and jinja2 authored
Update cpu resource metrics to handle resize (#3559)
Co-authored-by: Christos Markou <chrismarkou92@gmail.com>
Co-authored-by: Jina Jain <jjain@splunk.com>
1 parent cc263bb commit baa360b

5 files changed: 273 additions and 72 deletions

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+# Use this changelog template to create an entry for release notes.
+#
+# If your change doesn't affect end users you should instead start
+# your pull request title with [chore] or use the "Skip Changelog" label.
+
+# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
+change_type: enhancement
+
+# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db)
+component: k8s
+
+# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: Update CPU metrics for container CPU limit and request to handle resize.
+
+# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
+# The values here must be integers.
+issues: [3558]
+
+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext: |
+  Introduced new metrics for container CPU limit and request that account for
+  KEP 1287 allowing for in-place updates of container resources.

docs/non-normative/k8s-migration.md

Lines changed: 17 additions & 13 deletions
@@ -323,28 +323,32 @@ receiver were introduced as semantic conventions in:
 available)
 - [#2074](https://github.com/open-telemetry/semantic-conventions/issues/2074)
 - [#2197](https://github.com/open-telemetry/semantic-conventions/issues/2197)
+- [#3558](https://github.com/open-telemetry/semantic-conventions/issues/3558) (CPU resize: desired/current split)
+- [#3559](https://github.com/open-telemetry/semantic-conventions/pull/3559)

 The changes in their metrics are the following:

 <!-- prettier-ignore-start -->

-| Old (Collector) ![changed](https://img.shields.io/badge/changed-orange?style=flat) | New |
+| Old (Collector) ![changed](https://img.shields.io/badge/changed-orange?style=flat) | New (SemConv) |
 | ---------------------------------------------------------------------------------- | ----------------------------------------------------------------- |
-| `k8s.container.cpu_limit` (type: `gauge`) | `k8s.container.cpu.limit` (type: `updowncounter`) |
-| `k8s.container.cpu_request` (type: `gauge`) | `k8s.container.cpu.request` (type: `updowncounter`) |
-| `k8s.container.memory_limit` (type: `gauge`) | `k8s.container.memory.limit` (type: `updowncounter`) |
-| `k8s.container.memory_request` (type: `gauge`) | `k8s.container.memory.request` (type: `updowncounter`) |
-| `k8s.container.storage_limit` (type: `gauge`) | `k8s.container.storage.limit` (type: `updowncounter`) |
-| `k8s.container.storage_request` (type: `gauge`) | `k8s.container.storage.request` (type: `updowncounter`) |
-| `k8s.container.ephemeralstorage_limit` (type: `gauge`) | `k8s.container.ephemeral_storage.limit` (type: `updowncounter`) |
-| `k8s.container.ephemeralstorage_request` (type: `gauge`) | `k8s.container.ephemeral_storage.request` (type: `updowncounter`) |
-| `k8s.container.restarts` (type: `gauge`) | `k8s.container.restart.count` (type: `updowncounter`) |
-| `k8s.container.ready` (type: `gauge`) | `k8s.container.ready` (type: `updowncounter`) |
-| `k8s.container.cpu_limit_utilization` (type: `gauge`) | `k8s.container.cpu.limit_utilization` (type: `gauge`) |
-| `k8s.container.cpu_request_utilization` (type: `gauge`) | `k8s.container.cpu.request_utilization` (type: `gauge`) |
+| `k8s.container.cpu_limit` (type: `gauge`) | `k8s.container.cpu.limit.desired`, `k8s.container.cpu.limit.current` (type: `updowncounter`) |
+| `k8s.container.cpu_request` (type: `gauge`) | `k8s.container.cpu.request.desired`, `k8s.container.cpu.request.current` (type: `updowncounter`) |
+| `k8s.container.memory_limit` (type: `gauge`) | `k8s.container.memory.limit` (type: `updowncounter`) |
+| `k8s.container.memory_request` (type: `gauge`) | `k8s.container.memory.request` (type: `updowncounter`) |
+| `k8s.container.storage_limit` (type: `gauge`) | `k8s.container.storage.limit` (type: `updowncounter`) |
+| `k8s.container.storage_request` (type: `gauge`) | `k8s.container.storage.request` (type: `updowncounter`) |
+| `k8s.container.ephemeralstorage_limit` (type: `gauge`) | `k8s.container.ephemeral_storage.limit` (type: `updowncounter`) |
+| `k8s.container.ephemeralstorage_request` (type: `gauge`) | `k8s.container.ephemeral_storage.request` (type: `updowncounter`) |
+| `k8s.container.restarts` (type: `gauge`) | `k8s.container.restart.count` (type: `updowncounter`) |
+| `k8s.container.ready` (type: `gauge`) | `k8s.container.ready` (type: `updowncounter`) |
+| `k8s.container.cpu_limit_utilization` (type: `gauge`) | `k8s.container.cpu.limit.utilization` (type: `gauge`) |
+| `k8s.container.cpu_request_utilization` (type: `gauge`) | `k8s.container.cpu.request.utilization` (type: `gauge`) |

 <!-- prettier-ignore-end -->

+**Note:** For CPU limit and request, SemConv splits each into `desired` (from spec) and `current` (from container status) to support [K8s container resource resize](https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/).
+
 ### K8s ResourceQuota metrics

 The K8s ResourceQuota metrics implemented by the Collector and specifically the
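
The `desired`/`current` split described in the note of this hunk can be illustrated with a small sketch. It reads a Pod object in its JSON form and takes the desired value from `spec.containers[*].resources` and the current value from `status.containerStatuses[*].resources`, the two fields named in the metric notes. The function names `parse_cpu` and `cpu_desired_and_current` are hypothetical, and the quantity parser is a simplification that assumes only plain values and the `n`, `u`, and `m` suffixes appear.

```python
# Hypothetical helpers for illustration; not part of any OpenTelemetry or Kubernetes library.
_DIVISORS = {"n": 1e9, "u": 1e6, "m": 1e3}  # assumed subset of Kubernetes quantity suffixes


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as "500m" or "2" to a number of CPUs."""
    suffix = quantity[-1]
    if suffix in _DIVISORS:
        return float(quantity[:-1]) / _DIVISORS[suffix]
    return float(quantity)


def cpu_desired_and_current(pod: dict, kind: str = "limits") -> dict:
    """Return {container: {"desired": x, "current": y}} for kind "limits" or "requests"."""
    out = {}
    sources = (
        ("desired", pod.get("spec", {}).get("containers", [])),
        ("current", pod.get("status", {}).get("containerStatuses", [])),
    )
    for label, containers in sources:
        for container in containers:
            resources = container.get("resources") or {}
            cpu = (resources.get(kind) or {}).get("cpu")
            if cpu is not None:  # a value that is not set produces no data point
                out.setdefault(container["name"], {})[label] = parse_cpu(cpu)
    return out


# A pod whose spec was resized to 1 CPU while the running container still has 500m.
pod = {
    "spec": {"containers": [{"name": "app", "resources": {"limits": {"cpu": "1"}}}]},
    "status": {"containerStatuses": [{"name": "app", "resources": {"limits": {"cpu": "500m"}}}]},
}
print(cpu_desired_and_current(pod))  # {'app': {'desired': 1.0, 'current': 0.5}}
```

While a resize is pending the two values differ, which is the case a single limit or request metric could not represent.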

docs/system/k8s-metrics.md

Lines changed: 85 additions & 25 deletions
@@ -103,10 +103,12 @@ and therefore inherit its attributes, like `k8s.pod.name` and `k8s.pod.uid`.
 - [Namespace metrics](#namespace-metrics)
   - [Metric: `k8s.namespace.phase`](#metric-k8snamespacephase)
 - [K8s Container metrics](#k8s-container-metrics)
-  - [Metric: `k8s.container.cpu.limit_utilization`](#metric-k8scontainercpulimit_utilization)
-  - [Metric: `k8s.container.cpu.request_utilization`](#metric-k8scontainercpurequest_utilization)
-  - [Metric: `k8s.container.cpu.limit`](#metric-k8scontainercpulimit)
-  - [Metric: `k8s.container.cpu.request`](#metric-k8scontainercpurequest)
+  - [Metric: `k8s.container.cpu.limit.desired`](#metric-k8scontainercpulimitdesired)
+  - [Metric: `k8s.container.cpu.limit.current`](#metric-k8scontainercpulimitcurrent)
+  - [Metric: `k8s.container.cpu.limit.utilization`](#metric-k8scontainercpulimitutilization)
+  - [Metric: `k8s.container.cpu.request.desired`](#metric-k8scontainercpurequestdesired)
+  - [Metric: `k8s.container.cpu.request.current`](#metric-k8scontainercpurequestcurrent)
+  - [Metric: `k8s.container.cpu.request.utilization`](#metric-k8scontainercpurequestutilization)
   - [Metric: `k8s.container.memory.limit`](#metric-k8scontainermemorylimit)
   - [Metric: `k8s.container.memory.request`](#metric-k8scontainermemoryrequest)
   - [Metric: `k8s.container.storage.limit`](#metric-k8scontainerstoragelimit)
@@ -2052,83 +2054,141 @@ This metric is [recommended][MetricRecommended].

 **Description:** K8s Container level metrics captured under the namespace `k8s.container`.

-### Metric: `k8s.container.cpu.limit_utilization`
+### Metric: `k8s.container.cpu.limit.desired`

-This metric is [recommended][MetricRecommended].
+This metric is [opt-in][MetricOptIn].

-<!-- semconv metric.k8s.container.cpu.limit_utilization -->
+<!-- semconv metric.k8s.container.cpu.limit.desired -->
 <!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
 <!-- see templates/registry/markdown/snippet.md.j2 -->
 <!-- prettier-ignore-start -->

 | Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
 | -------- | --------------- | ----------- | -------------- | --------- | ------ |
-| `k8s.container.cpu.limit_utilization` | Gauge | `1` | The ratio of container CPU usage to its CPU limit. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.container`](/docs/registry/entities/k8s.md#k8s-container) |
+| `k8s.container.cpu.limit.desired` | UpDownCounter | `{cpu}` | Maximum CPU resource limit as defined by the container spec. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.container`](/docs/registry/entities/k8s.md#k8s-container) |

-**[1]:** The value range is [0.0,1.0]. A value of 1.0 means the container is using 100% of its CPU limit. If the CPU limit is not set, this metric SHOULD NOT be emitted for that container.
+**[1]:** This metric aligns with the limit in the
+[`resources`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.34/#resourcerequirements-v1-core) field of
+[K8s Container](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.34/#container-v1-core)
+(spec.containers[*].resources). Also see `Desired Resources` in
+<https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/> for more details.

 <!-- prettier-ignore-end -->
 <!-- END AUTOGENERATED TEXT -->
 <!-- endsemconv -->

-### Metric: `k8s.container.cpu.request_utilization`
+### Metric: `k8s.container.cpu.limit.current`

 This metric is [recommended][MetricRecommended].

-<!-- semconv metric.k8s.container.cpu.request_utilization -->
+<!-- semconv metric.k8s.container.cpu.limit.current -->
 <!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
 <!-- see templates/registry/markdown/snippet.md.j2 -->
 <!-- prettier-ignore-start -->

 | Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
 | -------- | --------------- | ----------- | -------------- | --------- | ------ |
-| `k8s.container.cpu.request_utilization` | Gauge | `1` | The ratio of container CPU usage to its CPU request. | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.container`](/docs/registry/entities/k8s.md#k8s-container) |
+| `k8s.container.cpu.limit.current` | UpDownCounter | `{cpu}` | Maximum CPU resource limit currently configured for a running container. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.container`](/docs/registry/entities/k8s.md#k8s-container) |
+
+**[1]:** This metric aligns with the limit in the
+[`resources`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.34/#resourcerequirements-v1-core) field of
+[K8s ContainerStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.34/#containerstatus-v1-core)
+(status.containerStatuses[*].resources). Also see `Actual Resources` in
+<https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/> for more details.

 <!-- prettier-ignore-end -->
 <!-- END AUTOGENERATED TEXT -->
 <!-- endsemconv -->

-### Metric: `k8s.container.cpu.limit`
+### Metric: `k8s.container.cpu.limit.utilization`

-This metric is [recommended][MetricRecommended].
+This metric is [opt-in][MetricOptIn].

-<!-- markdownlint-disable -->
-<!-- semconv metric.k8s.container.cpu.limit -->
+<!-- semconv metric.k8s.container.cpu.limit.utilization -->
 <!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
 <!-- see templates/registry/markdown/snippet.md.j2 -->
 <!-- prettier-ignore-start -->

 | Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
 | -------- | --------------- | ----------- | -------------- | --------- | ------ |
-| `k8s.container.cpu.limit` | UpDownCounter | `{cpu}` | Maximum CPU resource limit set for the container. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.container`](/docs/registry/entities/k8s.md#k8s-container) |
+| `k8s.container.cpu.limit.utilization` | Gauge | `1` | The ratio of container CPU usage to its current CPU limit. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.container`](/docs/registry/entities/k8s.md#k8s-container) |

-**[1]:** See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
+**[1]:** The current CPU limit reflects the actual resources applied to the container, as reported by
+[ContainerStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.34/#containerstatus-v1-core).
+The value range is [0.0,1.0]. A value of 1.0 means the container is using 100% of its actual CPU limit.
+If the CPU limit is not set, this metric SHOULD NOT be emitted for that container.

 <!-- prettier-ignore-end -->
 <!-- END AUTOGENERATED TEXT -->
 <!-- endsemconv -->
-<!-- markdownlint-restore-->

-### Metric: `k8s.container.cpu.request`
+### Metric: `k8s.container.cpu.request.desired`
+
+This metric is [opt-in][MetricOptIn].
+
+<!-- semconv metric.k8s.container.cpu.request.desired -->
+<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
+<!-- see templates/registry/markdown/snippet.md.j2 -->
+<!-- prettier-ignore-start -->
+
+| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
+| -------- | --------------- | ----------- | -------------- | --------- | ------ |
+| `k8s.container.cpu.request.desired` | UpDownCounter | `{cpu}` | CPU resource requested as defined by the container spec. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.container`](/docs/registry/entities/k8s.md#k8s-container) |
+
+**[1]:** This metric aligns with the request in the
+[`resources`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.34/#resourcerequirements-v1-core) field of
+[K8s Container](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.34/#container-v1-core)
+(spec.containers[*].resources). Also see `Desired Resources` in
+<https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/> for more details.
+
+<!-- prettier-ignore-end -->
+<!-- END AUTOGENERATED TEXT -->
+<!-- endsemconv -->
+
+### Metric: `k8s.container.cpu.request.current`

 This metric is [recommended][MetricRecommended].

-<!-- markdownlint-disable -->
-<!-- semconv metric.k8s.container.cpu.request -->
+<!-- semconv metric.k8s.container.cpu.request.current -->
 <!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
 <!-- see templates/registry/markdown/snippet.md.j2 -->
 <!-- prettier-ignore-start -->

 | Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
 | -------- | --------------- | ----------- | -------------- | --------- | ------ |
-| `k8s.container.cpu.request` | UpDownCounter | `{cpu}` | CPU resource requested for the container. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.container`](/docs/registry/entities/k8s.md#k8s-container) |
+| `k8s.container.cpu.request.current` | UpDownCounter | `{cpu}` | CPU resource requested currently configured for a running container. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.container`](/docs/registry/entities/k8s.md#k8s-container) |

-**[1]:** See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
+**[1]:** This metric aligns with the request in the
+[`resources`](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.34/#resourcerequirements-v1-core) field of
+[K8s ContainerStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.34/#containerstatus-v1-core)
+(status.containerStatuses[*].resources). Also see `Actual Resources` in
+<https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/> for more details.
+
+<!-- prettier-ignore-end -->
+<!-- END AUTOGENERATED TEXT -->
+<!-- endsemconv -->
+
+### Metric: `k8s.container.cpu.request.utilization`
+
+This metric is [opt-in][MetricOptIn].
+
+<!-- semconv metric.k8s.container.cpu.request.utilization -->
+<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
+<!-- see templates/registry/markdown/snippet.md.j2 -->
+<!-- prettier-ignore-start -->
+
+| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
+| -------- | --------------- | ----------- | -------------- | --------- | ------ |
+| `k8s.container.cpu.request.utilization` | Gauge | `1` | The ratio of container CPU usage to its current CPU request. [1] | ![Development](https://img.shields.io/badge/-development-blue) | [`k8s.container`](/docs/registry/entities/k8s.md#k8s-container) |
+
+**[1]:** The current CPU request reflects the request applied to the running container, as reported by
+[ContainerStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.34/#containerstatus-v1-core).
+The value range is [0.0,1.0]. A value of 1.0 means the container is using 100% of its actual CPU request.
+If the CPU request is not set, this metric SHOULD NOT be emitted for that container.

 <!-- prettier-ignore-end -->
 <!-- END AUTOGENERATED TEXT -->
 <!-- endsemconv -->
-<!-- markdownlint-restore-->

 ### Metric: `k8s.container.memory.limit`
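
The utilization notes in this hunk describe a ratio of CPU usage to the current (applied) limit or request, with no data point when that value is not set. A minimal sketch of that rule follows. The function name `cpu_utilization` is hypothetical, both inputs are assumed to be expressed in CPUs, and treating a zero or negative denominator as "not set" is an assumption made here to avoid a division by zero.

```python
from typing import Optional


def cpu_utilization(usage_cpus: float, current_cpus: Optional[float]) -> Optional[float]:
    """Return usage divided by the current CPU limit (or request), or None to skip emission.

    Hypothetical helper: `current_cpus` stands for the value reported under
    status.containerStatuses[*].resources, already converted to CPUs.
    """
    if current_cpus is None or current_cpus <= 0:
        # Limit or request not set: the metric is not emitted for this container.
        return None
    return usage_cpus / current_cpus


print(cpu_utilization(0.25, 0.5))   # 0.5, i.e. half of the applied limit is in use
print(cpu_utilization(0.25, None))  # None, nothing to report
```

Dividing by the current value rather than the desired one keeps the ratio meaningful while a resize is still pending.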

model/k8s/deprecated/metrics-deprecated.yaml

Lines changed: 61 additions & 0 deletions
@@ -510,3 +510,64 @@ groups:
       - k8s.node
     instrument: updowncounter
     unit: "By"
+
+  - id: metric.k8s.container.cpu.limit
+    type: metric
+    metric_name: k8s.container.cpu.limit
+    annotations:
+      code_generation:
+        metric_value_type: double
+    stability: development
+    deprecated:
+      reason: renamed
+      renamed_to: k8s.container.cpu.limit.desired
+    brief: "Deprecated, use `k8s.container.cpu.limit.desired` and `k8s.container.cpu.limit.current` instead."
+    entity_associations:
+      - k8s.container
+    instrument: updowncounter
+    unit: "{cpu}"
+  - id: metric.k8s.container.cpu.limit_utilization
+    type: metric
+    metric_name: k8s.container.cpu.limit_utilization
+    annotations:
+      code_generation:
+        metric_value_type: double
+    stability: development
+    deprecated:
+      reason: renamed
+      renamed_to: k8s.container.cpu.limit.utilization
+    brief: "Deprecated, use `k8s.container.cpu.limit.utilization` instead."
+    entity_associations:
+      - k8s.container
+    instrument: gauge
+    unit: "1"
+  - id: metric.k8s.container.cpu.request
+    type: metric
+    metric_name: k8s.container.cpu.request
+    annotations:
+      code_generation:
+        metric_value_type: double
+    stability: development
+    deprecated:
+      reason: renamed
+      renamed_to: k8s.container.cpu.request.desired
+    brief: "Deprecated, use `k8s.container.cpu.request.desired` and `k8s.container.cpu.request.current` instead."
+    entity_associations:
+      - k8s.container
+    instrument: updowncounter
+    unit: "{cpu}"
+  - id: metric.k8s.container.cpu.request_utilization
+    type: metric
+    metric_name: k8s.container.cpu.request_utilization
+    annotations:
+      code_generation:
+        metric_value_type: double
+    stability: development
+    deprecated:
+      reason: renamed
+      renamed_to: k8s.container.cpu.request.utilization
+    brief: "Deprecated, use `k8s.container.cpu.request.utilization` instead."
+    entity_associations:
+      - k8s.container
+    instrument: gauge
+    unit: "1"
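
The four `renamed_to` entries in this hunk amount to a lookup table from deprecated metric names to their replacements. The sketch below shows how a consumer might apply it, for example when rewriting dashboard queries. The names `RENAMED_TO` and `migrate_metric_name` are hypothetical, and the table only covers the renames recorded in this file.

```python
# Deprecated name -> `renamed_to` target, copied from the YAML entries in this commit.
RENAMED_TO = {
    "k8s.container.cpu.limit": "k8s.container.cpu.limit.desired",
    "k8s.container.cpu.limit_utilization": "k8s.container.cpu.limit.utilization",
    "k8s.container.cpu.request": "k8s.container.cpu.request.desired",
    "k8s.container.cpu.request_utilization": "k8s.container.cpu.request.utilization",
}


def migrate_metric_name(name: str) -> str:
    """Return the replacement for a deprecated metric name, or the name unchanged."""
    return RENAMED_TO.get(name, name)


print(migrate_metric_name("k8s.container.cpu.limit"))     # k8s.container.cpu.limit.desired
print(migrate_metric_name("k8s.container.memory.limit"))  # k8s.container.memory.limit
```

The `renamed_to` target for the limit and request metrics is the `desired` variant only. The `brief` text also points readers to the `current` variant, so a one-to-one rename may not be enough where the applied value is what matters.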
