Component(s)
processor/k8sattributes
What happened?
Description
We were upgrading from v0.140.1 to v0.147.0 and noticed that the otelcol instance for our Kubernetes apps was repeatedly running out of memory, causing new pods to be allocated.
After narrowing it down to a before/after version, v0.145.0 appears to be the version correlated with this behaviour. After reviewing the release notes, we hypothesize that the increase is due to this change in v0.145.0:
This appears to correlate with the continually increasing metric `otelcol.k8s.pod.association_total`. We are running in a couple of regions. The collector in the other region (also running the k8sattributes processor with an identical configuration), which receives much less traffic from fewer pods, does not experience the same growth rate as the higher-traffic one.
In our high-traffic region we have hundreds of (ephemeral) app pods sending data to this OTel Collector, which probably means a lot of distinct IP addresses. The working theory is that `pod_identifier` is storing all of these and growing in size proportionally to the number of pods that have ever sent data.
Steps to Reproduce
- v0.145.0
- `service.telemetry.level.metrics: detailed`
- a bunch of apps / IP addresses sending data to it. I'm not sure how many are necessary to observe the difference. In the screenshot above, the low-traffic region is getting data from 30 pods, and you can really only clearly observe it in the `otelcol.k8s.pod.association_total` growth; the count & heap don't appear out of the ordinary. The high-traffic region is receiving data from over 200 pods during that same time period.
Expected Result
🧘♀️ Heap allocation stabilizes after startup
Actual Result
📈 Heap allocation appears to grow in proportion to the number of pods sending traffic to the otelcol
Collector version
v0.145.0
Environment information
Environment
OS: Ubuntu 22.04.5 LTS
OpenTelemetry Collector configuration
```yaml
processors:
  k8sattributes:
    auth_type: "serviceAccount"
    passthrough: false
    extract:
      metadata:
        - k8s.namespace.name
        - k8s.node.name
        - k8s.deployment.name
        - k8s.pod.name
        - k8s.pod.uid
    pod_association:
      - sources:
          - from: resource_attribute
            name: k8s.pod.ip
      - sources:
          - from: resource_attribute
            name: k8s.pod.uid
      - sources:
          - from: connection
```
The k8sattributes processor is included in only a single `traces` service pipeline.
Log output
Additional context
No response
Tip
React with 👍 to help prioritize this issue. Please use comments to provide useful context, avoiding +1 or me too, to help us triage it. Learn more here.