
[processor/k8sattributes] Unbounded memory consumption after upgrading to v0.145.0 #47669

@gracework


Component(s)

processor/k8sattributes

What happened?

Description

We upgraded from v0.140.1 to v0.147.0 and noticed that the otelcol instance for Kubernetes apps was repeatedly running out of memory, causing new pods to be allocated.

[Screenshot: heatmap of heap allocation that looks like a bunch of smoke from chimneys]

After narrowing it down to a before/after version, v0.145.0 appears to be correlated with this behaviour. After reviewing the release notes, we hypothesize that the increase is due to this change in v0.145.0:

This appears to correlate with the continually increasing metric otelcol.k8s.pod.association_total. We run collectors in a couple of regions. The collector in the lower-traffic region (also running the k8sattributes processor with an identical configuration), which receives data from far fewer pods, does not experience the same growth rate as the higher-traffic one.

[Screenshot: v0.144.0 vs v0.145.0, showing an overall increase in metric count, otelcol.k8s.pod.association_total, and heap allocation]

In our high-traffic region, hundreds of (ephemeral) app pods send data to this OTel Collector, which likely means a large number of distinct IP addresses. The working theory is that pod_identifier stores all of these and grows proportionally to the number of pods that have ever sent data.

Steps to Reproduce

  • v0.145.0
  • service.telemetry.metrics.level: detailed
  • a number of apps / IP addresses sending data to it. I'm not sure how many are necessary to observe the difference. In the screenshot above, the low-traffic region receives data from 30 pods, and the effect is only clearly visible in the otelcol.k8s.pod.association_total growth; the metric count and heap do not appear out of the ordinary. The high-traffic region receives data from over 200 pods during the same time period.
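For clarity, the telemetry setting above corresponds to this fragment of the collector configuration (key path per the upstream collector's service config):

```yaml
service:
  telemetry:
    metrics:
      level: detailed
```

The detailed level is what enables the otelcol.k8s.pod.association_total metric referenced above.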

Expected Result

🧘‍♀️ Heap allocation stabilizes after startup

Actual Result

📈 Heap allocation appears to grow in size proportional to the number of pods sending traffic to the otelcol

Collector version

v0.145.0

Environment information

Environment

OS: Ubuntu 22.04.5 LTS

OpenTelemetry Collector configuration

processors:
  k8sattributes:
    auth_type: "serviceAccount"
    passthrough: false

    extract:
      metadata:
        - k8s.namespace.name
        - k8s.node.name
        - k8s.deployment.name
        - k8s.pod.name
        - k8s.pod.uid

    pod_association:
      - sources:
          - from: resource_attribute
            name: k8s.pod.ip
      - sources:
          - from: resource_attribute
            name: k8s.pod.uid
      - sources:
          - from: connection

The k8sattributesprocessor is only included in a single `traces` service pipeline.

Log output

Additional context

No response

