
OTLP resources for the opentelemetry-ebpf-profiler #628

@Gandem

Description

Recap

Currently, by default, the profiler generates OTLP payloads containing a single Profile object with stack traces for all processes on a host. This creates issues with resource attributes, particularly container attributes, which were initially planned to be attached to Sample attributes rather than to top-level resource attributes. That choice is inconsistent with the other signals and with processors such as the k8sattributesprocessor.

This leads to the following challenges:

  • Defining resources for the profiler, considering various granularities (per process, per container, or another concept).
  • Assessing the impact on lookup tables and payload size when splitting processes into multiple ResourceProfiles.

The goal of this issue is to start a discussion on these challenges with impacted SIGs and determine next steps.

Context

The intent of this issue is to discuss potential incompatibilities between the current OpenTelemetry protocol specification for profiles and the opentelemetry-ebpf-profiler, with regard to the usage of OTLP resources.

The opentelemetry-ebpf-profiler is a profiler that performs CPU profiling for all processes running on a single host.

The profiler will be built into a standalone collector distribution, where it would be configured as a collector receiver (RFC). This distribution would be deployed on each host for which the user wants to collect profiling data.

Problem Statement

Currently, by default, OTLP profiles generated by the opentelemetry-ebpf-profiler contain a single Profile object (in a single ScopeProfiles, in a single ResourceProfiles), which holds stack traces for all the processes running on the host (when off-CPU profiling is enabled, this adds another ScopeProfiles object under the same ResourceProfiles).

As a consequence:

  • The top-level resource attributes hold only general information about the host on which the profiler is running.
  • Container attributes were intended to be attached to the Sample attributes (e.g. Kubernetes pod name, deployment name, container name and id, …)
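
To make the current shape concrete, below is a minimal sketch of the payload layout as the profiler emits it today. The types, field subsets, and attribute values are simplified, illustrative stand-ins, not the actual generated OTLP protobuf messages:

```go
package main

import "fmt"

// Simplified stand-ins for the OTLP profiles messages, for illustration only.
type KeyValue struct{ Key, Value string }

type Sample struct {
	// Per-sample attributes: today, this is where container metadata
	// (Kubernetes pod name, container id, ...) was meant to be attached.
	Attributes []KeyValue
}

type Profile struct{ Samples []Sample }

type ScopeProfiles struct{ Profiles []Profile }

type ResourceProfiles struct {
	// Top-level resource attributes: only host-level information today.
	ResourceAttrs []KeyValue
	ScopeProfiles []ScopeProfiles
}

type ProfilesData struct{ ResourceProfiles []ResourceProfiles }

func main() {
	// Default shape: one ResourceProfiles, one ScopeProfiles, one Profile,
	// covering every process on the host. Attribute values are hypothetical.
	payload := ProfilesData{ResourceProfiles: []ResourceProfiles{{
		ResourceAttrs: []KeyValue{{"host.name", "node-1"}},
		ScopeProfiles: []ScopeProfiles{{Profiles: []Profile{{
			Samples: []Sample{
				{Attributes: []KeyValue{{"k8s.pod.name", "checkout-7d4f9"}}},
				{Attributes: []KeyValue{{"k8s.pod.name", "payments-f81c2"}}},
			},
		}}}},
	}}}
	fmt.Println("resource profiles:", len(payload.ResourceProfiles)) // 1
}
```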

This doesn’t align with other signals (metrics, traces, logs), for which both host and container attributes are attached to the top-level resource attributes. As mentioned in open-telemetry/opentelemetry-collector-contrib#37269, this makes the opentelemetry-ebpf-profiler’s approach incompatible with the k8sattributesprocessor (the processor automatically enriches the data with Kubernetes metadata and expects Kubernetes attributes to be added to the top-level resource attributes).

However, in the current state of the profiling protocol specification, having Kubernetes attributes as top-level resource attributes would require splitting the profile into multiple ResourceProfiles (instead of a single profile per payload), which leads to the following problems:

Defining resources from profiled processes

The opentelemetry-ebpf-profiler profiles all processes running on a host (whether they are in pods/containers, or not). It is unclear what the exact definition of a resource should be in that case:

  • Is every single process a separate resource? This might lead to an excessive number of resources for profiled runtimes that fork frequently (e.g. Python).
  • Is every single container a resource? If so, what do we do with non-containerized processes on the host: should we group them together in a single resource?
  • Is there any other definition we should consider in the context of the opentelemetry-ebpf-profiler?

In the current state, if we don’t intend to modify the model and want to keep compatibility with the k8sattributesprocessor, we need at least one resource per container.
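
As one possible illustration of that minimum granularity, the following sketch groups samples into one resource per container, with non-containerized processes falling back into a single host-level resource. The types, attribute values, and grouping rule are hypothetical, not the profiler's actual code or a spec proposal:

```go
package main

import "fmt"

// Minimal illustrative types, not the real OTLP protobuf messages.
type Sample struct {
	ContainerID string // e.g. derived from the process's cgroup
	StackID     int
}

type ResourceProfiles struct {
	ResourceAttrs map[string]string
	Samples       []Sample
}

// splitByContainer emits one ResourceProfiles per container, so container
// attributes can live at the resource level where processors like the
// k8sattributesprocessor expect them. Non-containerized processes end up in
// a single host-level group (one possible answer to the question above).
func splitByContainer(samples []Sample) []ResourceProfiles {
	groups := map[string][]Sample{}
	for _, s := range samples {
		groups[s.ContainerID] = append(groups[s.ContainerID], s)
	}
	out := make([]ResourceProfiles, 0, len(groups))
	for cid, ss := range groups {
		attrs := map[string]string{"host.name": "node-1"} // hypothetical value
		if cid != "" {
			attrs["container.id"] = cid // enrichment key for k8s metadata
		}
		out = append(out, ResourceProfiles{ResourceAttrs: attrs, Samples: ss})
	}
	return out
}

func main() {
	samples := []Sample{
		{ContainerID: "abc123", StackID: 1},
		{ContainerID: "abc123", StackID: 2},
		{ContainerID: "", StackID: 3}, // non-containerized process
	}
	fmt.Println(len(splitByContainer(samples)), "resource profiles") // 2
}
```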

To some extent, this problem intersects with some of the challenges discussed in Resources and Entities.

One additional challenge is that a resource is currently defined as the entity producing telemetry. Strictly speaking, the opentelemetry-ebpf-profiler is producing the profiles for all processes running on the host; in that case, the entities being observed (the different processes) are different from the entity producing the telemetry (the opentelemetry-ebpf-profiler).

Performance impact on lookup tables

Depending on the resource definition we land on, we need to be mindful of the impact on the different lookup tables. Currently, the Profile object contains lookup tables that are used to deduplicate information from stack traces: for example, they avoid repeatedly storing the same function names or sample/location attributes.

The goal of these lookup tables is to keep the size of a profile reasonable. While duplication should only marginally impact the size of the payload on the wire (due to compression), it does impact the memory footprint of the decompressed, deserialized payload (in the eBPF profiler, then in the collector).
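
For intuition, here is a minimal sketch of the deduplication idea behind these tables: repeated strings such as function names are stored once and referenced by index from each sample. The field names are illustrative and do not match the exact OTLP profiles schema:

```go
package main

import "fmt"

// Illustrative string-interning scheme, mirroring the role of the Profile
// lookup tables: store each distinct string once, reference it by index.
type Profile struct {
	StringTable []string // index 0 conventionally reserved for ""
	strIndex    map[string]int32
	Samples     [][]int32 // each sample: frame-name indices into StringTable
}

func (p *Profile) intern(s string) int32 {
	if p.strIndex == nil {
		p.strIndex = map[string]int32{"": 0}
		p.StringTable = append(p.StringTable, "")
	}
	if i, ok := p.strIndex[s]; ok {
		return i // already in the table: reuse the existing index
	}
	i := int32(len(p.StringTable))
	p.StringTable = append(p.StringTable, s)
	p.strIndex[s] = i
	return i
}

func (p *Profile) AddSample(frames ...string) {
	sample := make([]int32, len(frames))
	for i, f := range frames {
		sample[i] = p.intern(f) // hot function names are stored only once
	}
	p.Samples = append(p.Samples, sample)
}

func main() {
	var p Profile
	// Two samples sharing frames: "main" and "handleRequest" are stored once.
	p.AddSample("main", "handleRequest", "runtime.schedule")
	p.AddSample("main", "handleRequest", "encodeJSON")
	fmt.Println(len(p.StringTable), "strings for", len(p.Samples), "samples")
}
```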

Splitting processes into multiple ResourceProfiles means that they will no longer share lookup tables. The granularity at which we split will influence the overhead: for example, splitting by process ID would drastically increase overhead compared to splitting per container, because of runtimes that fork often, such as Python.

We could consider moving the lookup tables to the ProfilesData level; however, this would make merging multiple ProfilesData payloads (e.g. for batching) harder, since it would require merging their lookup tables (which is possible, but could require further changes to the spec to do efficiently).
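
The sketch below illustrates why that merge is more involved: every index used by the second payload must be remapped into the merged table, and the remapping then has to be applied to every sample and location that references it. This is simplified, hypothetical code, not a spec proposal:

```go
package main

import "fmt"

// mergeStringTables appends the strings of src that are missing from dst and
// returns a remapping such that remap[oldIndex] == newIndex. A real merge
// would then rewrite all index references in the second payload using remap.
func mergeStringTables(dst *[]string, index map[string]int32, src []string) []int32 {
	remap := make([]int32, len(src))
	for i, s := range src {
		j, ok := index[s]
		if !ok {
			j = int32(len(*dst))
			*dst = append(*dst, s)
			index[s] = j
		}
		remap[i] = j
	}
	return remap
}

func main() {
	a := []string{"", "main", "encodeJSON"}
	index := map[string]int32{}
	for i, s := range a {
		index[s] = int32(i)
	}
	b := []string{"", "main", "runtime.schedule"}
	remap := mergeStringTables(&a, index, b)
	fmt.Println("merged table:", a) // ["" main encodeJSON runtime.schedule]
	fmt.Println("remap:", remap)    // [0 1 3]
}
```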
