Commit bf97e17

docs: add profilingmetrics use case and configuration options (#1154)
* docs: add profilingmetrics use case and configuration options
* fix profiling receiver
* Update connector/profilingmetricsconnector/README.md
  Co-authored-by: Christos Kalkanis <christos.kalkanis@elastic.co>
* Update connector/profilingmetricsconnector/README.md
  Co-authored-by: Christos Kalkanis <christos.kalkanis@elastic.co>
* Update connector/profilingmetricsconnector/README.md
  Co-authored-by: Christos Kalkanis <christos.kalkanis@elastic.co>
* Update connector/profilingmetricsconnector/README.md
  Co-authored-by: Christos Kalkanis <christos.kalkanis@elastic.co>
* Update connector/profilingmetricsconnector/README.md
  Co-authored-by: Christos Kalkanis <christos.kalkanis@elastic.co>
* Update connector/profilingmetricsconnector/README.md
  Co-authored-by: Christos Kalkanis <christos.kalkanis@elastic.co>
* Update connector/profilingmetricsconnector/README.md
  Co-authored-by: Christos Kalkanis <christos.kalkanis@elastic.co>

---------

Co-authored-by: Christos Kalkanis <christos.kalkanis@elastic.co>
1 parent 007ca43 commit bf97e17

1 file changed

Lines changed: 205 additions & 13 deletions

@@ -1,24 +1,216 @@
1-
# Profiling metrics connector
1+
# Profiling Metrics Connector
22

3-
The Profiling metrics connector is an opinionated OTel connector that generates OTel metrics from selected OTel profiling data.
3+
<!-- status autogenerated section -->
4+
| Status | |
5+
| ------------- |-----------|
6+
| Distributions | [] |
7+
| Issues | [![Open issues](https://img.shields.io/github/issues-search/elastic/opentelemetry-collector-components?query=is%3Aissue%20is%3Aopen%20label%3Aconnector%2Fprofilingmetrics%20&label=open&color=orange&logo=opentelemetry)](https://github.com/elastic/opentelemetry-collector-components/issues?q=is%3Aopen+is%3Aissue+label%3Aconnector%2Fprofilingmetrics) [![Closed issues](https://img.shields.io/github/issues-search/elastic/opentelemetry-collector-components?query=is%3Aissue%20is%3Aclosed%20label%3Aconnector%2Fprofilingmetrics%20&label=closed&color=blue&logo=opentelemetry)](https://github.com/elastic/opentelemetry-collector-components/issues?q=is%3Aclosed+is%3Aissue+label%3Aconnector%2Fprofilingmetrics) |
8+
| Code coverage | [![codecov](https://codecov.io/github/elastic/opentelemetry-collector-components/graph/main/badge.svg?component=connector_profilingmetrics)](https://app.codecov.io/gh/elastic/opentelemetry-collector-components/tree/main/?components%5B0%5D=connector_profilingmetrics&displayType=list) |
9+
10+
[development]: https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/component-stability.md#development
11+
12+
## Supported Pipeline Types
13+
14+
| [Exporter Pipeline Type] | [Receiver Pipeline Type] | [Stability Level] |
| ------------------------ | ------------------------ | ----------------- |
| profiles                 | metrics                  | [development]     |
17+
18+
[Exporter Pipeline Type]: https://github.com/open-telemetry/opentelemetry-collector/blob/main/connector/README.md#exporter-pipeline-type
19+
[Receiver Pipeline Type]: https://github.com/open-telemetry/opentelemetry-collector/blob/main/connector/README.md#receiver-pipeline-type
20+
[Stability Level]: https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/component-stability.md#stability-levels
21+
<!-- end autogenerated section -->
22+
23+
## Overview
24+
25+
The Profiling Metrics connector is an opinionated OpenTelemetry connector that
26+
transforms [OTel Profiles data](https://opentelemetry.io/docs/specs/otel/profiles/)
27+
into OpenTelemetry metrics. It analyzes stack traces from Profiles samples and
28+
produces per-resource delta metrics that break down exclusive CPU time by frame type
29+
(kernel, native, JVM, Go, Python, etc.), shared library, system call, kernel subsystem, and
30+
more.
31+
32+
These metrics are designed to power the
33+
[Elastic OTel Profiling Metrics integration](https://www.elastic.co/docs/reference/integrations/profilingmetrics_otel)
34+
dashboards, giving you an at-a-glance view of where your application spends
35+
its time without requiring you to query raw profiling data.
36+
37+
### What it does
38+
39+
For every batch of Profiles data the connector receives, it walks each
40+
sample's stack trace and:
41+
42+
1. **Classifies the leaf frame** into one of the supported runtime/frame types
43+
(kernel, native/C, JVM, Go, Python, Ruby, PHP, Perl, .NET, Rust, Beam, V8
44+
JS) and increments the corresponding `samples.<type>.count` metric.
45+
2. **Counts userspace vs. kernel frames** (`samples.user.count` and
46+
`samples.kernel.count`) so downstream consumers can compare the two and
47+
compute the total sample count as their sum.
48+
3. **Extracts shared library names** for native frames and attaches them as the
49+
`shlib_name` attribute on `samples.native.count`.
50+
4. **Classifies kernel stack traces** into subsystem categories
51+
(network/tcp, network/udp, ipc, disk, memory, synchronization) with
52+
read/write direction and protocol breakdown, exposed via `kernel_area`,
53+
`kernel_proto`, and `kernel_io` attributes on `samples.kernel.count`.
54+
5. **Extracts system call names** from kernel frames (e.g. `write`, `read`,
55+
`futex`) and attaches them as the `syscall_name` attribute on
56+
`samples.kernel.count`, enabling per-syscall analysis.
57+
6. **(Optional) Generates `samples.frame_type`** — a gauge that counts profiling
58+
frames grouped by their `frame_type` attribute.
59+
7. **(Optional) Generates `samples.classification`** — a gauge with
60+
language-specific classification (e.g. Go package or JVM class) for Go
61+
and JVM frames.
62+
8. **Supports custom aggregations** — user-defined regex patterns matched
63+
against function names, producing `samples.custom_aggregation` metrics with
64+
a user-chosen label.
65+
66+
### How it works
67+
68+
```
┌────────────┐       ┌───────────────────────┐       ┌──────────────┐
│  Profiles  │──────▶│   profilingmetrics    │──────▶│   Metrics    │
│  Pipeline  │       │       connector       │       │   Pipeline   │
└────────────┘       └───────────────────────┘       └──────────────┘
```
74+
75+
The connector sits between a **Profiles** exporter pipeline and a **Metrics**
76+
receiver pipeline. It consumes `pprofile.Profiles`, walks stack frames using
77+
the shared dictionary (string table, location table, function table, mapping
78+
table), and emits `pmetric.Metrics` to the next consumer.
79+
80+
When `flush_interval` is greater than `0s` (default: `30s`), an internal
81+
aggregation consumer buffers and merges delta metrics in memory, flushing them
82+
collectively at each interval. This reduces metric volume and aligns data
83+
points to regular time boundaries.
84+
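As a sketch of the `0s` case described above, the fragment below disables in-memory aggregation so that metrics are forwarded as soon as they are produced. It uses only the `flush_interval` setting named in this section. Since nothing is merged before export, metric volume will likely be higher than with the default.

```yaml
connectors:
  profilingmetrics:
    # 0s turns off the internal aggregation consumer described above;
    # delta metrics are forwarded without being buffered or merged.
    flush_interval: 0s
```

Any positive duration (for example `60s`) keeps aggregation on and only changes how often the buffered metrics are flushed.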
85+
### Use cases
86+
87+
- **Profiling cost reduction**: distill high-volume profiling data into
88+
compact, queryable metrics that are also amenable to LLM processing.
89+
- **Infrastructure dashboards**: visualize CPU time distribution across
90+
runtimes, kernel subsystems, and shared libraries.
91+
- **Alerting**: set threshold alerts on kernel synchronization time, network
92+
I/O, or specific function patterns via custom aggregations.
93+
94+
## Requirements
95+
96+
- An OpenTelemetry Collector build that includes the `profilingmetrics`
97+
connector. The connector is shipped as part of the
98+
[Elastic Distribution of the OpenTelemetry Collector (EDOT)](https://www.elastic.co/docs/reference/opentelemetry).
99+
- A profiling data source sending OTel profiles to the collector (e.g. the
100+
[OpenTelemetry eBPF profiler](https://github.com/open-telemetry/opentelemetry-ebpf-profiler)).
4101

5102
## Configuration
6103

7-
Any [generated metric](./metadata.yaml) can be disabled through the configuration. For example:
104+
A minimal configuration that converts profiling data into metrics and exports
105+
them over OTLP:
106+
107+
```yaml
receivers:
  profiling:

connectors:
  profilingmetrics:

exporters:
  otlphttp:
    endpoint: https://my-backend:4318

service:
  pipelines:
    profiles:
      receivers: [profiling]
      exporters: [profilingmetrics]
    metrics:
      receivers: [profilingmetrics]
      exporters: [otlphttp]
```
10-
metrics:
11-
samples.classification:
12-
enabled: false
13-
samples.dotnet.count:
14-
enabled: false
127+
128+
### Full configuration reference
129+
130+
The following settings can be configured:
131+
132+
```yaml
connectors:
  profilingmetrics:
    # Time window for aggregating delta metrics in memory before flushing.
    # Set to 0s to disable aggregation and forward metrics immediately.
    # Default: 30s
    flush_interval: 30s

    # Toggle individual metrics on or off.
    # See the "Metrics" section below for the full list.
    metrics:
      samples.user.count:
        enabled: true
      samples.kernel.count:
        enabled: true
      samples.native.count:
        enabled: true
      samples.go.count:
        enabled: true
      samples.jvm.count:
        enabled: true
      samples.cpython.count:
        enabled: true
      samples.dotnet.count:
        enabled: true
      samples.ruby.count:
        enabled: true
      samples.php.count:
        enabled: true
      samples.perl.count:
        enabled: true
      samples.v8js.count:
        enabled: true
      samples.rust.count:
        enabled: true
      samples.beam.count:
        enabled: true
      # Disabled by default; enable explicitly if needed:
      samples.frame_type:
        enabled: false
      samples.classification:
        enabled: false

    # Custom aggregations let you define regex patterns matched against
    # function names in stack traces. Each match increments a
    # samples.custom_aggregation metric with the given label.
    aggregations:
      - match: "^com\\.example\\.payments\\."
        label: "payments"
      - match: "^com\\.example\\.auth\\."
        label: "authentication"
```
16184
17-
**⚠️ Configuration Warning: Metric Dependencies**
185+
| Setting          | Type       | Default | Description |
| ---------------- | ---------- | ------- | ----------- |
| `flush_interval` | `duration` | `30s`   | Time window for aggregating delta metrics before flushing. Set to `0s` to disable aggregation and forward metrics on every received profile. |
| `metrics`        | `object`   | —       | Per-metric toggle. Each key is a metric name (see below) with an `enabled` boolean. Unspecified metrics use their default. |
| `aggregations`   | `list`     | `[]`    | List of custom aggregation rules. Each entry has a `match` (regex applied to function names) and a `label` (value used in the `frame_type` attribute of the output metric). |
190+
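To illustrate the `aggregations` setting, here is a minimal sketch with two rules. The patterns and labels are hypothetical examples and do not come from the connector. The sketch also assumes that patterns follow Go regular-expression syntax, which this README does not state.

```yaml
connectors:
  profilingmetrics:
    aggregations:
      # Hypothetical rule: function names starting with "runtime.gc".
      # In a double-quoted YAML string, "\\." yields the regex "\.".
      - match: "^runtime\\.gc"
        label: "go-gc"
      # Hypothetical rule: function names in the crypto/tls package.
      - match: "^crypto/tls\\."
        label: "tls"
```

Each function name that matches a rule contributes to `samples.custom_aggregation` under that rule's `label`.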
191+
### Metric dependencies warning
192+
193+
To ensure data integrity and accurate ratio calculations:
194+
195+
- **Required combination**: `samples.kernel.count` and `samples.user.count`
196+
must both be enabled. Their sum is the only reliable way to compute the total
197+
sample count.
198+
- **Frame metrics**: avoid disabling specific frame metrics like
199+
`samples.native.count`. Disabling them results in a loss of information
200+
regarding shared libraries or runtime breakdown.
201+
202+
## Metrics
203+
204+
The full list of emitted metrics is documented in [documentation.md](./documentation.md).
205+
206+
Any metric can be toggled via the `metrics` configuration block (see above).
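
As a sketch of such a toggle, the fragment below turns on the two gauges that this README describes as disabled by default. All other metrics are left unspecified and so keep their defaults.

```yaml
connectors:
  profilingmetrics:
    metrics:
      # Optional gauge counting frames grouped by frame_type.
      samples.frame_type:
        enabled: true
      # Optional gauge with language-specific classification
      # (for example Go package or JVM class).
      samples.classification:
        enabled: true
```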
18207

19-
To ensure data integrity and accurate ratio calculations, adhere to the following rules:
20-
- Required Combination: You must enable `samples.kernel.count` and `samples.user.count`. Their sum is the only reliable way to calculate the total sample count.
21-
- Frame metrics: Avoid disabling specific frame metrics like `samples.native.count`. Disabling these results in a loss of information regarding shared libraries.
208+
## Elastic integration
22209

210+
The metrics produced by this connector are designed to be consumed by the
211+
[**Elastic OTel Profiling Metrics integration**](https://www.elastic.co/docs/reference/integrations/profilingmetrics_otel).
212+
Refer to that page for dashboard setup and field mappings.
23213

24-
[Quickstart guide](https://www.elastic.co/docs/reference/edot-collector/config/configure-profiles-collection) to use this connector as part of [EDOT](https://www.elastic.co/docs/reference/opentelemetry).
214+
For a quickstart guide on using this connector as part of the Elastic
215+
Distribution of the OpenTelemetry Collector (EDOT), see the
216+
[EDOT profiling collection guide](https://www.elastic.co/docs/reference/edot-collector/config/configure-profiles-collection).
