Inject component-identifying scope attributes #12617
Conversation
We discussed offline adding a feature gate for this and all other internal-telemetry-related changes; I intend to merge this once the comments are addressed and the feature gate has been added.
54c13a9
…behind feature gate (#12933)

#### Context

PR #12617 introduced logic to inject new instrumentation scope attributes in all internal telemetry to identify which Collector component it came from. These attributes had already been added to internal logs as regular log attributes, and this PR switched them to scope attributes for consistency. The new logic was placed behind an Alpha stage feature gate, `telemetry.newPipelineTelemetry`.

Unfortunately, the default "off" state of the feature gate disabled the injection of component-identifying attributes entirely, which was a regression since they had been present in internal logs in previous releases. See issue #12870 for an in-depth discussion of this issue. To correct this, PR #12856 was filed, which stabilized the feature gate, making it on by default with no way to disable it, and removed the logic that the feature gate used to toggle. This was thought to be the simplest way to mitigate the regression in the "off" state, since we planned to stabilize the feature eventually anyway.

Unfortunately, it was found that the "on" state of the feature gate causes a different issue: [the Prometheus exporter](https://github.com/open-telemetry/opentelemetry-go/tree/main/exporters/prometheus) is the default way of exporting the Collector's internal metrics, accessible at `collector:8888/metrics`. This exporter does not currently have any support for instrumentation scope attributes, meaning that metric streams differentiated by said attributes but not by any other identifying property will appear as aliases to Prometheus, which causes an error. This completely breaks the export of Collector metrics through Prometheus under some simple configurations, which is a release blocker.

#### Description

To fix this issue, this PR sets the `telemetry.newPipelineTelemetry` feature gate back to "Alpha" (off by default), and reintroduces logic to disable the injection of the new instrumentation scope attributes when the gate is off, but only in internal metrics. Note that the new logic is still used unconditionally for logs and traces, to avoid reintroducing the logs issue (#12870). This should avoid breaking the Collector in its default configuration while we try to get a fix in the Prometheus exporter.

#### Link to tracking issue

No tracking issue currently, will probably file one later.

#### Testing

I performed some simple manual testing with a config file like the following:

```yaml
receivers:
  otlp: [...]
processors:
  batch:
exporters:
  debug: [...]
service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
  telemetry:
    metrics:
      level: detailed
    traces: [...]
    logs: [...]
```

The two batch processors create aliased metric streams, which are only differentiated by the new component attributes. I checked that:
1. this config causes an error in the Prometheus exporter on main;
2. the error is resolved by default after applying this PR;
3. the error reappears when enabling the feature gate (this is expected);
4. scope attributes are added on the traces and logs no matter the state of the gate.
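Editor's note: as a hedged illustration of the gating pattern described in the commit message above, here is a minimal, standalone sketch of how an Alpha-stage gate is declared and consulted with the Collector's `featuregate` package. The gate ID matches the one discussed, but this is not the Collector's actual wiring; registering the same ID again inside a real Collector binary would conflict with the existing registration.

```go
// Standalone sketch of declaring and checking an Alpha-stage feature gate.
// Illustrative only: the real gate is registered inside the Collector itself.
package gatesketch

import "go.opentelemetry.io/collector/featuregate"

var newPipelineTelemetryGate = featuregate.GlobalRegistry().MustRegister(
	"telemetry.newPipelineTelemetry",
	featuregate.StageAlpha, // Alpha: off unless --feature-gates=telemetry.newPipelineTelemetry
	featuregate.WithRegisterDescription("Inject component-identifying scope attributes into internal telemetry."),
)

// injectScopeAttributesIntoMetrics mirrors the behavior described above:
// logs and traces always carry the new scope attributes, while metrics
// only carry them when the gate is enabled.
func injectScopeAttributesIntoMetrics() bool {
	return newPipelineTelemetryGate.IsEnabled()
}
```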
…peline components (#12812)

Depends on #12856
Resolves #12676

This is a reboot of #11311, incorporating metrics defined in the [component telemetry RFC](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/rfcs/component-universal-telemetry.md) and attributes added in #12617. The basic pattern is:
- When building any pipeline component which produces data, wrap the "next consumer" with instrumentation to measure the number of items being passed. This wrapped consumer is then passed into the constructor of the component.
- When building any pipeline component which consumes data, wrap the component itself. This wrapped consumer is saved onto the graph node so that it can be retrieved during graph assembly.

---------

Co-authored-by: Pablo Baeyens <[email protected]>
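Editor's note: the consumer-wrapping pattern described in the commit message above can be sketched as follows. This is a minimal illustration, not the Collector's actual auto-instrumentation code; the metric name is hypothetical.

```go
// Minimal sketch of wrapping a "next consumer" to count items passed
// between pipeline components, shown for the logs signal.
package consumersketch

import (
	"context"

	"go.opentelemetry.io/collector/consumer"
	"go.opentelemetry.io/collector/pdata/plog"
	"go.opentelemetry.io/otel/metric"
)

type countingLogs struct {
	next    consumer.Logs
	counter metric.Int64Counter
}

// newCountingLogs wraps next so that every batch passed through it is counted.
func newCountingLogs(next consumer.Logs, meter metric.Meter) (consumer.Logs, error) {
	// Hypothetical metric name, for illustration only.
	c, err := meter.Int64Counter("sketch.consumed.items")
	if err != nil {
		return nil, err
	}
	return &countingLogs{next: next, counter: c}, nil
}

func (c *countingLogs) Capabilities() consumer.Capabilities {
	return c.next.Capabilities()
}

func (c *countingLogs) ConsumeLogs(ctx context.Context, ld plog.Logs) error {
	// Count items before handing them to the wrapped consumer.
	c.counter.Add(ctx, int64(ld.LogRecordCount()))
	return c.next.ConsumeLogs(ctx, ld)
}
```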
```diff
@@ -54,7 +54,7 @@ type otlpReceiver struct {
 // responsibility to invoke the respective Start*Reception methods as well
 // as the various Stop*Reception methods to end it.
 func newOtlpReceiver(cfg *Config, set *receiver.Settings) (*otlpReceiver, error) {
-	set.Logger = telemetry.LoggerWithout(set.TelemetrySettings, componentattribute.SignalKey)
+	set.TelemetrySettings = telemetry.WithoutAttributes(set.TelemetrySettings, componentattribute.SignalKey)
```
@jade-guiton-dd Apologies for asking on an old PR, but was there any specific reason for not including the signal info on the exposed metrics from the OTLP receiver? This would be really useful to see the amount of data ingested based on the signal type.
We already had this info exposed in the older metrics via `otelcol_receiver_accepted_log_records`, `otelcol_receiver_accepted_metric_points`, and `otelcol_receiver_accepted_spans`. Trying to understand if we have a plan to add them down the line? Thanks.
The OTLP receiver is internally a single object, even when configured in multiple pipelines for multiple signals. For that reason, the telemetry it emits can't easily be associated with a single signal, so it removes the "otelcol.signal" attribute from its set of attributes on startup. If we didn't do that, all telemetry from the component would be associated with whichever signal pipeline happened to be created first, which would not be helpful.
However, the OTLP receiver could manually add back a signal attribute on specific metric points which are associated with a specific signal. But I don't believe this is currently needed:
- The older `otelcol_receiver_X` metrics (which aren't going anywhere for the foreseeable future) already differentiate between signals in their name.
- The new metrics emitted by pipeline auto-instrumentation (implemented in a later PR) use the original attribute set of the component before startup, which includes "otelcol.signal".

Do you have any examples of internal metrics emitted by the OTLP receiver which are lacking association with a specific signal (and which could be associated with one despite the singleton architecture)?
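Editor's note: for illustration, "manually adding back a signal attribute on specific metric points" could look like the following minimal sketch using the OpenTelemetry Go metric API. Only the `otelcol.signal` key comes from the discussion above; the function and counter are hypothetical.

```go
// Minimal sketch: a singleton receiver that strips the otelcol.signal
// scope attribute can still tag individual metric points with the signal
// they relate to, as a point-level attribute.
package signalsketch

import (
	"context"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

func recordAccepted(ctx context.Context, counter metric.Int64Counter, n int64, signal string) {
	// Attach the signal per data point, since the receiver instance
	// itself serves several signals at once.
	counter.Add(ctx, n, metric.WithAttributes(attribute.String("otelcol.signal", signal)))
}
```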
Thanks for the detailed answer. I was going off of this PR, and upon testing the different metrics/logs, I do see `otelcol.signal` being present in all of the emitted telemetry, even the custom ones generated via mdatagen, which is super cool. Thanks for making that happen 👍🏽

One thing I noticed is that since we are currently treating Middlewares as part of the Extension interface, some of the pipeline attributes like `signal` and `outcome` are missing from the reported metrics. Should we treat them similarly to Receivers, as they run as part of the Receiver end?
I'm not extremely familiar with the middleware interface considering how new it is (are there even any implementations of it yet?), but I think this would be of questionable use and difficult to accomplish:
- Because middlewares act at the HTTP/gRPC request level, there's no generic, reliable way to know which signal (if any) they're processing. This is determined later by the receiver, after the request has been processed by the middleware. The only case where I think this would be doable is if the receiver only handles a single type of signal, in which case `otelcol.signal` is much less useful anyway.
- We only use the `outcome` attribute on auto-instrumented pipeline metrics, not arbitrary receiver telemetry, because the auto-instrumentation layer is at the right place to know whether the next component succeeded or not. I think adding a similar instrumentation layer inside middlewares would be difficult, but I'm not familiar enough with the middleware API to tell for sure.

However, I think we could make a stronger case for adding an attribute to middleware telemetry to know which receiver instance it's used in. This would be doable, though we would need to redesign the middleware API to pass in a new `TelemetrySettings` with the appropriate attributes for each call to `GetHTTPHandler` / `GetGRPCServerOptions`.
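Editor's note: purely as a hypothetical sketch of the redesign mentioned above, not the current `extensionmiddleware` API, one could imagine a per-receiver `TelemetrySettings` being passed into each handler request. The method signature below is an assumption for illustration.

```go
// Hypothetical sketch of a middleware API that receives per-receiver
// telemetry settings. This is NOT the current Collector API; the method
// signature is an assumption for illustration only.
package middlewaresketch

import (
	"net/http"

	"go.opentelemetry.io/collector/component"
)

// HTTPServerMiddleware is a hypothetical variant of the middleware
// extension interface where each receiver passes its own
// TelemetrySettings (already carrying its component-identifying
// attributes) when requesting a handler, so middleware telemetry can be
// attributed to the receiver it serves.
type HTTPServerMiddleware interface {
	GetHTTPHandler(set component.TelemetrySettings, next http.Handler) (http.Handler, error)
}
```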
Thanks for the details again. Just to give some context, I am working with the middleware-based extension to add additional attributes to custom telemetry records without relying on some of the proposed alternatives like Baggage propagation, Processors, etc. (#12316).

> This is determined later by the receiver, after the request has been processed by the middleware.

Makes sense 👍🏽.

> However, I think we could make a stronger case for adding an attribute to middleware telemetry to know which receiver instance it's used in

That would be useful, since the middleware extension can basically run on any HTTP/gRPC receiver.

Do you think it's worth creating an issue for this so we can move the discussion there?
Yes, I think that would make sense. It would be good to make the contributors working on the middleware interface aware that this need exists.
#### Context

PR #12617, which implemented the injection of component-identifying attributes into the `zap.Logger` provided to components, introduced significant additional memory use when the Collector's pipelines contain many components (#13014). This was because we would call `zapcore.NewSamplerWithOptions` to wrap the specialized logger core of each Collector component, which allocates half a megabyte's worth of sampling counters.

This problem was mitigated in #13015 by moving the sampling layer to a different location in the logger core hierarchy. This meant that Collector users who do not export their logs through OTLP and only use stdout-based logs no longer saw the memory increase.

#### Description

This PR aims to provide a better solution to this issue, by using the `reflect` library to clone zap's sampler core and set a new inner core, while reusing the counter allocation. (This may also be "more correct" from a sampling point of view, i.e. we only have one global instance of the counters instead of one for console logs and one for each component's OTLP-exported logs, but I'm not sure if anyone noticed the difference anyway.)

#### Link to tracking issue

Fixes #13014

#### Testing

A new test was added which checks that the log counters are shared between two sampler cores with different attributes.
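Editor's note: the reflection approach described above can be sketched roughly as follows. This is a minimal sketch, not the Collector's actual implementation, and it relies on the assumption that zap's unexported sampler struct embeds its wrapped `zapcore.Core` as an exported embedded field.

```go
// Minimal sketch of cloning a sampler core while swapping its inner core,
// so the sampling counters are shared rather than reallocated per clone.
package samplersketch

import (
	"reflect"

	"go.uber.org/zap/zapcore"
)

// withInnerCore returns a shallow copy of sampler whose embedded Core
// field is replaced by inner; all other fields (including the sampling
// counters) are shared with the original. It falls back to returning
// inner if sampler does not have the assumed shape.
func withInnerCore(sampler, inner zapcore.Core) zapcore.Core {
	v := reflect.ValueOf(sampler)
	if v.Kind() != reflect.Pointer || v.Elem().Kind() != reflect.Struct {
		return inner
	}
	clone := reflect.New(v.Elem().Type()) // new pointer to the same struct type
	clone.Elem().Set(v.Elem())            // shallow copy: counters are shared
	field := clone.Elem().FieldByName("Core")
	if !field.IsValid() || !field.CanSet() {
		return inner
	}
	field.Set(reflect.ValueOf(inner))
	out, ok := clone.Interface().(zapcore.Core)
	if !ok {
		return inner
	}
	return out
}
```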
#### Description
Fork of #12384 to showcase how component attributes can be injected into scope attributes instead of log/metric/span attributes. See that PR for more context.
To see the diff from the previous PR, filter changes starting from the "Prototype using scope attributes" commit.
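Editor's note: for readers unfamiliar with scope attributes, the idea is that component-identifying keys are attached once to the instrumentation scope of the signal provider handed to a component, instead of being stamped onto every log record, metric point, or span. A minimal, hedged sketch of the general OpenTelemetry Go mechanism follows; this is not the Collector's internal wiring, and apart from `otelcol.signal` (mentioned in the conversation above) the attribute keys and values are illustrative.

```go
// Minimal sketch of attaching component-identifying attributes to an
// instrumentation scope rather than to individual telemetry records.
package scopesketch

import (
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
	"go.opentelemetry.io/otel/trace"
)

func scopedProviders(mp metric.MeterProvider, tp trace.TracerProvider) (metric.Meter, trace.Tracer) {
	scopeAttrs := []attribute.KeyValue{
		attribute.String("otelcol.component.id", "otlp"), // illustrative keys/values
		attribute.String("otelcol.component.kind", "receiver"),
		attribute.String("otelcol.signal", "logs"),
	}
	// Scope attributes are set once when obtaining the Meter/Tracer, so
	// every metric point or span produced through them carries the same
	// identifying scope, without per-record attributes.
	meter := mp.Meter("scopesketch",
		metric.WithInstrumentationAttributes(scopeAttrs...))
	tracer := tp.Tracer("scopesketch",
		trace.WithInstrumentationAttributes(scopeAttrs...))
	return meter, tracer
}
```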
#### Link to tracking issue
Resolves #12217
Also incidentally resolves #12213 and resolves #12117
#### Testing
I updated the existing tests to check for scope attributes, and did some manual testing with a debug exporter to check that the scope attributes are added/removed properly.