Spec update for callbacks on ConfigProvider to support runtime changes #4900

Open
jackshirazi wants to merge 5 commits into open-telemetry:main from
jackshirazi:config-provider-callback

Conversation

@jackshirazi

Fixes #4899

Changes

This PR extends specification/configuration/api.md to define a language-neutral
ConfigProvider change-listener contract for runtime declarative configuration updates.

Spec updates include:

  • adding Add config change listener as a required ConfigProvider operation
  • defining watched-path requirements (absolute declarative path, exact-match semantics)
  • defining callback payload semantics (path + updated ConfigProperties)
  • clarifying empty/unset behavior (newConfig is a valid instance representing an empty mapping node when unset/cleared)
  • defining delivery semantics (coalescing allowed, ordering unspecified)
  • defining lifecycle/concurrency behavior (idempotent close, post-close behavior, concurrency expectations)
  • defining error/unsupported-provider behavior (listener failure isolation, no-op registration when notifications are unsupported)
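
The contract listed above can be illustrated with a minimal sketch. This is not the spec text or any SDK's API: every name here (`InMemoryConfigProvider`, `add_config_change_listener`, `publish`, `Registration`, the `example_key` setting) is hypothetical, the leading-dot check is only an assumption drawn from the example paths, and thread safety is deliberately left out.

```python
from typing import Any, Callable, Dict, List, Mapping, Optional, Tuple

# Hypothetical listener shape: (watched path, updated config) -> None.
ConfigChangeListener = Callable[[str, Mapping[str, Any]], None]


class Registration:
    """Handle returned on registration; close() may be called repeatedly."""

    def __init__(self, on_close: Callable[[], None]) -> None:
        self._on_close = on_close
        self._closed = False

    def close(self) -> None:
        if self._closed:
            return  # idempotent: later calls do nothing
        self._closed = True
        self._on_close()


class InMemoryConfigProvider:
    def __init__(self) -> None:
        self._listeners: Dict[str, List[ConfigChangeListener]] = {}

    def add_config_change_listener(
        self, path: str, listener: ConfigChangeListener
    ) -> Registration:
        if not path.startswith("."):  # assumption: absolute paths start with "."
            raise ValueError("watched path must be absolute")
        self._listeners.setdefault(path, []).append(listener)
        return Registration(lambda: self._listeners[path].remove(listener))

    def publish(self, path: str, new_config: Optional[Mapping[str, Any]]) -> None:
        """Notify listeners registered for exactly this path."""
        payload = dict(new_config) if new_config else {}  # never None
        for listener in list(self._listeners.get(path, [])):
            try:
                listener(path, payload)
            except Exception:
                pass  # isolate the failure and keep notifying the others


HTTP_PATH = ".instrumentation/development.general.http"
seen: List[Tuple[str, Dict[str, Any]]] = []


def failing_listener(path: str, new_config: Mapping[str, Any]) -> None:
    raise RuntimeError("listener bug")


provider = InMemoryConfigProvider()
provider.add_config_change_listener(HTTP_PATH, failing_listener)
handle = provider.add_config_change_listener(
    HTTP_PATH, lambda path, cfg: seen.append((path, dict(cfg)))
)

provider.publish(HTTP_PATH, {"example_key": True})  # delivered
provider.publish(HTTP_PATH, None)                   # cleared: empty mapping
provider.publish(".instrumentation/development.general", {"x": 1})  # no exact match
handle.close()
handle.close()                                      # second close is a no-op
provider.publish(HTTP_PATH, {"ignored": True})      # not delivered after close
```

In this sketch the second listener receives two notifications even though the first listener always raises, and nothing arrives after its handle is closed.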

Comment on lines +84 to +86
* API implementations SHOULD document accepted path syntax in language-specific
docs and include examples such as `.instrumentation/development.general.http`
and `.instrumentation/development.java.methods`.
Member

these could probably be standard across languages

worth noting whether traversing through arrays is supported

Author

I gave one standard example and one language-specific example; I'm fine with different examples.

I've added a line (just before this) about arrays, thanks!

Member

Oh, I meant the part about

> implementations SHOULD document accepted path syntax

Were you thinking that, e.g., Java might use the `.instrumentation/development.general.http` path syntax, while another language might use something else, e.g. `instrumentation/development->general->http`?

Author

Yes, okay, I see what you meant: the whole path syntax should be standardized. Is it standardized in declarative config across languages? If so, then yes, let's specify a standard path syntax accordingly.

Member

@open-telemetry/configuration-approvers what do you think? thanks

Member

I think it would be good to standardize on something like JSONPath:

Or maybe some sort of abbreviated / subset of the syntax which achieves the goal while keeping the implementation burden reasonable.

Co-authored-by: Trask Stalnaker <trask.stalnaker@gmail.com>
jackshirazi and others added 2 commits February 25, 2026 23:10
Co-authored-by: Trask Stalnaker <trask.stalnaker@gmail.com>
* `newConfig` MUST be a valid [`ConfigProperties`](#configproperties) instance
(never null/nil/None).
* If the watched node is unset or cleared, `newConfig` MUST represent an empty
mapping node (equivalent to `{}`).
Member

Set and empty vs. unset turns out to be semantically meaningful in declarative config:

# this is valid
tracer_provider:
  processors:
    - simple:
        exporter:
          console:
---
# this is invalid
tracer_provider:
  processors:
    - simple:
        exporter:

I think we need to find some way to signal this difference to watchers.
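
To make the distinction concrete, here is a small sketch of one hypothetical way a payload could carry it: a `present` flag alongside the always-valid mapping. The `WatchedNode` type and `resolve` helper are illustrative only, and the trees are simplified nested dicts of the kind a YAML parser produces (an empty value parses to `None`), not the full configuration schema.

```python
from collections.abc import Mapping
from typing import Any, NamedTuple, Sequence


class WatchedNode(NamedTuple):
    present: bool        # True when the key exists in the tree, even if empty
    properties: Mapping  # always a mapping; {} when the node is empty or absent


def resolve(tree: Any, keys: Sequence[str]) -> WatchedNode:
    """Walk nested mappings and report whether the final key is set at all."""
    node = tree
    for key in keys:
        if not isinstance(node, Mapping) or key not in node:
            return WatchedNode(False, {})
        node = node[key]
    # A key that is set but empty (None) is reported as present with {}.
    return WatchedNode(True, node if isinstance(node, Mapping) else {})


# Simplified stand-ins for the two YAML documents in the comment above.
valid_tree = {"simple": {"exporter": {"console": None}}}
invalid_tree = {"simple": {"exporter": None}}

keys = ["simple", "exporter", "console"]
set_and_empty = resolve(valid_tree, keys)
unset = resolve(invalid_tree, keys)
```

Both results carry an empty mapping, so a watcher that only receives `{}` cannot tell them apart; in this sketch only the `present` flag differs.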

operations.
* Implementations MUST document callback concurrency guarantees. If they do not,
users MUST assume callbacks may be invoked concurrently.
* Closing a registration handle MUST unregister the listener.
Member

Need to indicate that a callback is required to have a close operation before specifying behavior for a close operation.


* If callback execution throws an exception, implementations SHOULD isolate the
failure to that callback and SHOULD continue notifying other callbacks.
* If a provider does not support change notifications, registration MUST still
Member

This is defining the "noop" behavior of this operation. Elsewhere in the spec we have extracted dedicated noop documents (e.g. metrics noop). It may be time to do the same for the declarative config API.

The `ConfigProvider` MUST provide the following functions:

* [Get instrumentation config](#get-instrumentation-config)
* [Add config change listener](#add-config-change-listener)
Member

#nit: could change listener be referring to something other than config? If not, consider dropping.

Suggested change
- * [Add config change listener](#add-config-change-listener)
+ * [Add change listener](#add-config-change-listener)

@jack-berg
Member

As declarative config integrates more tightly into the otel java agent, and as we start looking towards dynamic config solutions like #4738, I think a capability to allow instrumentation to respond to changes in config is essential.

Based on #4889, only PHP and Java have implemented the ConfigProvider API. So curious if @Nevay / @brettmc have identified any need for this.

As for other declarative config implementers, @codeboten, @MikeGoldsmith, @maryliag, @Kielek, @ysolomchenko, @marcalff: even if you haven't implemented the ConfigProvider API yet, does this use case of listening for config changes resonate with you?

@MikeGoldsmith
Member

MikeGoldsmith commented Mar 3, 2026

A way to watch a config and automatically reload would be welcome to remove the need to restart a service to pick up new changes. I don't think it would be a hard requirement though.

@Kielek
Member

Kielek commented Mar 3, 2026

@jack-berg, the hot reload functionality sounds great, but it should not be marked as required functionality. It should be up to the technology to decide if it can be implemented or not.

I suppose that partial support can also be considered, with information returned to the configuration provider (OpAMP?) that some settings cannot be applied without a process restart.

@pellared
Member

pellared commented Mar 3, 2026

Is there any prototype for this?

@pellared
Member

pellared commented Mar 3, 2026

> @jack-berg, the hot reload functionality sounds great, but it should not be marked as required functionality. It should be up to the technology to decide if it can be implemented or not.
>
> I suppose that partial support can also be considered, with information returned to the configuration provider (OpAMP?) that some settings cannot be applied without a process restart.

What is more, making (especially everything) "hot reload" will make the SDK less efficient because of required additional synchronization.

In my opinion, the prototype should include extensive benchmarks. I am worried that this is going to add more synchronization on the hot path.

@jack-berg
Member

> What is more, making (especially everything) "hot reload" will make the SDK less efficient because of required additional synchronization.

The hot reload proposed here is limited to ConfigProvider, the API portion of declarative config which instrumentations use for configuration. So it's not on the hot path of the internals of the SDK, but the synchronization would still be on the hot path for each individual instrumentation. I.e., if an HTTP instrumentation supports dynamic config, it would have to synchronize the logic that determines if / which HTTP request / response headers to capture (amongst other things).

This conversation reminds me of the one in #4645. I initially pushed back, favoring eventual visibility without guarantees for performance reasons, but was ultimately convinced that an additional 0.8 ns per record operation was low enough overhead to not worry about. I believe the same level of synchronization and overhead would occur here as a result of instrumentation config changing.
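
As a rough illustration of what that per-instrumentation synchronization could look like, the sketch below publishes an immutable snapshot with a single reference assignment and reads it once per request. The class, the `captured_request_headers` key, and the path are all hypothetical. In runtimes with weaker memory-visibility guarantees than CPython, the snapshot field would need to be a volatile or atomic reference; the sketch makes no claim about the resulting overhead.

```python
from typing import Any, Dict, FrozenSet, Mapping


class HttpInstrumentation:
    """Hypothetical instrumentation that records a configurable set of headers."""

    def __init__(self) -> None:
        self._captured: FrozenSet[str] = frozenset()

    def on_config_change(self, path: str, new_config: Mapping[str, Any]) -> None:
        # "captured_request_headers" is an illustrative key, not a schema name.
        names = new_config.get("captured_request_headers", [])
        # Build the new immutable set first, then publish it with one
        # reference assignment so readers never observe a half-updated set.
        self._captured = frozenset(name.lower() for name in names)

    def headers_to_record(self, request_headers: Mapping[str, str]) -> Dict[str, str]:
        captured = self._captured  # read the current snapshot once per request
        return {k: v for k, v in request_headers.items() if k.lower() in captured}


HTTP_CONFIG_PATH = ".instrumentation/development.general.http"
instrumentation = HttpInstrumentation()
request = {"X-Request-Id": "abc", "Authorization": "secret"}

before = instrumentation.headers_to_record(request)
instrumentation.on_config_change(
    HTTP_CONFIG_PATH, {"captured_request_headers": ["x-request-id"]}
)
after = instrumentation.headers_to_record(request)
instrumentation.on_config_change(HTTP_CONFIG_PATH, {})  # node cleared
cleared = instrumentation.headers_to_record(request)
```

The request path takes no lock here; the only shared state is the one reference that the callback replaces.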

> Is there any prototype for this?

@jackshirazi has been sketching out the API here. Notably, there is no SDK implementation, nor proposed SDK spec here. I think that needs to change.

> @jack-berg, the hot reload functionality sounds great, but it should not be marked as required functionality.

Yeah we should talk about this. Besides the potential performance overhead from runtime changes to instrumentation config, there's also the additional complexity required. Even if every language supported the ability to watch for changes, we can't force every instrumentation to call those watch APIs (although we could encourage, similar to how we don't force semantic conventions but encourage). What does it mean for the UX if only some instrumentation is written to be responsive to runtime changes?

Co-authored-by: Jack Berg <34418638+jack-berg@users.noreply.github.com>
@jackshirazi
Author

Runtime changes are for a few select components. The TelemetryPolicy that this aligns to does not expect all components to reload, nor even that they be able to do so. The intention is that IF some component is enabled to handle runtime changes, THEN there is a mechanism for it to receive those changes. For Java, there are maybe a dozen components that will be implemented to adapt to runtime configuration changes, and at the moment only one instrumentation is proposed to adapt to runtime changes. This is very targeted.

  • Only the components that are interested in runtime config changes will add a callback for the path they are interested in; this is always likely to be a small set
  • The TelemetryPolicy pipeline that handles runtime config changes will only accept changes that are configured to be implemented
  • For an SDK, these are expected to be rare events (you only occasionally reconfigure the agent; e.g., for the most common example of changing the sampling rate, you might change it at most a few times over the day)
  • By nature, config changes are not expected to be applied instantaneously, especially since the main impetus is for a remote central config to provide changes. An eventually consistent approach is fine
  • The biggest overhead that I can see is where there is a mismatch in level between the path registered for a callback and the path used to make a config change. If we specify the path to be a standardized string path with dot separators per the example, it becomes a substring match, which eliminates that overhead
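
A small sketch of how these points could fit together, under stated assumptions: listeners are keyed by the full path string, pending updates are coalesced per path, and delivery happens later in a batch. It uses an exact-string dictionary lookup (following the exact-match wording in the PR description) rather than a substring match, and the `CoalescingNotifier` class, the sampler path, and the `ratio` key are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Mapping, Tuple

Listener = Callable[[str, Mapping[str, Any]], None]


class CoalescingNotifier:
    def __init__(self) -> None:
        self._listeners: Dict[str, List[Listener]] = {}
        self._pending: Dict[str, Dict[str, Any]] = {}

    def watch(self, path: str, listener: Listener) -> None:
        self._listeners.setdefault(path, []).append(listener)

    def submit(self, path: str, new_config: Mapping[str, Any]) -> None:
        # A later update to the same path replaces the earlier pending one.
        self._pending[path] = dict(new_config)

    def flush(self) -> int:
        """Deliver pending updates and return how many callbacks were invoked."""
        pending, self._pending = self._pending, {}
        delivered = 0
        for path, config in pending.items():
            # Exact-match lookup on the path string: one dictionary access.
            for listener in self._listeners.get(path, ()):
                listener(path, config)
                delivered += 1
        return delivered


SAMPLER_PATH = ".tracer_provider.sampler"  # hypothetical path
received: List[Tuple[str, Dict[str, Any]]] = []

notifier = CoalescingNotifier()
notifier.watch(SAMPLER_PATH, lambda path, cfg: received.append((path, dict(cfg))))

notifier.submit(SAMPLER_PATH, {"ratio": 0.5})
notifier.submit(SAMPLER_PATH, {"ratio": 0.1})        # coalesced with the previous one
notifier.submit(".tracer_provider.limits", {"x": 1})  # nobody is watching this path
delivered_count = notifier.flush()
```

Only the latest sampler value reaches the listener, and paths with no watcher cost one failed dictionary lookup at flush time.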

Development

Successfully merging this pull request may close these issues.

Define ConfigProvider change-listener API for runtime config updates

6 participants