Bound instruments #5050

Open
jack-berg wants to merge 10 commits into open-telemetry:main from jack-berg:bound-instruments

@jack-berg
Member

@jack-berg jack-berg commented Apr 24, 2026

Resolves #4126.

See issue for full details, but some of the key bits:

  • Adds new optional "bind" capability to synchronous instruments
  • Marked as "in development"
  • SDK implementation requirements phrase it as "A bound instrument MUST behave identically to calling the equivalent unbound recording operation with the pre-bound Attributes on each measurement". That is, the same behavior with respect to cardinality limits, exemplars, and anything else relevant. Since attributes are pre-bound, this excludes attributes processing from views, which is done at bind time.
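The equivalence in the last bullet can be illustrated with a toy counter. This is a minimal sketch, not the OpenTelemetry API: `Counter`, `BoundCounter`, `bind`, and `collect` are hypothetical names, and view processing, cardinality limits, and exemplars are left out. It only shows that resolving the series once at bind time yields the same result as passing the attributes on every call.

```python
from collections import defaultdict


class BoundCounter:
    """Hypothetical handle to one pre-resolved series."""

    def __init__(self, cell):
        self._cell = cell

    def add(self, value):
        # No attribute handling here; the series was resolved at bind time.
        self._cell[0] += value


class Counter:
    """Toy synchronous counter; not the OpenTelemetry API."""

    def __init__(self):
        # One mutable cell per attribute set (the "aggregator" for that series).
        self._sums = defaultdict(lambda: [0])

    @staticmethod
    def _key(attributes):
        # Attribute sets are unordered, so normalize before using them as a key.
        return tuple(sorted(attributes.items()))

    def add(self, value, attributes):
        # Unbound path: normalize and look up the series on every call.
        self._sums[self._key(attributes)][0] += value

    def bind(self, attributes):
        # Bound path: do the lookup once and hand back a handle to the cell.
        return BoundCounter(self._sums[self._key(attributes)])

    def collect(self):
        return {key: cell[0] for key, cell in self._sums.items()}


unbound = Counter()
unbound.add(5, {"route": "/home", "method": "GET"})
unbound.add(3, {"method": "GET", "route": "/home"})

bound_counter = Counter()
handle = bound_counter.bind({"route": "/home", "method": "GET"})
handle.add(5)
handle.add(3)
```

Both counters end up with a single series holding the same sum, which is the property the quoted requirement asks SDKs to preserve.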

There have been a few prototypes built:

I'm particularly interested to hear from @dashpole and @cijothomas if this aligns with their prototypes and ideas.

Other interest has been expressed in .NET and Erlang, but no prototypes that I'm aware of.

@jack-berg jack-berg requested review from a team as code owners April 24, 2026 17:37
Contributor

@dashpole dashpole left a comment

Overall, this matches what I had in mind

Comment thread specification/metrics/api.md
Comment thread specification/metrics/api.md Outdated
@dashpole
Contributor

Here was my original go prototype: open-telemetry/opentelemetry-go#7790

It was incomplete in a few ways, but the raw performance numbers are accurate. The baseline has improved due to other optimizations, so the comparison to baseline isn't accurate anymore. I also wouldn't count it as having "support of that SIG's maintainers".

Comment thread specification/metrics/api.md
Comment thread specification/metrics/api.md
Comment thread specification/metrics/sdk.md Outdated
Comment thread specification/metrics/sdk.md Outdated
Comment thread specification/metrics/sdk.md Outdated
Member

@cijothomas cijothomas left a comment

LGTM overall — clean, well-scoped addition.

A few minor comments inline. None are blockers for merging — happy to see this move forward and iterate on the details (lifecycle/unbind, delta temporality nuances) as the prototypes mature.

@cijothomas
Member

#618 Found this old discussion about UnBind(); there was support for it. Jack suggested "finish()" might be a good alternative as well.

Member

@bogdandrutu bogdandrutu left a comment

As an experimental feature this looks good; my only request would be to try to make the "bind" API return an equivalent instrument.

Comment thread specification/metrics/api.md
Comment thread specification/metrics/api.md Outdated
@bogdandrutu
Member

#618 Found this old discussion about UnBind(); there was support for it. Jack suggested "finish()" might be a good alternative as well.

Naming is hard, Unbind/Close/Finish/Unregister/etc. As stated there, we need to define what it means to use an "unregistered" instrument (bound).

@cijothomas
Member

#618 Found this old discussion about UnBind(); there was support for it. Jack suggested "finish()" might be a good alternative as well.

Naming is hard, Unbind/Close/Finish/Unregister/etc. As stated there, we need to define what it means to use an "unregistered" instrument (bound).

https://github.com/open-telemetry/opentelemetry-specification/pull/4702/changes This is the "finish" PR I was referring to. I think Jack meant this when he said "Finish()".

@bogdandrutu
Member

@cijothomas Agreed, only proposed alternative names.

Contributor

@dashpole dashpole left a comment

I would also prefer to return the same instrument interface to keep this simpler for users. But I'm OK with this as experimental to get feedback.

@jack-berg
Member Author

Thanks for the reviews / perspective @cijothomas, @bogdandrutu, @dashpole, @reyang!

I think we have some learning to do around ergonomics and appropriate use which will play out best by shipping and gathering feedback on prototype implementations.

I've seen a couple of similar perspectives, which I also share:

But I'm OK with this as experimental to get feedback.
As an experimental feature this looks good
happy to see this move forward and iterate on the details

Let's work through these conversations (and any others that pop up), and try to shoot for landing something that allows us to ship prototypes which are directionally aligned, but with enough leeway that we can experiment with things like naming, ergonomics, and anything else we might want to solicit feedback on before stabilizing sometime in the future.

@cijothomas
Member

#5050 (comment)
@jack-berg this is the only comment from me worth addressing in this PR. I already approved, so good to merge as-is also from my side.

Contributor

@MrAlias MrAlias left a comment

A few spec-level concerns remain before this reads as standardized behavior across SDKs.

The main gap is that the current wording still leaves too much room for materially different implementations around bind-time view processing, cardinality/eviction guarantees, and the API contract of the returned bound handle. The prototypes are useful here because they show those choices are not hypothetical: Java, Rust, and Go are already exploring different shapes and lifetime behaviors.

I left the concrete points inline.

Comment thread specification/metrics/sdk.md
Comment thread specification/metrics/sdk.md Outdated
Comment thread specification/metrics/api.md
bryantbiggs added a commit to bryantbiggs/opentelemetry-rust that referenced this pull request Apr 30, 2026
The previous Fallback path re-routed every bound.add() through the
unbound measure() path, creating a ~25x perf cliff at the cardinality
limit (~50ns vs ~1.8ns) and letting attribution drift across delta
eviction cycles as space freed and refilled.

ValueMap::bind() now looks up or lazily creates the overflow
TrackerEntry at the limit and returns a direct handle to it. The bound
handle writes directly to overflow via a single atomic update for its
lifetime — perf parity with a normal bind. To recover after delta
eviction frees space, drop the handle and rebind.

Spec-aligned with open-telemetry/opentelemetry-specification#5050:
overflow data lands in the otel.metric.overflow=true bucket (MUST)
and per-call attribute lookup is bypassed (SHOULD).

Bench (Apple M4 Max):
  Counter_Bound_AtOverflow_Delta:    1.82 ns
  Histogram_Bound_AtOverflow_Delta:  6.58 ns
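
The behavior this commit message describes can be sketched in a few lines. This is an illustration only, not the opentelemetry-rust code: `ValueMap`, `Entry`, `BoundHandle`, `release`, and `collect_delta` are hypothetical Python stand-ins, the limit counts only non-overflow series, and pinning entries with a handle count is an assumed way to keep a bound entry alive across delta eviction.

```python
OVERFLOW_KEY = (("otel.metric.overflow", True),)


class Entry:
    def __init__(self):
        self.total = 0
        self.handles = 0  # live bound handles pinning this entry (assumption)


class BoundHandle:
    def __init__(self, entry):
        self._entry = entry

    def add(self, value):
        # Direct write to the entry resolved at bind time; no map lookup.
        self._entry.total += value

    def release(self):
        self._entry.handles -= 1


class ValueMap:
    def __init__(self, limit):
        self._limit = limit  # max non-overflow series (simplified)
        self._entries = {}

    def _resolve(self, attributes):
        key = tuple(sorted(attributes.items()))
        regular = sum(1 for k in self._entries if k != OVERFLOW_KEY)
        if key not in self._entries and regular >= self._limit:
            key = OVERFLOW_KEY  # at the limit: route to the overflow entry
        return self._entries.setdefault(key, Entry())

    def measure(self, value, attributes):
        self._resolve(attributes).total += value

    def bind(self, attributes):
        entry = self._resolve(attributes)
        entry.handles += 1
        return BoundHandle(entry)

    def collect_delta(self):
        snapshot = {k: e.total for k, e in self._entries.items()}
        for e in self._entries.values():
            e.total = 0
        # Delta eviction: keep only entries that a live handle still points at.
        self._entries = {k: e for k, e in self._entries.items() if e.handles > 0}
        return snapshot


values = ValueMap(limit=1)
values.measure(1, {"user": "a"})     # takes the only regular slot
handle = values.bind({"user": "b"})  # at the limit, so this binds to overflow
handle.add(2)
first = values.collect_delta()       # "a" is evicted, overflow stays pinned

handle.add(4)                        # space is free, but the handle is fixed
second = values.collect_delta()

handle.release()                     # drop the handle...
handle = values.bind({"user": "b"})  # ...and rebind to get a dedicated series
handle.add(8)
third = values.collect_delta()
```

In this sketch the handle keeps writing to the overflow entry for as long as it lives, even after eviction frees a slot; only the drop-and-rebind step moves `{"user": "b"}` to its own series.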
@jack-berg
Member Author

jack-berg commented May 1, 2026

Pushed 12b46b8 to address a variety of the conversations here:

  • Clarified Bind allows dedicated type or reuse of existing interface
  • Removed Bound Add/Bound Record per-instrument subsections
  • Strengthened SDK: attribute processing and cardinality at bind time, fixed aggregator lifetime, MUST bypass map lookup

I marked a variety of conversations as resolved. Please re-open them if you don't think they're addressed or you disagree with the resolution.

Comment thread specification/metrics/sdk.md Outdated
Comment thread specification/metrics/api.md
Member

@cijothomas cijothomas left a comment

Re-approving with new changes. Looks solid, excited to see prototypes landing soon!

recording operations on the returned bound instrument negates the performance benefits
of binding.

Measurements recorded on the bound instrument can be associated with the
Member

Could this clarify how Context is associated with bound measurements?

The SDK section says a bound instrument should behave the same as calling the unbound instrument with the pre-bound attributes, including exemplar behavior. Since exemplars can depend on Context, it would help to spell out the intended behavior here.

For example, should bound.add(value) behave like counter.add(value, boundAttributes) using the normal/current Context behavior? And if a language exposes an explicit Context overload for unbound recordings, should it also expose an equivalent bound recording operation?

This would make it easier for SDKs to keep bound and unbound behavior consistent.
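
One possible reading of the question above, sketched under loud assumptions: the PR text does not confirm this behavior, and `current_trace_id`, `Counter`, `BoundCounter`, and the `trace_id` parameter are hypothetical stand-ins for Context and exemplar capture. The sketch reads the ambient context at record time on both paths and offers the same explicit override on both. The bound path delegates to the unbound one purely to make the equivalence obvious, which a real SDK would avoid for performance reasons.

```python
import contextvars

# Hypothetical stand-in for the active trace context; not an OpenTelemetry API.
current_trace_id = contextvars.ContextVar("current_trace_id", default=None)


class BoundCounter:
    def __init__(self, counter, attributes):
        self._counter = counter
        self._attributes = attributes

    def add(self, value, trace_id=None):
        # Context is read at record time, not captured at bind time.
        self._counter.add(value, self._attributes, trace_id)


class Counter:
    def __init__(self):
        self.exemplars = []  # (value, attributes, trace_id)

    def add(self, value, attributes, trace_id=None):
        # An explicit argument wins; otherwise fall back to the ambient context.
        if trace_id is None:
            trace_id = current_trace_id.get()
        self.exemplars.append((value, dict(attributes), trace_id))

    def bind(self, attributes):
        return BoundCounter(self, dict(attributes))


counter = Counter()
bound = counter.bind({"route": "/home"})  # bound outside any trace

token = current_trace_id.set("trace-1")
counter.add(1, {"route": "/home"})        # unbound, ambient context
bound.add(1)                              # bound, same ambient context
current_trace_id.reset(token)

bound.add(1, trace_id="trace-2")          # bound, explicit override
```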

[Attribute processing](#measurement-processing) and [cardinality limit](#cardinality-limits)
evaluation MUST be performed at bind time. The resolved aggregator is fixed for the
lifetime of the bound instrument and does not change across collection cycles.

Member

This model makes sense to me. One small clarification that might help implementers: once a bound instrument resolves to an aggregator at bind time, it should keep writing to that same aggregator even if later measurements with other attribute sets hit the cardinality limit or overflow bucket.

In other words, existing bound handles are not re-evaluated or rerouted later; only new binds/new attribute sets would be affected by the current cardinality state.
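
The clarification above can be shown with a small cumulative example. It is a sketch with hypothetical names (`Counter`, `Sum`, `OVERFLOW`) and a simplified limit check, not any SDK's implementation: a handle bound before the limit is reached keeps its own aggregator, while a bind made afterwards resolves to overflow.

```python
OVERFLOW = "overflow"  # placeholder key for the overflow series


class Sum:
    """Toy aggregator that doubles as the bound handle."""

    def __init__(self):
        self.value = 0

    def add(self, value):
        self.value += value


class Counter:
    def __init__(self, limit):
        self._limit = limit
        self.series = {}

    def _aggregator(self, attributes):
        key = tuple(sorted(attributes.items()))
        # Cardinality is evaluated only when a new attribute set shows up.
        if key not in self.series and len(self.series) >= self._limit:
            key = OVERFLOW
        return self.series.setdefault(key, Sum())

    def add(self, value, attributes):
        self._aggregator(attributes).add(value)

    def bind(self, attributes):
        # Resolved once; the returned aggregator never changes afterwards.
        return self._aggregator(attributes)


counter = Counter(limit=2)
early = counter.bind({"user": "a"})  # bound while there is still room
counter.add(1, {"user": "b"})        # fills the second slot
counter.add(1, {"user": "c"})        # limit reached, goes to overflow

early.add(5)                         # still lands in the "a" series
late = counter.bind({"user": "d"})   # new bind sees the current state
late.add(2)                          # lands in overflow
```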

Development

Successfully merging this pull request may close these issues.

Add support for bound instruments to the metrics API

8 participants