[OTEP] Federate Semantic Conventions #4906

Draft
jsuereth wants to merge 2 commits into open-telemetry:main from jsuereth:wip-federated-semconv

@jsuereth
Contributor

DO NOT MERGE

This is a draft proposal to federate the semantic-conventions repository to allow for faster innovation and evolution across the ecosystem.

  • We retain clear ownership of semantic convention domains.
  • We retain clear policy and tooling support for instrumentation authors to provide comprehensive documentation, stability guarantees, and integration test assurance of the signals they produce.
  • We provide a clear "shared" location for cross-cutting attributes and definitions.
  • We provide a path for federated semantic conventions to be promoted to core, if desired.
  • We expand to let non-OpenTelemetry sources of conventions and standards advertise their instrumentation and leverage the same tooling as OpenTelemetry.

#### Registry Requirements

- The registry MUST declare a dependency on core semantic conventions.

Member

so that there is no chance of having conflicts / vendoring in an incompatible manner / etc?

I like it!

Member

@lmolkova left a comment

this is awesome!


- The registry MUST declare a dependency on core semantic conventions.
- The registry MUST use Dependabot or Renovate to keep dependencies up-to-date.
- The registry MUST enforce semantic convention policies via a GitHub workflow, e.g.

Member

nit: maybe explain what policies are (link to some section/doc), I imagine it's not a commonly known concept

- name: verify template packages
  run: weaver registry check \
    -r {my_registry_dir} \
    -p https://github.com/open-telemetry/opentelemetry-weaver-packages.git[policies/check/naming_conventions] \

Member

nit: is it the same one as the last one? (policies/check/naming)

#### Independent Versioning
- It releases `v2.0.0` of the `jvm` federated registry.
- This release is **completely independent** of the core `semconv` registry (which might still be at `v1.45.0`) and other registries like `http` or `messaging`.
- Users who want the new JVM metrics can opt-in by updating their instrumentation to point to the new `schema_url`.

Member

nit:

I believe we now have the opt-in mechanism that's slightly different than picking a schema url, it was just introduced in declarative config: https://github.com/open-telemetry/opentelemetry-configuration/blob/4351ebd2805746d047a23588f8d3abe89f40f79f/snippets/ExperimentalInstrumentation_kitchen_sink.yaml#L52

E.g.

instrumentation:
  general:
    rpc:
      semconv:
        version: 1
        experimental: false
        dual_emit: true

- This release is **completely independent** of the core `semconv` registry (which might still be at `v1.45.0`) and other registries like `http` or `messaging`.
- Users who want the new JVM metrics can opt-in by updating their instrumentation to point to the new `schema_url`.
- Existing users of `v1.x.x` are unaffected and continue to see the old OTLP output.
- **Policy Enforcement**: The federated registry uses a `weaver.yaml` configuration to enforce official OpenTelemetry policies (e.g., naming conventions, stability rules) even while iterating independently.

Member

nit: it's not part of weaver.yaml now, right? do we want to make it?

Or should we ship GH actions and/or have some shareable workflows that repos would reuse or copy-paste?


- **Pinning**: An instrumentation library MUST specify the `schema_url` of the federated registry version it targets in its `Scope` metadata, via `get {Meter|Tracer|Logger}` operations.
- **Breaking Changes**: If an instrumentation library adopts a new major version of a federated registry that results in breaking changes to its OTLP output, the library ITSELF must perform a major version bump. For example, if `opentelemetry-java-instrumentation` moves from `jvm/v1` to `jvm/v2`, it must release a new major version of its instrumentation package.
- **Stable by Default**: Following OTEP 4813, instrumentation can be marked as stable once its code and OTLP output are production-ready. This means marking any federated registry as stable in tandem with the library.

Member

Do we want to allow stable (federated) registry to depend on unstable parts of otel conventions?

I think we can go either way.
If we don't allow it: they can take dependency on stable parts only and vendor in unstable parts if necessary.
If we allow it, they should be able to communicate breaking changes in unstable core via semver.

Either way, we need some forcing factors for them to:

  • update to newer underlying core - it will be hard regardless
  • bring common concepts to core conventions

I'd rather not allow it, at least initially, and encourage vendoring in experimental stuff.


To solve the "cohesive whole" problem and provide obvious version conformance, OpenTelemetry will periodically publish **Platform Releases**.

A **Platform Release** is a manifest (using the schema format from OTEP 4815) that acts as a "BOM" (Bill of Materials). It does not contain new conventions itself but rather lists specific, tested-together versions of federated registries.

Member

This would probably be the forcing factor to update federated registries that I mentioned in my previous comment.
If library A depends on semconv core v1.40 and library B on semconv core v1.140 they probably can't coherently work in the same distro.

registries:
- schema_url: https://opentelemetry.io/schemas/semconv/1.42.0
- schema_url: https://opentelemetry.io/schemas/jvm/2.1.0
- schema_url: https://opentelemetry.io/schemas/http/1.15.0

Member

nit: would we separate HTTP from core conventions? Probably not until it needs v2


A **Platform Release** is a manifest (using the schema format from OTEP 4815) that acts as a "BOM" (Bill of Materials). It does not contain new conventions itself but rather lists specific, tested-together versions of federated registries.

**Example Platform Release Manifest (`OpenTelemetry 2026.1`):**

Member

This is awesome!

I also like the year as a major version - we should create expectation of some breaking changes on a predictable cadence

Reviewer

This is date-based versioning, no? This is not semver.

The Semantic Conventions SIG maintains the root `opentelemetry.io/schemas/` namespace. Any new "official" federated registry (e.g., `/jvm`, `/http`) must be an approved OpenTelemetry project and will use tooling provided for federated semantic convention SIGs.

For third-party or experimental registries, authors are encouraged to use their own domains (e.g., `acme.com/schemas/`) to avoid collisions. The `weaver` tool will also validate that a registry does not redefine attributes or signals already present in its dependencies, so any OpenTelemetry registry MUST depend on the core `semantic-conventions` registry.

Member

we should maintain a list of federated conventions in semconv (or somewhere) for discoverability. And using it we can even have a weekly check that validates all federated registries together to find conflicts or tests them individually against latest core. If some collisions are found, we could automatically create issues and notify maintainers about conflicts.

It would also be a good forcing factor for conventions to stay in sync with core when they can.
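
A periodic cross-registry check like the one suggested here could be sketched roughly as follows. This is only an illustration in Python, not how `weaver` performs its validation: the `find_collisions` helper, the in-memory registry layout, and the `acme` registry contents are all assumptions made for the example.

```python
from collections import defaultdict


def find_collisions(registries):
    """Return attribute names that are defined by more than one registry.

    `registries` maps a registry name to the set of attribute names it
    defines. This in-memory layout is hypothetical; real registries would
    first be loaded from their published definitions.
    """
    owners = defaultdict(set)
    for registry, attributes in registries.items():
        for attribute in attributes:
            owners[attribute].add(registry)
    # Keep only the names claimed by two or more registries.
    return {name: sorted(regs) for name, regs in owners.items() if len(regs) > 1}


# Illustrative registry contents only.
registries = {
    "core": {"server.address", "error.type"},
    "jvm": {"jvm.memory.type", "jvm.gc.name"},
    "acme": {"acme.tenant.id", "error.type"},
}
collisions = find_collisions(registries)
```

A scheduled job could run a check of this kind over the list of known federated registries and open an issue for each non-empty result.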

Contributor Author

Added


1. **Incubating**: Registry exists outside the core, managed by a specific SIG. It uses a unique namespace, both for schema_url (e.g., `opentelemetry.io/schemas/jvm`) and for signals/attributes (e.g. `jvm`).
2. **Maturity Progression**: The federated registry progresses through `development` -> `beta` -> `stable` according to its own usage and feedback.
3. **Criteria for Promotion**: To be merged into core `semconv`, a federated registry MUST:

Member

[update] I see you have it addressed later

assuming it's not required, what would be the motivation to even start bringing semconv to core?

I can imagine these:

  • offload maintenance of a stable thing to core repo
  • need to align between different federated repos (e.g. align db server with db client)
  • need to align compatibility for platform release

But unless something starts to break, I don't see people being too interested in this work. And I think it's usually fine, but it's likely that we'll grow into 5 different versions of some conventions across 5 different repos.

We should probably require sigs like GenAI to be language-agnostic to avoid it.

Contributor Author

Absolutely agree. I think that's the inevitability of federating and we'll need to balance cost / benefit here. I expect we'll have two types of federation:

  • jvm / go runtime like things where it's really a specific technology that needs to be addressed. Unlikely we'll ever need to merge this back in.
  • GenAI, db, http type things where they need to be cross-language and cross-cutting. These may want to come back into core, so their attributes can be re-used in further federated registries without as much dependency hell, but they do not need to.


As OpenTelemetry's semantic conventions expand, a monolithic registry and versioning scheme create friction:
1. **Slow Evolution**: Highly specialized or domain-specific conventions (e.g., JVM metrics, cloud-provider-specific resources) are often gated by the slower stabilization process of the core registry.
2. **Coupled Breaking Changes**: A major version bump in one sub-domain (e.g., a total overhaul of database conventions) should not force the entire OpenTelemetry ecosystem to adopt a major version bump.

Reviewer

How does this propagate to natively instrumented libraries, and what incentive do they have to move to newer conventions? Is there always a transform layer running?

As OpenTelemetry's semantic conventions expand, a monolithic registry and versioning scheme create friction:
1. **Slow Evolution**: Highly specialized or domain-specific conventions (e.g., JVM metrics, cloud-provider-specific resources) are often gated by the slower stabilization process of the core registry.
2. **Coupled Breaking Changes**: A major version bump in one sub-domain (e.g., a total overhaul of database conventions) should not force the entire OpenTelemetry ecosystem to adopt a major version bump.
3. **Instrumentation Stability**: Instrumentation libraries need a clear way to declare stability for their OTLP output by pinning to specific versions of the conventions they implement, regardless of whether those conventions are "core" or "federated".

Reviewer

are conventions in a particular version all uniformly considered "stable"?


## Goals

1. **Independent Lifecycle**: Enable domain-specific semantic convention registries to have their own SemVer lifecycle.

Reviewer

What are the rules of semver here? For example, dropping an attribute is "breaking", but what if it was optional?


1. **Independent Lifecycle**: Enable domain-specific semantic convention registries to have their own SemVer lifecycle.
2. **Instrumentation Pinning**: Allow instrumentation libraries to declare stability by pinning to specific federated registry versions.
3. **Platform Releases**: Provide a mechanism (Platform Releases) to bundle specific versions of federated registries into a "tested-together" cohesive set.

Reviewer

What are the acceptance criteria for "tested together"? What if an instrumentation fails: is it removed from the release?

1. **Independent Lifecycle**: Enable domain-specific semantic convention registries to have their own SemVer lifecycle.
2. **Instrumentation Pinning**: Allow instrumentation libraries to declare stability by pinning to specific federated registry versions.
3. **Platform Releases**: Provide a mechanism (Platform Releases) to bundle specific versions of federated registries into a "tested-together" cohesive set.
4. **Promotion Path**: Define a clear path for federated conventions to be consolidated into the core OpenTelemetry registry.

Reviewer

What if end-users want to opt out of expensive / PII-risking conventions? Do these still land in core? What is the final say on whether something deserves to be in core? I'm unfamiliar here.

Contributor Author

We allow expensive / PII in core, with opt_in today and there's guidance around it. I think this is an orthogonal concern to this proposal, but an important issue that needs a solution across various components of OpenTelemetry.

For this proposal, we continue with the current behavior we have.

Comment on lines +27 to +28
**The JVM Metrics Example**:
The JVM Metrics registry (e.g., `opentelemetry.io/schemas/jvm`) identifies a need to overhaul its metric names to align with a new runtime standard.

Reviewer

this is a good example, but also centralized with the maintainers of Java. So it seems like a simpler example? It's harder to do this when there is no central vendor.

The JVM Metrics registry (e.g., `opentelemetry.io/schemas/jvm`) identifies a need to overhaul its metric names to align with a new runtime standard.

#### Registry Structure
A federated registry like `jvm-metrics` would contain a manifest, the convention definitions, and a policy enforcement GitHub Action.

Reviewer

Where do these actions run: on OTel repos only? How do they run on native libraries?

Contributor Author

So we only "enforce" this structure for OTEL-owned repositories/conventions.

For native libraries, we provide open-telemetry/weaver as a tool they can use to participate in this federation, but we do not enforce any structure / policy unless they are planning to move their definitions back into the OpenTelemetry project.

#### Registry Requirements

- The registry MUST declare a dependency on core semantic conventions.
- A stable federated registry MUST NOT depend on unstable or experimental core conventions.

Reviewer

what if the federated conventions need the unstable conventions for it to make sense - example being session / threading of agents.

Contributor Author

Then the federated convention would be marked "unstable" as well. Basically you can't declare a "child" convention stable unless the "parent" is also stable.

This doesn't preclude usage, just stability declaration.
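
As a rough illustration of that rule, here is a minimal Python sketch. The `Convention` type, the `can_declare_stable` helper, and the example convention names are hypothetical; only the stability levels (`development`, `beta`, `stable`) come from the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class Convention:
    """Hypothetical model of a convention and the conventions it depends on."""
    name: str
    stability: str  # "development", "beta", or "stable"
    parents: list = field(default_factory=list)


def can_declare_stable(convention):
    """A child may be declared stable only if every parent, transitively, is stable."""
    return all(
        parent.stability == "stable" and can_declare_stable(parent)
        for parent in convention.parents
    )


# Illustrative names only.
core_stable = Convention("core.stable_part", "stable")
core_unstable = Convention("core.unstable_part", "development")
jvm = Convention("jvm.metrics", "beta", [core_stable])
agents = Convention("agents.threading", "beta", [core_stable, core_unstable])

jvm_ok = can_declare_stable(jvm)        # all parents stable
agents_ok = can_declare_stable(agents)  # blocked by the unstable parent
```

Nothing here stops `agents` from being used; the check only gates the stability declaration.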

#### Independent Versioning
- It releases `v2.0.0` of the `jvm` federated registry.
- This release is **completely independent** of the core `semconv` registry (which might still be at `v1.45.0`) and other registries like `http` or `messaging`.
- Users who want the new JVM metrics can opt-in by utilizing declarative configuration mechanisms (e.g., `version: 2`, `dual_emit: true`).

Reviewer

Which version does dual_emit prioritize?

Contributor Author

In OpenTelemetry - dual_emit will fire both conventions, i.e. you'll get both attributes, multiple metrics, etc.

@trask or @lmolkova may be able to speak more to how that works in practice.
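
To make "fire both conventions" concrete, here is a small Python sketch of the idea. It assumes a hypothetical `select_metric_names` helper and made-up metric names, and it is not how any OpenTelemetry SDK or instrumentation actually implements `dual_emit`.

```python
def select_metric_names(name_v1, name_v2, version=1, dual_emit=False):
    """Return the metric names to record under for the given settings.

    Hypothetical helper: with dual_emit both names are used, so neither
    version is prioritized; otherwise only the selected version is used.
    """
    if dual_emit:
        return [name_v1, name_v2]
    return [name_v2] if version == 2 else [name_v1]


# Made-up metric names, for illustration only.
OLD_NAME = "jvm.example.old_name"
NEW_NAME = "jvm.example.new_name"

both = select_metric_names(OLD_NAME, NEW_NAME, version=1, dual_emit=True)
only_v2 = select_metric_names(OLD_NAME, NEW_NAME, version=2)
only_v1 = select_metric_names(OLD_NAME, NEW_NAME)
```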

- This release is **completely independent** of the core `semconv` registry (which might still be at `v1.45.0`) and other registries like `http` or `messaging`.
- Users who want the new JVM metrics can opt-in by utilizing declarative configuration mechanisms (e.g., `version: 2`, `dual_emit: true`).
This should implicitly update the `schema_url` of telemetry, allowing users to see which version is which and address versioning issues.
- Existing users of `v1.x.x` are unaffected and continue to see the old OTLP output.

Reviewer

Which users are we talking about? The end-user?

Contributor Author

Yes

Instrumentation libraries "own" the stability of the OTLP they produce. To maintain this stability:

- **Pinning**: An instrumentation library MUST specify the `schema_url` of the federated registry version it targets in its `Scope` metadata, via `get {Meter|Tracer|Logger}` operations.
- **Breaking Changes**: If an instrumentation library adopts a new major version of a federated registry that results in breaking changes to its OTLP output, the library ITSELF must perform a major version bump. For example, if `opentelemetry-java-instrumentation` moves from `jvm/v1` to `jvm/v2`, it must release a new major version of its instrumentation package.

Reviewer

This is not so simple. Instrumentation now depends on two versions: the library it instruments AND the semconv that it tracks. What if the instrumentation needs to move a major version for library changes, then subsequently needs to backfill support for an older version? Now there is no tenable way to track both semconv breaking changes and library breaking changes.

Contributor Author

I think I intend this to mean the opposite, or at least I don't see them as coupled.

  • If you make a breaking change in semconv, you need a major version bump on your library.
  • You may need major version bump on your library for other reasons, you do NOT need to bump semconv major version for this.

The key here is folks understand they need to do the first, and that the output signals of an instrumentation library are part of its stability.

Regarding tracking both semconv breaking + library breaking changes - I wanted to keep this simpler where you mostly just pay attention to library major version bumps, and then schema_url and backends can help you handle incompatibilities / conversions downstream.
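
The first bullet could be checked mechanically along these lines. This Python sketch assumes the `<host>/schemas/<registry>/<version>` URL layout used in the example manifest and treats a major version increase as a stand-in for "breaking change to OTLP output"; the helper names are hypothetical.

```python
from urllib.parse import urlparse


def parse_schema_url(schema_url):
    """Split e.g. https://opentelemetry.io/schemas/jvm/2.1.0 into ("jvm", (2, 1, 0)).

    Assumes the path ends in /<registry>/<major>.<minor>.<patch>.
    """
    parts = urlparse(schema_url).path.strip("/").split("/")
    registry, version = parts[-2], parts[-1]
    return registry, tuple(int(piece) for piece in version.split("."))


def needs_library_major_bump(old_schema_url, new_schema_url):
    """True when the library moves to a newer major version of the same registry."""
    old_registry, old_version = parse_schema_url(old_schema_url)
    new_registry, new_version = parse_schema_url(new_schema_url)
    if old_registry != new_registry:
        raise ValueError("schema URLs refer to different registries")
    return new_version[0] > old_version[0]


major_change = needs_library_major_bump(
    "https://opentelemetry.io/schemas/jvm/1.4.0",
    "https://opentelemetry.io/schemas/jvm/2.0.0",
)
minor_change = needs_library_major_bump(
    "https://opentelemetry.io/schemas/jvm/2.0.0",
    "https://opentelemetry.io/schemas/jvm/2.1.0",
)
```

The check is deliberately one-directional: a library may still bump its own major version for unrelated reasons without any change to the schema URL it targets.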


- **Pinning**: An instrumentation library MUST specify the `schema_url` of the federated registry version it targets in its `Scope` metadata, via `get {Meter|Tracer|Logger}` operations.
- **Breaking Changes**: If an instrumentation library adopts a new major version of a federated registry that results in breaking changes to its OTLP output, the library ITSELF must perform a major version bump. For example, if `opentelemetry-java-instrumentation` moves from `jvm/v1` to `jvm/v2`, it must release a new major version of its instrumentation package.
- **Stable by Default**: Following OTEP 4813, instrumentation can be marked as stable once its code and OTLP output are production-ready. This means marking any federated registry as stable in tandem with the library.

@mikeldking commented on Mar 3, 2026

What if the library is unstable?

Contributor Author

We would allow semconv to be marked as stable prior to libraries being fully stable. This is more to prevent the opposite in otel - libraries which are production-ready NOT declaring stability, because of a looming threat of semconv changes.

No. Engaging with `schema_url` and federated registries is optional. OpenTelemetry continues to work as-is for users who do not require automated schema transformation or validation. Existing observability ecosystems suffer from these same semantic fragmentation issues today, but they are typically only discovered at the storage or query layer. This proposal provides the metadata necessary to *address* these issues upstream, but it does not mandate that every SDK or Collector component must be schema-aware.

### 8. Won't this lead to "version fatigue" if instrumentation libraries have to major bump frequently to adopt federated registry changes?
No. Instrumentation libraries in OpenTelemetry are subject to the [specification's versioning and stability policies](/specification/versioning-and-stability.md). These policies strongly discourage frequent major version bumps and require minimum support periods (e.g., one year for contrib/instrumentation packages) for older major versions. This inherent bias towards stability ensures that instrumentation libraries do not adopt breaking changes from federated registries at a high cadence. Instead, maintainers will typically batch such changes or only adopt them when the value to users clearly outweighs the significant cost of a major release and the subsequent long-term support requirement.

Reviewer

The problem here is "cost": the cost to one OTLP backend might be significant. It can decide whether a business deal is made in a quarter or lost to a competitor that doesn't comply with OTel at all. This puts vendors that are all in on OTel at risk, as their upstream signals are being dictated by parties that have no actual financial overhead or incentives to move the standards forward.


This is a scenario that exists today in open source observability. As
`schema_url` becomes more widely adopted, we expect backends to support
automatic or semi-automatic (e.g., agent-aided) transitions and translations to handle this problem.

Reviewer

How can we say that backends will use these agent-aided transformations? That seems like something that gives certain vendors, such as companies with privileged foundation model access, a competitive advantage. This seems unfair.

Contributor Author

Good point, and not the real intention here.

The agent-aided part meant that we may want to start leveraging agents within OpenTelemetry to define transitions at definition time, and publicizing these transitions for everyone to consume without agents. We cannot have an agent in the loop on hot paths; that's intractable. Leveraging an agent to help improve the coverage of how many version transitions we can safely provide is what this was intended to mean.

The true goal is that we have an open model and easy-to-use transition "definition" that folks can consume to translate between versions of schema, or even (if we're successful) between different schema-urls entirely.

That's still experimental / long-term exploration for the project. For now, we have targeted version-bump transitions as something for which we believe we can provide an automatic transition model that should NOT be costly or unfair for the ecosystem to adopt.

### Automatic Schema Transformation

Tooling (like `weaver`) could automatically generate the necessary OTLP transformations when a user moves from a Platform Release `2026.1` to `2026.2`. We could also
leverage `weaver`'s MCP server to automatically generate OTLP transformation and configuration today as tooling to help with major version bumps needed in OpenTelemetry.

Reviewer

I'm not familiar with the weaver MCP server. How does this work?

Contributor Author

@jsuereth commented on Mar 6, 2026

Today it's pretty light. You run `weaver mcp` and it will load in a registry and allow the agent to interact with a version of semantics and look up the model, definitions, notes, etc. It can also run live-check to enforce conformance on the version. We hope to expand this further to be more of an aid when defining conventions and allow more ad-hoc agent-initiated flows, e.g. "Help me create a semantic convention registry from this integration test".
