OTEP: Federated SemConv and Schema v2#4815

Open
lmolkova wants to merge 17 commits into open-telemetry:main from lmolkova:semconv-schema-v2

Member

@lmolkova lmolkova commented Jan 2, 2026

Proposes a new telemetry schema format so OTel Collector, instrumentation libs, and third parties can publish their own conventions with dependencies on OTel semconv.

What's new:

  • Schema URLs now return a manifest with metadata + link to resolved schema (single file with everything baked in)
  • Support for arbitrary registries that can depend on OTel conventions
  • Stable vs dev builds: 1.39.0 (stable only) vs 1.39.0-dev (everything)
  • New URL pattern: opentelemetry.io/schemas/{component}/{version} for per-repo schemas

Breaking:

  • Stop publishing schema file format 1.1.0 (the diff format)
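
To make the URL pattern and the stable/dev split above concrete, here is a minimal Python sketch. The helper names (`schema_url`, `is_dev_build`) and the `collector` component name are hypothetical, the `https://` scheme is an assumption, and the sketch assumes the `-dev` suffix is the only marker that separates the two builds, as in the `1.39.0` vs `1.39.0-dev` example.

```python
# Assumption: schemas are served over HTTPS from this base path.
BASE = "https://opentelemetry.io/schemas"


def schema_url(component: str, version: str) -> str:
    """Build a per-component schema URL following the {component}/{version} pattern."""
    return f"{BASE}/{component}/{version}"


def is_dev_build(version: str) -> bool:
    """Treat a version ending in '-dev' as the build that includes non-stable conventions."""
    return version.endswith("-dev")


# Hypothetical component name, used only for illustration.
stable = schema_url("collector", "1.39.0")
dev = schema_url("collector", "1.39.0-dev")
```

Keeping the stability marker in the version string means a consumer can tell which build it received from the URL alone, without fetching the manifest first.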

@lmolkova lmolkova mentioned this pull request Jan 2, 2026
3 tasks
Contributor

@jsuereth jsuereth left a comment

Great draft!

Added my initial thoughts. Let me know if you need help writing any section.

@lmolkova lmolkova changed the title SemConv schema v2: initial draft [WIP] SemConv schema v2 Jan 6, 2026
### Dependency resolution mechanism

The number of direct dependencies is initially limited to one. Conflicts are not allowed
(Weaver resolution fails). This may change in the future.
Contributor

This will change in the future - just haven't had time to sort out "import" conflict resolution yet...

- TODO: stable and not stable publishing.

[Decentralized conventions example](https://github.com/open-telemetry/opentelemetry-weaver-examples/pull/33).
TODO: do we need collector prototype for schema transformation?
Contributor

@lmolkova WDYT of just having a Jinja template that takes a schema and generates transform processor configuration with OTTL for all the deprecated blocks? I believe that should be sufficient to prove that having code do the same thing is viable.
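
A rough sketch of that idea, written in plain Python rather than Jinja so it is self-contained. The input shape is an assumption: each attribute is a dict with a `key` and an optional `deprecated` block carrying `reason: renamed` and `renamed_to`; the actual resolved schema may use different field names. The attribute names are hypothetical. The generated statements use the OTTL `set` and `delete_key` functions.

```python
def ottl_rename_statements(attributes):
    """Return OTTL statements that copy each renamed attribute to its new key and drop the old one.

    Assumed input: a list of dicts such as
    {"key": "example.old", "deprecated": {"reason": "renamed", "renamed_to": "example.new"}}.
    Attributes without a "renamed" deprecation are skipped.
    """
    statements = []
    for attr in attributes:
        deprecated = attr.get("deprecated") or {}
        if deprecated.get("reason") != "renamed":
            continue
        old, new = attr["key"], deprecated["renamed_to"]
        # Copy the value only when the old attribute is actually present.
        statements.append(
            f'set(attributes["{new}"], attributes["{old}"]) where attributes["{old}"] != nil'
        )
        statements.append(f'delete_key(attributes, "{old}")')
    return statements


# Hypothetical sample input.
sample = [
    {"key": "example.old", "deprecated": {"reason": "renamed", "renamed_to": "example.new"}},
    {"key": "example.unchanged"},
]
for line in ottl_rename_statements(sample):
    print(line)
```

The resulting lines could be pasted under the statements of a transform processor config; whether that covers every kind of deprecation in the schema is an open question this sketch does not answer.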

Member Author

@lmolkova lmolkova changed the title [WIP] SemConv schema v2 SemConv Schema v2 Feb 10, 2026
@lmolkova lmolkova marked this pull request as ready for review February 10, 2026 03:52
@lmolkova lmolkova requested review from a team as code owners February 10, 2026 03:52
@lmolkova lmolkova changed the title SemConv Schema v2 OTEP: Federated SemConv and Schema v2 Feb 10, 2026
Member

@tigrannajaryan tigrannajaryan left a comment

Thanks for working on this, @lmolkova!

# future
# diff_url: ...
# all_in_one_url: https://github.com/open-telemetry/semantic-conventions/archive/refs/tags/v1.39.0-dev.tar.gz
```
Member

file_format 1.x had a versions section. Do we remove it in 2.0.0?

Member Author

@lmolkova lmolkova Feb 15, 2026

Yes, added a "Difference with `file_format: 1.1.0`" section in 8e672e1.

TL;DR: we should be able to list versions with https://otel.io/schemas/semconv and explore each of them if needed. Upgrades don't need to download all versions.

all attributes along with signal definitions and refinements. It is optimized for
distribution and in-memory representation.

*Resolved* schema for this metric looks like:
Member

Do we think yaml is the right format for resolved schemas? Is it true that resolved schema's purpose is primarily machine consumption? Do we want to consider a binary format (e.g. ProtoBuf) for it, so that it is more efficient?

Member Author

@lmolkova lmolkova Feb 15, 2026

Maybe in the future. It should be easy to support JSON or protobuf via the Accept header; that could be added incrementally without changing the data model.

So far, the schema URL has been HTTP only, and it has not been a problem. HTTP is much easier to work with and host, especially since it is a static set of files that can be placed on a CDN or stored in blob storage.


### Dependency resolution mechanism

The number of direct dependencies is initially limited to one. This will change in the future.
Member

Do we think multiple dependencies are necessary in practice? They open up the possibility of dependency conflicts, as pointed out below. If we don't have clear demand, should we leave it out for now?

Member Author

I think it's inevitable. Imagine this:

  1. You have a Java application
  2. The Java-instrumentation repo publishes conventions for Java-specific things. It depends on OTel semconv
  3. You use a 3rd party library with native instrumentation that publishes its own conventions. It depends on OTel semconv
  4. You have some custom telemetry that depends on OTel semconv and you document it
  5. Now you want to document what your application emits: you take a dependency on OTel semconv, Java-instr, and the 3rd party lib conventions.
  6. Boom

Given that we really want to federate conventions, and that we want both collector and java-instr to define their own very soon, we have to provide a multi-dependency story. It's maybe not P0, but P1.

```
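
The multi-dependency scenario from the reply above (an application depending on OTel semconv, Java instrumentation conventions, and a third-party library's conventions) can be sketched as a small graph walk that reports registries pulled in at more than one version. This is only an illustration of where the "Boom" comes from, not how Weaver resolves dependencies; the registry names, versions, and the `find_version_conflicts` helper are all hypothetical.

```python
from collections import defaultdict


def find_version_conflicts(dependencies, root):
    """Walk the dependency graph from root and return registries seen at multiple versions.

    dependencies maps (name, version) -> list of (name, version) direct dependencies.
    """
    versions_seen = defaultdict(set)
    visited = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        name, version = node
        versions_seen[name].add(version)
        stack.extend(dependencies.get(node, []))
    return {name: sorted(vs) for name, vs in versions_seen.items() if len(vs) > 1}


# Hypothetical registries: each one depends on a different OTel semconv version.
EXAMPLE = {
    ("my-app", "1.0.0"): [
        ("otel-semconv", "1.39.0"),
        ("java-instr", "2.0.0"),
        ("third-party-lib", "0.3.0"),
    ],
    ("java-instr", "2.0.0"): [("otel-semconv", "1.38.0")],
    ("third-party-lib", "0.3.0"): [("otel-semconv", "1.37.0")],
}
conflicts = find_version_conflicts(EXAMPLE, ("my-app", "1.0.0"))
```

With a single allowed dependency this situation cannot arise, which is why conflict resolution only becomes a design question once multiple dependencies are supported.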

The schema version 1.N includes information about how to upgrade from v1.N-M to v1.N.
This approach is limited to one major version and covers upgrades only.
Member

My very early thinking on schemas was that we want to support both upgrades and downgrades. Do you think downgrades are unnecessary?

Member Author

No, I think both are necessary; they could be supported in the future. I added a point in the Schema transformations evolution section.

I think we need to describe transformations separately from definitions so that we can describe
v1 <-> v2, ecs <-> semconv, prometheus <-> otel, etc.

This migration option is a replacement for the current mechanism, but nothing stops us from evolving it further.
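
As a hedged illustration of why keeping transformations separate from definitions helps with both directions: if a transformation is just data (here, a rename map), the downgrade can be derived by inverting it. The sketch covers one-to-one renames only, is not the transformation format proposed in the OTEP, and uses hypothetical attribute names.

```python
def apply_renames(attributes: dict, renames: dict) -> dict:
    """Return a copy of attributes with keys renamed according to renames (old -> new)."""
    return {renames.get(key, key): value for key, value in attributes.items()}


def invert(renames: dict) -> dict:
    """Derive the downgrade map (new -> old); only valid when the renames are one-to-one."""
    inverse = {new: old for old, new in renames.items()}
    if len(inverse) != len(renames):
        raise ValueError("renames are not one-to-one and cannot be inverted")
    return inverse


# Hypothetical rename between two schema versions.
upgrade = {"example.old": "example.new"}
upgraded = apply_renames({"example.old": 1, "other": 2}, upgrade)
downgraded = apply_renames(upgraded, invert(upgrade))
```

Renames that merge two attributes into one are not invertible this way, which is one reason downgrades would likely need their own explicit description rather than being inferred.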

This may involve two-step transformation when a version range is covered by both the old and the new
schema files.

#### Migration option 1: upgrades based on resolved schema only
Member

Do we need to choose between the 3 options in our design, or are all 3 options supported so that the end user can choose?

Member Author

Added clarification in 8e672e1.

TL;DR:
Option 1 is an almost identical replacement for the current scope of schema transformations. It's by no means feature complete; migration option 3 is for us (OTel) to evolve it in the future.
Option 2 is for consumers (probably vendors) who might have implemented something like schema transformations and might want to keep using diffs.

Let me know if I can make it clearer.


5 participants