OTEP: Federated SemConv and Schema v2 #4815
lmolkova wants to merge 17 commits into open-telemetry:main
jsuereth left a comment
Great draft!
Added my initial thoughts. Let me know if you need help writing any section.
| ### Dependency resolution mechanism
|
| The number of direct dependencies is initially limited to one. Conflicts are not allowed
| (Weaver resolution fails). This may change in the future.
This will change in the future. We just haven't had time to sort out "import" conflict resolution yet.
| - TODO: stable and not stable publishing.
|
| [Decentralized conventions example](https://github.com/open-telemetry/opentelemetry-weaver-examples/pull/33).
| TODO: do we need collector prototype for schema transformation?
@lmolkova WDYT of just having a jinja-template for taking a schema and creating transform processor configuration with OTTL for all the deprecated blocks? I believe that should be sufficient to prove having code do the same thing is viable
challenge accepted: open-telemetry/opentelemetry-weaver-examples#36
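A rough sketch of the idea in this thread, using plain Python string formatting in place of a Jinja template so it runs without extra dependencies. The input shape (attributes carrying a `deprecated` block with `reason: renamed` and `renamed_to`) is an assumption about what the schema exposes, and the helper names are hypothetical. The emitted statements use the OTTL `set` and `delete_key` functions under a transform processor `trace_statements` section.

```python
# Sketch: turn "renamed" deprecations into transform processor OTTL statements.
# The attribute dictionaries below are a hypothetical, simplified schema shape.

def ottl_statements(attributes):
    """Return OTTL statements that move each renamed attribute to its new key."""
    statements = []
    for attr in attributes:
        dep = attr.get("deprecated") or {}
        if dep.get("reason") != "renamed":
            continue  # only plain renames are handled in this sketch
        old, new = attr["key"], dep["renamed_to"]
        statements.append(
            f'set(attributes["{new}"], attributes["{old}"]) where attributes["{old}"] != nil'
        )
        statements.append(f'delete_key(attributes, "{old}")')
    return statements


def transform_processor_config(attributes):
    """Render a minimal transform processor config (span context only)."""
    lines = [
        "processors:",
        "  transform:",
        "    trace_statements:",
        "      - context: span",
        "        statements:",
    ]
    lines += [f"          - {s}" for s in ottl_statements(attributes)]
    return "\n".join(lines)


example_attributes = [
    {"key": "http.method", "deprecated": {"reason": "renamed", "renamed_to": "http.request.method"}},
    {"key": "http.request.method"},
]

print(transform_processor_config(example_attributes))
```

Deprecations other than simple renames would need their own statement patterns; this only shows that the rename case is mechanical.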
tigrannajaryan left a comment
Thanks for working on this @lmolkova !
| # future
| # diff_url: ...
| # all_in_one_url: https://github.com/open-telemetry/semantic-conventions/archive/refs/tags/v1.39.0-dev.tar.gz
| ```
file_format 1.x had a versions section. Do we remove it in 2.0.0?
Yes, added a "Difference with `file_format: 1.1.0`" section in 8e672e1.
TL;DR: we should be able to list versions with https://otel.io/schemas/semconv and explore each of them if needed. Upgrades don't need to download all versions.
| all attributes along with signal definitions and refinements. It is optimized for
| distribution and in-memory representation.
|
| *Resolved* schema for this metric looks like:
Do we think yaml is the right format for resolved schemas? Is it true that resolved schema's purpose is primarily machine consumption? Do we want to consider a binary format (e.g. ProtoBuf) for it, so that it is more efficient?
Maybe in the future. It should be easy to support JSON or Protobuf via the Accept header; that could be added incrementally without changing the data model.
So far, the schema URL has been HTTP only, and it has not been a problem. HTTP is much easier to work with and host, especially since it is a static set of files that can be placed on a CDN or stored in blob storage.
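To illustrate the Accept header idea from the reply above, here is a minimal sketch of picking a representation for a resolved schema. The media types, file names, and the fallback-to-YAML behavior are all assumptions made for the example; nothing in the proposal fixes them.

```python
# Sketch: choose a resolved-schema representation from an HTTP Accept header.
# The media types and file names are assumptions, not part of the proposal.

SUPPORTED = {
    "application/yaml": "resolved.yaml",
    "application/json": "resolved.json",
    "application/x-protobuf": "resolved.binpb",
}
DEFAULT_TYPE = "application/yaml"


def pick_representation(accept_header):
    """Return (media_type, file_name) for the best supported match."""
    candidates = []
    for position, part in enumerate(accept_header.split(",")):
        fields = [f.strip() for f in part.split(";")]
        media_type, quality = fields[0].lower(), 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if media_type in ("*/*", ""):
            media_type = DEFAULT_TYPE
        if media_type in SUPPORTED and quality > 0:
            # Higher q wins; earlier position breaks ties.
            candidates.append((-quality, position, media_type))
    if not candidates:
        # This sketch falls back to YAML instead of answering 406.
        return DEFAULT_TYPE, SUPPORTED[DEFAULT_TYPE]
    _, _, best = min(candidates)
    return best, SUPPORTED[best]
```

Because each representation maps to a static file, this kind of selection stays compatible with hosting on a CDN or blob storage.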
| ### Dependency resolution mechanism
|
| The number of direct dependencies is initially limited to one. This will change in the future.
Do we think multiple dependencies are necessary in practice? They open up the dependency conflict possibilities, as pointed out below. If we don't have a clear demand, should we leave it out for now?
I think it's inevitable. Imagine this:
- You have a Java application.
- The Java-instrumentation repo publishes conventions for Java-specific things. It depends on OTel semconv.
- You use a 3rd party library with native instrumentation that publishes its own conventions. It depends on OTel semconv.
- You have some custom telemetry that depends on OTel semconv, and you document it.
- Now you want to document what your application emits: you take a dependency on OTel semconv, Java-instr, and the 3rd party lib conventions.
- Boom
Given that we really want to federate conventions, and that we want both collector and java-instr to define their own very soon, we have to provide a multi-dependency story. It's maybe not P0, but P1.
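The scenario above can be sketched as a small dependency graph in which the same registry is reached at two different versions. All registry names and version numbers here are made up for illustration, and this is not how Weaver resolves dependencies; it only shows the kind of conflict a multi-dependency story has to detect.

```python
# Sketch: detect version conflicts across transitive convention dependencies.
# Registry names and versions are made up for illustration.

from collections import defaultdict

# (name, version) -> list of direct (name, version) dependencies
REGISTRIES = {
    ("otel-semconv", "1.38.0"): [],
    ("otel-semconv", "1.39.0"): [],
    ("java-instrumentation", "2.0.0"): [("otel-semconv", "1.39.0")],
    ("third-party-lib", "0.4.0"): [("otel-semconv", "1.38.0")],
    ("my-app", "1.0.0"): [
        ("otel-semconv", "1.39.0"),
        ("java-instrumentation", "2.0.0"),
        ("third-party-lib", "0.4.0"),
    ],
}


def find_conflicts(root):
    """Return {name: sorted versions} for every dependency pulled in at 2+ versions."""
    versions = defaultdict(set)
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        name, version = node
        versions[name].add(version)
        stack.extend(REGISTRIES.get(node, []))
    return {n: sorted(v) for n, v in versions.items() if len(v) > 1}


print(find_conflicts(("my-app", "1.0.0")))
# prints {'otel-semconv': ['1.38.0', '1.39.0']}
```

With a single direct dependency this situation cannot arise, which is why the one-dependency limit sidesteps conflict resolution for now.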
| ```
|
| The schema version 1.N includes information about how to upgrade from v1.N-M to v1.N.
| This approach is limited to one major version and covers upgrades only.
My very early thinking on schemas was that we want to support upgrades and downgrades. Do you think downgrades are unnecessary?
No, I think both are necessary. They could be supported in the future; I added a point in the Schema transformations evolution section.
I think we need to describe transformations separately from definitions so that we can describe v1 <-> v2, ecs <-> semconv, prometheus <-> otel, etc.
This migration option is a replacement for the current mechanism, but nothing stops us from evolving it further.
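As a minimal sketch of what "transformations described separately from definitions" could mean, the rule set below is a standalone rename map that is applied in one direction and inverted for the other. The rule format and function names are hypothetical, and only pure renames are covered; anything lossy would not round-trip this way.

```python
# Sketch: a rename-only transformation described separately from definitions,
# applied in either direction. The rule format is hypothetical.

RENAMES_V1_TO_V2 = {
    "http.method": "http.request.method",
    "http.status_code": "http.response.status_code",
}


def invert(renames):
    """Build the reverse mapping; fail if two old keys collapse into one new key."""
    reverse = {}
    for old, new in renames.items():
        if new in reverse:
            raise ValueError(f"{new!r} has two sources; downgrade is ambiguous")
        reverse[new] = old
    return reverse


def apply(renames, attributes):
    """Return a copy of attributes with keys renamed; unknown keys pass through."""
    return {renames.get(key, key): value for key, value in attributes.items()}


v1 = {"http.method": "GET", "http.status_code": 200, "custom.key": "x"}
v2 = apply(RENAMES_V1_TO_V2, v1)
assert apply(invert(RENAMES_V1_TO_V2), v2) == v1  # round trip for pure renames
```

The same shape could describe a mapping between two different convention sets, since the rules do not reference either side's definitions.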
| This may involve two-step transformation when a version range is covered by both the old and the new
| schema files.
|
| #### Migration option 1: upgrades based on resolved schema only
Do we need to choose between the 3 options in our design, or are all 3 options supported and chosen by the end user?
Added clarification in 8e672e1.
TL;DR:
Option 1 is an almost identical replacement for the current scope of schema transformations. It's by no means feature complete; migration option 3 is for us (OTel) to evolve it in the future.
Option 2 is for consumers (probably vendors) who might have implemented something like schema transformations and might want to keep using diffs.
Let me know if I can make it clearer.
Proposes a new telemetry schema format so OTel Collector, instrumentation libs, and third parties can publish their own conventions with dependencies on OTel semconv.
What's new:
- 1.39.0 (stable only) vs 1.39.0-dev (everything)
- opentelemetry.io/schemas/{component}/{version} for per-repo schemas
Breaking:
- 1.1.0 (the diff format)