@pavolloffay pavolloffay commented Dec 15, 2025

Description

This is an alternative to PR #14261.

metadata.yaml:

schema:
  enabled: true
  # optional
  # config_type: "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/datadog/config.Config"

Notable implementation details

format: duration
  • JSON Schema duration: expects an ISO 8601 string (e.g., "PT5M").
  • Go time.Duration (default): marshals to JSON as an integer number of nanoseconds (e.g., 300000000000), so it is not compatible with the JSON Schema duration format.
  • To satisfy "type": "string", "format": "duration", you would need a custom wrapper around time.Duration that implements the json.Marshaler interface and converts the value to ISO 8601 format.

Solution

{
  "type": "string",
  "pattern": "^([+-]?(\\d+(\\.\\d*)?|\\.\\d+)(ns|us|µs|ms|s|m|h))+$",
  "example": "1h30m10s",
  "description": "A duration string (e.g., '10s', '1.5h'). Valid units: ns, us, ms, s, m, h."
}
mapstructure:",squash"
  • The generator detects squash in the tag and treats such fields the same way it treats embedded structs: it merges their properties into the parent schema instead of creating a nested object.
Config object uses non-standard name

If the config object uses a non-standard name (e.g. MyCfg) or comes from a different package, it can be specified in metadata.yaml via config_type:

schema:
  enabled: true
  config_type: "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/datadog/config.Config"
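A config_type value has the form of a package import path followed by a type name. The sketch below shows one way such a string could be split; splitConfigType is a hypothetical helper, and the PR description does not show how mdatagen actually parses the value.

```go
package main

import (
	"fmt"
	"strings"
)

// splitConfigType splits "<package path>.<type name>" at the last dot.
// Hypothetical helper for illustration, not taken from the PR.
func splitConfigType(s string) (pkgPath, typeName string, err error) {
	i := strings.LastIndex(s, ".")
	if i <= 0 || i == len(s)-1 || strings.Contains(s[i+1:], "/") {
		return "", "", fmt.Errorf("config_type %q is not of the form <package path>.<type name>", s)
	}
	return s[:i], s[i+1:], nil
}

func main() {
	pkg, typ, err := splitConfigType("github.com/open-telemetry/opentelemetry-collector-contrib/pkg/datadog/config.Config")
	fmt.Println(pkg, typ, err)
}
```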

Example: https://github.com/open-telemetry/opentelemetry-collector-contrib/compare/main...pavolloffay:schema-example?expand=1

additionalProperties

additionalProperties is set to false, which forbids properties that are not declared in the config struct. This aligns with the collector's behavior of rejecting unknown configuration keys.

Open questions
How to handle required fields?

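Validation is currently handled in Go code via the Validator interface, e.g. func (cfg *Config) Validate() error, so required fields are not declared anywhere a schema generator can read them directly.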
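A minimal sketch of why this is an open question: the requirement below exists only as imperative code inside Validate, not as struct metadata. The Config type and its Endpoint field are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical component configuration.
type Config struct {
	Endpoint string `mapstructure:"endpoint"`
}

// Validate reports an error when the endpoint is missing. Nothing in the
// struct definition itself marks Endpoint as required.
func (cfg *Config) Validate() error {
	if cfg.Endpoint == "" {
		return errors.New("endpoint must be set")
	}
	return nil
}

func main() {
	fmt.Println((&Config{}).Validate())
	fmt.Println((&Config{Endpoint: "localhost:4317"}).Validate())
}
```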

How to handle default fields?

The factory.CreateDefaultConfig() creates a config with default fields.

There are 2 possible approaches:

  1. Define the default in a struct tag, e.g. jsonschema:"default=localhost:4317".
  2. Have mdatagen generate a small temporary "extractor" program that calls factory.CreateDefaultConfig() and stores the output, which the mdatagen schema generation then uses.

Link to tracking issue

Fixes #9769
Fixes open-telemetry/opentelemetry-collector-contrib#42214
Implements: #13784
Updates: open-telemetry/opentelemetry-collector-contrib#24189

Testing

Documentation

@codecov

codecov bot commented Dec 15, 2025

Codecov Report

❌ Patch coverage is 68.07388% with 121 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.56%. Comparing base (5a6fc8d) to head (93d9b96).

Files with missing lines | Patch % | Lines
cmd/mdatagen/internal/schemagen/analyzer.go | 66.82% | 49 Missing and 22 partials ⚠️
cmd/mdatagen/internal/schemagen/generator.go | 80.14% | 15 Missing and 12 partials ⚠️
cmd/mdatagen/internal/command.go | 20.83% | 18 Missing and 1 partial ⚠️
cmd/mdatagen/internal/samplereceiver/config.go | 0.00% | 4 Missing ⚠️

❌ Your patch check has failed because the patch coverage (68.07%) is below the target coverage (95.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #14288      +/-   ##
==========================================
- Coverage   91.80%   91.56%   -0.24%     
==========================================
  Files         676      679       +3     
  Lines       42415    42793     +378     
==========================================
+ Hits        38937    39182     +245     
- Misses       2420     2514      +94     
- Partials     1058     1097      +39     


Member Author


This is a good example for looking at the generated schema.

@pavolloffay pavolloffay force-pushed the ocb-component-schema-alternative branch from c7b9168 to a88d7b8 Compare December 16, 2025 15:03
@codspeed-hq

codspeed-hq bot commented Dec 17, 2025

CodSpeed Performance Report

Merging this PR will not alter performance

Comparing pavolloffay:ocb-component-schema-alternative (93d9b96) with main (5a6fc8d)

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

✅ 61 untouched benchmarks
⏩ 20 skipped benchmarks [1]

Footnotes

  1. 20 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@niwoerner
Member

I really like that this approach attempts to capture the required fields. This would be useful for validation. Unfortunately, I fear that the detection would produce a considerable number of false positives, because the omitempty tag probably can't be taken as a reliable source for that information: it is not set consistently across the components.

@jkoronaAtCisco
Contributor

It seems that we are working on solving the same problem, but using slightly different methods.

I have started working on a script that generates config schemas based on Go structs using AST parsing. This is part of a larger plan to introduce schemas to the OpenTelemetry collector. You can find more details in my PRs:

I think it makes sense to start a discussion on the preferred solution.

@github-actions
Contributor

github-actions bot commented Jan 7, 2026

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jan 7, 2026
Member

@mx-psi mx-psi left a comment


I like this approach better and I think this can be merged roughly as is

@github-actions github-actions bot removed the Stale label Jan 8, 2026
@pavolloffay pavolloffay force-pushed the ocb-component-schema-alternative branch 3 times, most recently from a166f90 to f190d98 Compare January 9, 2026 15:57
@dmitryax
Member

dmitryax commented Jan 9, 2026

Given that we’ve gone through many iterations, and that a similar solution has recently been merged (the schemagen tool), I’d like to propose a call to discuss the end result we want to have, align on a single approach, and avoid stepping on each other’s toes.

There are a few remaining points we need to clarify:

  1. Do we need to involve mdatagen?
  2. What should be the source of truth for the generated Go code, schema definitions, and documentation?
  3. Should we use references?

@pavolloffay @mx-psi @evan-bradley @jkoronaAtCisco how does it sound?

@pavolloffay pavolloffay force-pushed the ocb-component-schema-alternative branch from 3ba482b to 05fa4e0 Compare January 12, 2026 12:53
Signed-off-by: Pavol Loffay <[email protected]>
dmitryax added a commit to dmitryax/opentelemetry-collector that referenced this pull request Jan 15, 2026
Replaces open-telemetry#13784

This RFC proposes a roadmap for introducing configuration schemas to OpenTelemetry Collector components. It establishes a schema-first approach in which Go structs, JSON schemas, and documentation are all generated from a single YAML source of truth.

This RFC is the result of discussions among contributors involved in this effort:
- @atoulme
- @evan-bradley
- @jkoronaAtCisco
- @mx-psi
- @pavolloffay

Related Issues / PRs:
- open-telemetry/opentelemetry-collector-contrib#42214
- open-telemetry#9769
- open-telemetry#14288
- open-telemetry/opentelemetry-collector-contrib#27003
@dmitryax
Member

dmitryax commented Jan 15, 2026

We had a call to discuss the overall goal and agreed on an approach described in this RFC.

github-merge-queue bot pushed a commit that referenced this pull request Jan 24, 2026
…4433)

Replaces #13784

This RFC proposes a roadmap for introducing configuration schemas to
OpenTelemetry Collector components. It establishes a schema-first
approach in which Go structs, JSON schemas, and documentation are all
generated from a single YAML source of truth.

This RFC is the result of discussions among contributors involved in
this effort:
- @atoulme
- @evan-bradley
- @iblancasa
- @jkoronaAtCisco
- @mx-psi
- @pavolloffay

Related Issues / PRs:
- open-telemetry/opentelemetry-collector-contrib#42214
- #9769
- #14288
- open-telemetry/opentelemetry-collector-contrib#27003
Development

Successfully merging this pull request may close these issues.

  • Expose component configuration with a JSON schema
  • Improve otel collector configuration w/ JSON schema

6 participants