-
Notifications
You must be signed in to change notification settings - Fork 1.9k
[chore][RFC] - Configuration Merging revamped #13256
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
0508028
16c9389
a2a094e
b7f4070
3480040
4c7c865
b7f38ad
463756a
ae66fb5
74862cb
b3546e4
dc77505
c9caa0c
61af2b2
a227116
e0dd933
21caaf8
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||
|---|---|---|---|---|
| @@ -0,0 +1,274 @@ | ||||
| # Configuration merging | ||||
|
|
||||
| ## Background | ||||
|
|
||||
| As part of issue [#8754](https://github.com/open-telemetry/opentelemetry-collector/issues/8754), a new feature gate has been introduced to support merging component lists instead of replacing them ([first PR](https://github.com/open-telemetry/opentelemetry-collector/pull/12097)). This enhancement enables configurations from multiple sources to be combined, preserving all defined components in the final configuration. | ||||
|
|
||||
| More information about this feature can be found in the [confmap README](https://github.com/open-telemetry/opentelemetry-collector/blob/d4539dd6b4e554e15066226fa975b156af7b1510/confmap/README.md#experimental-append-merging-strategy-for-lists). | ||||
|
|
||||
| The main motivation for this change was to allow users to define configuration fragments in different sources, and have them merged in such a way that all specified components are included under `service::pipeline` in the final configuration. | ||||
|
|
||||
| Previously, we relied on Koanf’s default merging strategy, which overrides static values and slices (such as strings, numbers, and lists). This behavior often resulted in configurations being unintentionally overwritten when merged from multiple sources. | ||||
|
|
||||
| This issue has been highlighted in several discussions and feature requests: | ||||
| - https://github.com/open-telemetry/opentelemetry-collector/issues/8394 | ||||
| - https://github.com/open-telemetry/opentelemetry-collector/issues/8754 | ||||
| - https://github.com/open-telemetry/opentelemetry-collector/issues/10370 | ||||
|
|
||||
| ## Motivation and Scope | ||||
|
|
||||
| We’ve already implemented a feature gate and foundational logic that supports merging lists across configuration files. Currently, this logic is hardcoded to merge lists only for specific keys: receivers, exporters, and extensions. The relevant implementation can be found [here](https://github.com/open-telemetry/opentelemetry-collector/blob/main/confmap/internal/merge.go). | ||||
|
|
||||
| The current implementation lacks flexibility. Ideally, users should be able to specify which configuration paths should be merged, rather than relying on hardcoded defaults. | ||||
| This RFC proposes extending the existing functionality by introducing a user-configurable mechanism to define merge behavior. | ||||
|
|
||||
| This RFC builds on top of feedback gathered from the [original PR](https://github.com/open-telemetry/opentelemetry-collector/pull/12097). | ||||
| More specifically, this RFC aims to: | ||||
| 1. Add an option to specify which configuration paths should be merged. | ||||
| 2. Introduce support for prepend and append operations when merging list values: | ||||
| - This is good to have for lists that rely on certain ordering, such as: | ||||
| - `processors` | ||||
| - `transformprocessor` statements | ||||
|
|
||||
| ## Proposed approaches: | ||||
|
|
||||
| ### Approach 1 (Recommended): Use yaml tags | ||||
|
|
||||
| The first approach relies on the concept of [yaml tags](https://tutorialreference.com/yaml/yaml-tags). We can specify a custom tag in our configuration files to indicate the lists we want to merge. | ||||
|
|
||||
| Consider following two configurations: | ||||
|
|
||||
| ```yaml | ||||
| #main.yaml | ||||
| receivers: | ||||
| ... | ||||
| exporters: | ||||
| ... | ||||
| extensions: | ||||
| extension1: | ||||
|
|
||||
| service: | ||||
| extensions: [extension1] | ||||
| pipelines: | ||||
| logs: | ||||
| receivers: [...] | ||||
| exporters: [...] | ||||
| ``` | ||||
|
|
||||
| ```yaml | ||||
| #extra.yaml | ||||
| extensions: | ||||
| extension2: | ||||
|
|
||||
| service: | ||||
| extensions: !mode=append [extension2] | ||||
VihasMakwana marked this conversation as resolved.
Show resolved
Hide resolved
|
||||
| ``` | ||||
|
|
||||
| After running the collector with above configurations, the `service::extensions` path will get merged and final configuration will look like: | ||||
|
|
||||
| ```yaml | ||||
| #final.yaml | ||||
| receivers: | ||||
| ... | ||||
| exporters: | ||||
| ... | ||||
| extensions: | ||||
| extension1: | ||||
| extension2: | ||||
|
|
||||
| service: | ||||
| extensions: [extension1, extension2] | ||||
| pipelines: | ||||
| logs: | ||||
| receivers: [...] | ||||
| exporters: [...] | ||||
| ``` | ||||
|
|
||||
| This approach completely relies on the configuration files and doesn't introduce any command line option. | ||||
| Internally, we will loop through the yaml tree and fetch the paths we want to merged based on user-defined tags, then merge the lists found at those paths. | ||||
|
|
||||
| Our "custom" yaml tag will be in the format of URI query parameters. For starters, we can support following options: | ||||
| 1. `mode`: | ||||
| - This setting will control the ordering of merged list. | ||||
| - The value will be either of `append` or `prepend`. | ||||
| - Default: `append` | ||||
| 2. `duplicates`: | ||||
| - This setting controls the duplication of elements in the final list. | ||||
| - If the user wants to allow duplicates, they can simply turn this flag on. | ||||
| - Default: `false` | ||||
| 3. `recursive`: | ||||
| - This setting controls the merging of child elements of the given yaml path. | ||||
| - This is useful if user wants to merge all the lists under a give sub-tree. | ||||
| - For eg. merging all the lists under the `service` section. | ||||
|
|
||||
| #### Examples of first approach | ||||
|
|
||||
| 1. _Merge the `service::extensions` list_: | ||||
|
|
||||
| ```yaml | ||||
| #main.yaml | ||||
| receivers: | ||||
| ... | ||||
| exporters: | ||||
| ... | ||||
| extensions: | ||||
| extension1: | ||||
|
|
||||
| service: | ||||
| extensions: [extension1] | ||||
| pipelines: | ||||
| logs: | ||||
| receivers: [...] | ||||
| exporters: [...] | ||||
|
|
||||
| --- | ||||
|
|
||||
| #extra.yaml | ||||
| extensions: | ||||
| extension2: | ||||
|
|
||||
| service: | ||||
| extensions: !mode=append [extension2] | ||||
| ``` | ||||
|
|
||||
| ```yaml | ||||
| #final.yaml | ||||
| receivers: | ||||
| ... | ||||
| exporters: | ||||
| ... | ||||
| extensions: | ||||
| extension1: | ||||
| extension2: | ||||
|
|
||||
| service: | ||||
| extensions: [extension1, extension2] | ||||
| pipelines: | ||||
| logs: | ||||
| receivers: [...] | ||||
| exporters: [...] | ||||
| ``` | ||||
|
|
||||
| 2. _Merge all the lists under the `service::*` section_: | ||||
|
|
||||
|
|
||||
| ```yaml | ||||
| #main.yaml | ||||
| receivers: | ||||
| ... | ||||
| exporters: | ||||
| ... | ||||
| extensions: | ||||
| extension1: | ||||
|
|
||||
| service: | ||||
| extensions: [extension1] | ||||
| pipelines: | ||||
| logs: | ||||
| receivers: [...] | ||||
| exporters: [...] | ||||
|
|
||||
| --- | ||||
|
|
||||
| #extra.yaml | ||||
| extensions: | ||||
| extension2: | ||||
| receiver: | ||||
| receiver2: | ||||
|
|
||||
| service: !mode=append&recursive=true | ||||
| extensions: [extension2] | ||||
| pipelines: | ||||
| logs: | ||||
| receivers: [receiver2] | ||||
| ``` | ||||
|
|
||||
| ```yaml | ||||
| #final.yaml | ||||
| receivers: | ||||
| ... | ||||
| receiver2: | ||||
| exporters: | ||||
| ... | ||||
| extensions: | ||||
| extension1: | ||||
| extension2: | ||||
|
|
||||
| service: | ||||
| extensions: [extension1, extension2] | ||||
| pipelines: | ||||
| logs: | ||||
| receivers: [..., receiver2] | ||||
| exporters: [...] | ||||
| ``` | ||||
|
|
||||
| ### Approach 2: URI params | ||||
|
|
||||
| The proposed approach will rely on concept of URI query parameters([_RFC 3986_](https://datatracker.ietf.org/doc/html/rfc3986#page-23)). Our configuration URIs already adhere to this syntax and we can extend it to support query params instead adding new CLI flags. | ||||
|
|
||||
| For now, the new merging strategy is only enabled under `confmap.enableMergeAppendOption` gate. If user specifies the options and tries to run the collector without gate, we will merge as per default behaviour. | ||||
|
|
||||
| We will support new parameters to config URIs as follows: | ||||
| 1. `merge_paths`: A comma-separated list of glob patterns which will be used while config merging | ||||
| - This setting will control the paths user wants to merge from the given config. | ||||
| - Example: | ||||
| - `otelcol --config main.yaml --config extra.yaml?merge_paths=service::extensions,service::**::receivers` | ||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think using the query params is fine. However, we need to keep in mind that some confmap providers use the same pattern to pass extra information. With this approach, they won't be able to use these keys which is probably fine. If anyone else have any other ideas, please share. cc @open-telemetry/collector-approvers
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah, we can extract the parameters here opentelemetry-collector/confmap/resolver.go Line 176 in 6aa2b81
Then we can remove our parameters from URI and let resolver do its job: u, _ := url.Parse("otel.yaml?key=val&merge_mode=append")
q := u.Query()
q.Del("merge_mode")
u.RawQuery = q.Encode()
// uri will now be otel.yaml?key=val&key2=val2
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We discussed this in this RFC and came up with a few issues with query parameters that we'll want to consider before going that route. @dmitryax I believe you, @mx-psi, and I had an offline conversation at KubeCon and tacitly agreed we should go the route of configuring these inside a config file (this option in the RFC). Do you still think we should go this route? I think we should probably at least explore it before deciding on an approach.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @evan-bradley @dmitryax approach 1 does not deal with params, does this resolve your concerns here?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wouldn't mind calling out the "Configuring Confmap Providers" RFC just to make it clear there are other options we can explore, but if approach 1 is our preference, I'm good with what's here.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, that's fair |
||||
| - In this example, we will merge the list of extensions and receivers from pipeline, excluding lists in the rest of the config. | ||||
| - `otelcol --config main.yaml --config ext.yaml?merge_paths=service::extensions --config rec.yaml?merge_paths=service::**::receivers` | ||||
| - In this example, we will merge all list of extensions from `ext.yml` and list of receivers from `rec.yaml`, excluding lists in the rest of the config. | ||||
| 2. `merge_mode`: One of `prepend` or `append`. | ||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We probably need more options. For example, if I want to add a receiver in a merging config while it may be already defined in the base config. Also, we need the default
Also, I would call the options
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @dmitryax I see. I'll update the RFC
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think you touch a good point there with the
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For service components, duplicates will cause error.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Right. For example, I have a list of receivers in a basic config file which I don't control. Then, I want to add another receiver and override it in case it's already defined in the basic config. In that case
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we can start with a small number of options, in any case we can discuss that in the implementation phase |
||||
| - This setting will control the ordering of merged list. | ||||
mx-psi marked this conversation as resolved.
Show resolved
Hide resolved
|
||||
|
|
||||
| #### Examples of second approach | ||||
|
|
||||
| Here are some examples: | ||||
|
|
||||
| 1. _Append to default mergeable components_: | ||||
| ```bash | ||||
| otelcol --config=main.yaml --config=extra_components.yaml?merge_mode=append --feature-gates=confmap.enableMergeAppendOption | ||||
| ``` | ||||
|
|
||||
| - After running above command, the final configuration will include: | ||||
| - Merged component(s) (`receivers`, `exporters` and `extensions`) from `extra_components.yaml` | ||||
|
|
||||
| 2. _Specify exact paths for merging_: | ||||
| ```bash | ||||
| otelcol \ | ||||
| --config=main.yaml \ | ||||
| --config=extra_extension.yaml?merge_mode=append&merge_paths=service::extensions \ | ||||
| --config=extra_receiver.yaml?merge_mode=append&merge_paths=service::**::receivers \ | ||||
| --feature-gates=confmap.enableMergeAppendOption | ||||
| ``` | ||||
|
|
||||
| - After running above command, the final configuration will include: | ||||
| - Merged extension(s) from `extra_extension.yaml` | ||||
| - Merged receiver(s) from `extra_receiver.yaml` | ||||
|
|
||||
|
|
||||
| 3. _Prepend processors_: | ||||
| ```bash | ||||
| otelcol --config=main.yaml --config=extra_processor.yaml?merge_mode=prepend&merge_paths=service::**::processors --feature-gates=confmap.enableMergeAppendOption | ||||
| ``` | ||||
|
|
||||
| - After running above command, the final configuration will include: | ||||
| - Merged processor(s) from `extra_processor.yaml`, but prepend the existing list. | ||||
|
|
||||
| 4. _Exclude a config file from lists merging process_: | ||||
| ```bash | ||||
| otelcol --config=main.yaml --config=extra_components.yaml?merge_mode=append --config override_components.yaml --feature-gates=confmap.enableMergeAppendOption | ||||
| ``` | ||||
|
|
||||
| - In the above command, we have no specified any options for `override_components.yaml`. Hence, it will override all the conflicting lists from previous configuration, which is the default behaviour. | ||||
|
|
||||
| ## Open questions | ||||
|
|
||||
| - What to do if an invalid option is provided for `merge_mode` or `merge_paths`? | ||||
evan-bradley marked this conversation as resolved.
Show resolved
Hide resolved
|
||||
| - I can think of two possibilities: | ||||
| 1. Error out. | ||||
| 2. Log an error and merge the default way | ||||
| - What to do if an invalid query param is provided in config URI? | ||||
| - In this case, I strongly feel that we should error out. | ||||
|
|
||||
| ## Extensibility | ||||
|
|
||||
| This URI-based approach is highly extensible. In the future, it can enable advanced operations such as map overriding. Currently, it's impossible to do so. | ||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I discussed this over Slack with Vihas but mentioning it here: it would be good to make sure that our officially released Helm charts and operator do not break these YAML tags (they shouldn't but I think it's a good check to do before deciding)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reached out on #otel-helm on CNCF slack for more eyes on this. I'll run some tests on this and keep you updated.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mx-psi Unfortunately, YAML tags are not preserved in helm chart. I think this is due to the fact that our helm chart deals with
yamltodictconversion internally and tags are lost in this process.The same applies to the operator. The operator marshals the config into a struct and internally, it is
map[string]interface{}. Hence, the tags are lost in the conversion.That being said, do users need this feature in helm/operator? Our helm chart internally deals with all the merging and appending to the lists, when necessary.
For operator, users deploy a single configuration like following:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^^ @mx-psi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From 2025-09-10 Collector SIG meeting it seems like this should not be a concern but let's wait a bit more for the Operator SIG to answe