Skip to content

Should file configuration merge environment variable configuration? #3752

Closed
@jack-berg

Description

The conversation about whether file configuration should completely ignore the sdk environment variable scheme came up in #3744, but that PR doesn't actually contain any language related to this.

The original file configuration OTEP stated:

Interpret the configuration model and return SDK TracerProvider, MeterProvider, LoggerProvider which strictly reflect the configuration object's details and ignores the opentelemetry environment variable configuration scheme.

As mentioned here, file configuration doesn't actually contain language describing this behavior. It was included originally included in #3437 but was lost in the PR review shuffle - accidentally, not in response to feedback.

@tedsuo argues in favor file configuration respecting env vars with:

The common expectation among developers is that env vars will automatically overwrite config parameters. If we do it the other way, I am concerned that until the end of time we will have a steady stream of users lodging issues about this and then becoming very frustrated when we explain that they need to modify the config file to use an env var.

The reason that these users will be upset is that they are in a situation where having to define the env vars in the config file is a non-starter. Their use case is that for some reason they either can't get at or are not allowed to modify the config file template, but they really need to override a parameter.

For reference, this is an issue that has come up on many OSS projects I have been involved with where it is common to have both operators and application developers wanting to configure the same thing. Often it's some kind of emergency situation where the person with the rights to change the config file is unavailable.

I should note that of course the opposite situation could be true, where for some reason you want to disable an env var but don't have access to it. But it's probably easier to give users the ability to disable an env var via the config file than it is to give them the ability to disable a config parameter via an env var.

@MrAlias argues in favor of ignoring env vars with:

To be fair, from experience, you're going to get people complaining either way. If you choose environment variable priority over a configuration users will complain that their deployment was altered and failed when an environment variable was set that took precedence.

However, if you make environment variables take precedence you will also need to make some pretty sever and subjective choices on how they are mapped to a config. Do the BSP environment variables apply to all batch span processors or just one? Is the sampler environment variable use for all tracer providers in the config, even ones that specify alternate samplers? Should propagators be merged or overridden? If an exporter is defined by environment, does that stop the console logging exporter used for debugging as well as all the other exporters defined in configuration?

Ultimately, I think the current changes are going to be the most appropriate. They allow users to make their own choice in precedence without making subjective choices for them on how to map things. If a user want environment variables to take precedence, all they need to do is use the OTel environment keys in the related parts of their configuration. In doing so they will answer each of the above questions their own way.

@trask supports the feeling of users expecting env vars to override file configuration, but also says merging configuration from multiple sources is hard:

This is my feeling as well, especially for things like:

OTEL_SDK_DISABLED
OTEL_RESOURCE_ATTRIBUTES
OTEL_SERVICE_NAME
OTEL_LOG_LEVEL
OTEL_PROPAGATORS
OTEL_TRACES_SAMPLER_ARG
I totally understand the nightmare that is merging configuration from multiple sources though.

I wonder if we would have created many of the other env vars (e.g. OTEL_BSP_, OTEL_BLRP_) if we had configuration file support from the beginning? And if so, maybe we can deprecate those other env vars in favor of configuration file?

This topic came up several times during the lengthly review of the file configuration OTEP. Below are links to a number of and relevant points:

open-telemetry/oteps#225 (comment)

Layering of config as described below would make it more difficult to reason about what my program is actually being run with.

Additionally, perhaps give them a helper config file that uses env var substitution in the right places so that they can migrate easily (and still get a warning until they get rid of env variables and move everything to the config file).

open-telemetry/oteps#225 (comment)

What about 'Solution 3: fail when both environment configuration and file configuration are present'?

I think we could log a warning when we detect this, but failing is too strong. Consider the implications if a user is operating in an environment they don't fully control (i.e. where an ops team configures environment variables by default which they extend / layer on top of).

open-telemetry/oteps#225 (comment)

I have a (rather strong) opinion that setting set via env vars has higher priority (takes precedence) over a setting set via configiuration file and it should be marked as a goal.

Any scheme where environment variables have priority over a config file will require some sort of standard mapping between the environment variable schema and file config scheme. IMO, its impossible to define such a mapping which is intuitive in all cases, so better not to try.

Nothing is forcing users to use file based config - its opt in. If they do opt in, they're opting into the documented behavior in which the config file represents the source of truth for configuration. If they wish to customize the experience with additional layers / overrides, they have a couple of tools:

  1. They can use the fact the Configure(config) API accepts a config model as an argument, and provide their own customizations to the model after the initial parsing of the file via Parse(file). An example of such a customization would be to interpret environment variables and apply them to the model in a way they decide makes sense.
  2. They can use environment variable substitution to reference environment variables directly in a configuration file.

Update 3/15/2024

The current state of this issue is:

  • The configuration working group proposed and executed a process to move forward after several months of failing to reach consensus
  • As a part of that process, the configuration maintainers reviewed proposals and have recommended moving forward with option 2.b as described in this comment
  • As a part of that process, people who disagree with the recommendation can / should escalate to a TC decision

Update 3/28/2024

Please see this comment updating the status of the issue:

Per @tedsuo’s request, we discussed this issue in the 3/24/27 TC meeting and have made a decision:
Generally, we will follow @trask's comment, proceeding with this PR with a few changes:

  • Rename OTEL_CONFIG_FILE to OTEL_EXPERIMENTAL_CONFIG_FILE, reflecting the fact that the semantics around how the value of env var are subject to breaking changes as the file configuration spec and schema continue to evolve.
  • Ensure that env vars which don’t interop with file config are deprecated when file config is ready for stabilization, reflecting that we do not want to recommend multiple competing configuration stories. This could be ensured via an explicit note in the markdown, or a blocking issue - both achieve the same effect. Deprecate environment variables which do not interoperate with file config when ready for stabilization #3967
  • Ensure that file config has an interop where platforms (i.e. Azure functions, otel operator, etc) contribute to config. We should proceed with #3948 without being prescriptive about how that mechanism works. In the TC meeting, 4 distinct solutions were discussed which had different tradeoffs and limitations. It is clear that we still need to learn more about the requirements and constraints of this use case and let the findings inform the solution. The config working group should prioritize this discussion, but an answer shouldn’t block this PR. We should open a new issue to track the requirements and discuss solutions, and ensure that we treat that issue as blocking for any sort of stabilization effort (although it should ideally be solved much sooner). Ensure that file config has solution for platforms which contribute to config #3966

Activity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

Labels

[label deprecated] triaged-accepted[label deprecated] Issue triaged and accepted by OTel community, can proceed with creating a PRspec:miscellaneousFor issues that don't match any other spec label

Type

No type

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions