Skip to content

feat: config from env#406

Open
noxware wants to merge 17 commits into
dial9-rs:mainfrom
noxware:302-config-from-env
Open

feat: config from env#406
noxware wants to merge 17 commits into
dial9-rs:mainfrom
noxware:302-config-from-env

Conversation

@noxware
Copy link
Copy Markdown
Contributor

@noxware noxware commented May 15, 2026

Description

This PR gathers inspiration from the proposal in its issue, from the production_use.rs example, from this real world usage and from the programmatic builder itself, to define a set of standard env vars and a function that reads them to launch dial9 with zero code.

Out of scope

  • DIAL9_ENABLED_FILE requires a special setup for the watcher.
  • Mixing config from env with programmatic builder settings.

Closes

#302

Copy link
Copy Markdown
Contributor

@rcoh rcoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the one piece I'd like to cover here from day-0 is a way to have profiling on but Tokio events off. (#351 )

the other features can probably be saved for later

@noxware noxware force-pushed the 302-config-from-env branch from 2d1436c to f5654a0 Compare May 16, 2026 01:30
const ENV_DIAL9_TASK_DUMP_ENABLED: &str = "DIAL9_TASK_DUMP_ENABLED";
const ENV_DIAL9_TASK_DUMP_IDLE_THRESHOLD_MS: &str = "DIAL9_TASK_DUMP_IDLE_THRESHOLD_MS";

const DEFAULT_ENABLED: bool = false;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: When configuring the library manually, from Rust code, we default to enabled, but, by looking at the issue and production_use.rs I decided to default to disabled when using from_env.

const DEFAULT_CPU_PROFILE_ENABLED: bool = cfg!(all(target_os = "linux", feature = "cpu-profiling"));
const DEFAULT_SCHEDULE_PROFILE_ENABLED: bool =
cfg!(all(target_os = "linux", feature = "cpu-profiling"));
const DEFAULT_TASK_DUMP_ENABLED: bool = false;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defaults to false as it seems to be expensive.

trace_dir: PathBuf,

// None means the underlying RotatingWriter builder owns the default.
rotation_period: Option<Duration>,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: The issue showcases 15 as the proposed default, but in my implementation, I'm simply letting the builder use its existing default (which seems to be 60).

In general, I tried to avoid having different defaults to the ones that are already in-place. The only exception was DIAL9_ENABLED as mentioned in other comment.

}

#[derive(Debug)]
struct ParsedEnvConfig {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Split "parsed" and "resolved" mainly to get better observability in tests and to reduce the responsibility of the parse_env_config function but I know it adds some duplication.

"unknown-service".to_string()
}

fn warn(message: fmt::Arguments<'_>) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I added this function to mimic the error logging logic from build_or_disabled but I would personally just do tracing::warn!(), wdyt?

/// `from_env()` behavior. The returned config is built with
/// [`Dial9ConfigBuilder::build_or_disabled`], so writer setup failures are
/// logged and downgraded to a plain Tokio runtime.
pub fn from_env() -> Self {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also have an additional try_from_env that returns a Result to handle in case something goes wrong.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to this, the issue currently says

When configuration is invalid, debug builds panic and release builds fallback to disabled

But I don't know if there is much value on having a different behavior in debug for this function, so I skipped that. Opinion?

@noxware noxware marked this pull request as ready for review May 16, 2026 02:53
@noxware noxware requested a review from rcoh May 16, 2026 02:53
Copy link
Copy Markdown
Contributor

@rcoh rcoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

look good! two small comments

Comment thread dial9-tokio-telemetry/src/config.rs Outdated
/// | --- | --- | --- |
/// | `DIAL9_S3_BUCKET` | unset | Upload sealed trace segments to this bucket. |
/// | `DIAL9_SERVICE_NAME` | binary name | Service name used in S3 keys and metadata. |
/// | `DIAL9_S3_PREFIX` | unset | Optional S3 object key prefix. |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets default this to dial9-traces

Copy link
Copy Markdown
Contributor Author

@noxware noxware May 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did the change. Please note the following implications:

  • The default is handled here, so it's different than the builder's one. I suppose that's good for backwards compatibility, so this is not important.
  • My implementation reports all blank strings as invalid (to avoid hard to debug issues caused by "invisible" strings in configuration), so, with this change, there is no way to have "no prefix" through env vars. This is probably important. WDYT?

Comment thread dial9-tokio-telemetry/src/config.rs Outdated
@noxware
Copy link
Copy Markdown
Contributor Author

noxware commented May 18, 2026

An unrelated CI job seems to be failing due to a missing README_TELEMETRY.md, but the important ones are green.

@noxware noxware requested a review from rcoh May 18, 2026 21:13
@noxware
Copy link
Copy Markdown
Contributor Author

noxware commented May 18, 2026

@rcoh I noticed #350, which seems to propose a better default for max_file_size than the one I took from examples/production_use.rs. I think we should match that proposal here before merge.

Then, when that issue gets addressed in the builder API, we can just change ResolvedEnvConfig to hold None for max_file_size, delegating the default value handling to the programmatic API, keeping a single default source of truth. WDYT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants