Skip to content
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 6 additions & 4 deletions dial9-tokio-telemetry/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,8 @@ Without this flag, compilation will fail with errors about missing methods on `t

There are two ways to set up dial9: the `#[main]` macro (recommended for most apps) or manual `TracedRuntime` setup. The macro handles the boilerplate of building the runtime and spawning your code as an instrumented task. Inside your `main` body, call `TelemetryHandle::current()` to get the handle for wake-event tracking. Use manual setup when you have multiple Tokio runtimes, don't own `main` (e.g., library code or embedded services), or need to integrate with existing runtime-building code.

If you are integrating dial9 into a production service, see the [dial9-in-production example](https://github.com/dial9-rs/dial9-tokio-telemetry/blob/main/dial9-tokio-telemetry/examples/production_use.rs).

### Using the `#[main]` macro

> **Note:** `#[dial9_tokio_telemetry::main]` is a **replacement** for `#[tokio::main]`, not a complement — do not use both on the same function. The macro builds and configures the Tokio runtime internally.
Expand Down Expand Up @@ -113,14 +115,14 @@ Compared to [tokio-console](https://github.com/tokio-rs/console), which is desig

## What gets recorded automatically

`TracedRuntime` installs hooks on the Tokio runtime. The following events are recorded out of the box:
`TracedRuntime` installs hooks on the Tokio runtime. The following events are always recorded:

| Event | Fields |
| -------------------------------- | --------------------------------------------------------------------- |
| `PollStart` / `PollEnd` | timestamp, worker, task ID, spawn location, local queue depth |
| `WorkerPark` / `WorkerUnpark` | timestamp, worker, local queue depth, thread CPU time, schedstat wait |
| `QueueSample` | timestamp, global queue depth (sampled every 10 ms) |
| `TaskSpawn` / `SpawnLocationDef` | task→spawn-location mapping (when `task_tracking` is enabled) |
| `TaskSpawn` | task→spawn-location mapping (when `task_tracking` is enabled) |

## The root future is not instrumented

Expand Down Expand Up @@ -160,9 +162,9 @@ runtime.block_on(async {
});
```

## Wake event tracking
## Wake event tracking (opt-in)

To understand when Tokio itself is delaying your code (scheduler delay), you need to know when your future was _ready_ to run. Wake eventswhich task woke which other task — are _not_ captured automatically. Tokio's runtime hooks don't currently allow instrumenting wakes: capturing wakes requires wrapping the future. The simplest way is to use `handle.spawn` instead of `tokio::spawn`.
To understand when Tokio itself is delaying your code (scheduler delay), you need to know when your future was _ready_ to run. Wake events, which fire at the moment your future is ready to be polled, are _not_ captured automatically. Tokio's runtime hooks don't currently allow instrumenting wakes: capturing wakes requires wrapping the future. The simplest way is to use `handle.spawn` instead of `tokio::spawn`.

Use `handle.spawn()` instead of `tokio::spawn()`:

Expand Down
340 changes: 340 additions & 0 deletions dial9-tokio-telemetry/examples/production_use.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,340 @@
//! This example discusses things to consider when using and setting up dial9 in production
//!
//! It also provides an opinionated set of knobs and environment variables to configure your application.
//!
//! Many applications can run with dial9 enabled all the time. Some applications will have worse performance, especially applications that perform
//! a very small amount of "useful work" per poll. Applications that use an extremely large number of worker threads will experience higher memory usage.
//! For an average production service, Dial9 produces 50-100GB / day.
//!
//! Since dial9 is recording an event on each poll (~50ns), if your polls are very short then this fix cost of overhead will impact your application performance.
//!
//! ### Enabling and Disabling
//! - If you use `Dial9ConfigBuilder::disabled()` dial9 is fully unused in your application and you have a normal, unmodified Tokio runtime.
//! - You can also install dial9 but leave it disabled. This has a slightly higher amount of overhead. All methods that would write dial9 data do a relaxed read on an atomic. This _will_
//! install the runtime hooks but they will all be no-ops. You could then set up a background task that reads dynamic configuration to enable dial9 later. This is a much larger surface area of code
//! that is enabled, so it is higher risk.
//!
//! > Note! dial9 must be created _before_ your async runtime. dial9 relies on installing itself into the runtime telemetry hooks to produce Tokio events.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we mention that using #[dial9::main] gets you this for free?

//!
//! ### The overhead of running dial9
//!
//! 1. Dial9 allocates a 1MB buffer for each thread that record events. If you are recording events from a huge number of threads, this can bloat memory.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this risk getting stale if we state the specific size? Should we be more vague?

//! 2. When Tokio telemetry is enabled, 2 dial9 events will be emitted by every poll. If your poll times are extremely short and your application is CPU bound then this overhead can be significant.
//!
//! Dial9 has many possible components you can enable. The more components, the more data you will produce and the more overhead your application will have.
//!
//! #### CPU Profiling
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would have expected some mention of permissions here (the implications of the available paranoid settings needed to access these)

//! Dial9 has two types of CPU profiling available:
//! 1. CPU profiling (this is what you would normally consider CPU profiling): Dial9 is sampling stack traces that are running on the CPU.
//! 2. Schedule Profiling: dial9 subscribes to the sched-switch linux kernel event and can capture a stack trace when your application is descheduled by the kernel. In order
//! to subscribe to these events, dial9 must open one perf fd per worker thread.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatting seems off here

//!
//! #### Tokio Telemetry
//! The basic Tokio telemetry will install a few hooks:
//! 1. `on_before_poll` and `on_after_poll` callbacks: This have a nanosecond level of overhead on each poll, just from the dynamic dispatch.
//! 2. `on_worker_park/unpark`: When these events happen, dial9 reads a few kernel APIs (~1us) to help to understand how Tokio is interacting with the OS.
//!
//! When you use dial9's spawn method, your futures are wrapped in a future that tracks `wake` events. This is done by instrumenting the waker. This is optional but
//! allows you to understand scheduling delay of the tasks you are running.
//!
//! #### Tracing
//! Dial9 can capture Tracing spans via the `TracingLayer`. On the scale of tracing, this is fairly low overhead, however, if you have a large amount of deeply nested spans, this can produce a huge amount
//! of data. We recommend using a very fine-grained filter.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be worth including a sample filter (perhaps filtering aws sdk stuff), since we don't have the telemetry in the example?

//!
//! The rest of this example shows an opinionated way to wire dial9
//! so it can be enabled and tuned with environment variables. The
//! same binary can then run in dev, staging, and prod with different
//! tracing behavior, and tooling (CDK, Docker, k8s, etc.) can flip knobs
//! without a rebuild.
//!
//! # Environment variables
//!
//! | Name | Default | Meaning |
//! | --------------------------------- | ------------------------------- | ------------------------------------------------------------- |
//! | `DIAL9_ENABLED` | `false` | Master switch. `true`/`false` only (Rust `bool::from_str`). |
//! | `DIAL9_TRACE_DIR` | `/tmp/dial9-traces` | Directory to write rotated trace segments into. |
//! | `DIAL9_ROTATION_SECS` | `60` | Wall-clock rotation period in seconds. |
//! | `DIAL9_MAX_DISK_USAGE_MB` | `1024` | Upper bound on total on-disk bytes (old files evicted). |
//! | `DIAL9_S3_BUCKET` | unset / empty | When set, sealed segments are gzip-uploaded to this bucket. |
//! | `DIAL9_SERVICE_NAME` | binary name | Service name used in the S3 key layout (required with S3). |
//! | `DIAL9_CPU_PROFILE_ENABLED` | `true` on Linux, `false` else | Enables `perf_event_open`-based CPU sampling. |
//! | `DIAL9_CPU_SAMPLE_HZ` | `99` | Sampling frequency for CPU profiling. |
//! | `DIAL9_SCHEDULE_PROFILE_ENABLED` | `true` on Linux, `false` else | Enables per-worker scheduler event capture (context switches).|
//!
//! # Invalid configuration
//!
//! Parsing is performed once, eagerly, before the runtime is built. On
//! invalid input (unknown boolean, non-numeric duration, etc.) debug builds
//! **panic** so the mistake is loud during development, while release
//! builds log a warning and fall back to a plain Tokio runtime with
//! telemetry disabled — a bad trace config must never take down prod.
//!
//! # Running the example
//!
//! ```sh
//! # plain run: telemetry disabled
//! cargo run --example production_use
//!
//! # basic local tracing
//! DIAL9_ENABLED=true cargo run --example production_use
//!
//! # with CPU profiling + schedule events (Linux, requires feature flag)
//! DIAL9_ENABLED=true \
//! cargo run --features cpu-profiling --example production_use
//!
//! # with S3 upload (requires feature flag and AWS creds in env)
//! DIAL9_ENABLED=true \
//! DIAL9_S3_BUCKET=my-trace-bucket \
//! DIAL9_SERVICE_NAME=my-service \
//! cargo run --features worker-s3 --example production_use
//! ```

use std::time::Duration;

use dial9_tokio_telemetry::config::{Dial9Config, Dial9ConfigBuilder};
use dial9_tokio_telemetry::telemetry::TelemetryHandle;

/// Opinionated configuration for a production dial9 deployment.
#[derive(Debug)]
struct Dial9Opts {
enabled: bool,
trace_dir: String,
rotation: Duration,
max_disk_usage_bytes: u64,
s3_bucket: Option<String>,
service_name: Option<String>,
cpu_profile_enabled: bool,
cpu_sample_hz: u64,
schedule_profile_enabled: bool,
}

impl Default for Dial9Opts {
/// Safe, inert defaults. `enabled = false` so this is the release-mode
/// fallback when env parsing fails.
fn default() -> Self {
Self {
enabled: false,
trace_dir: "/tmp/dial9-traces".into(),
rotation: Duration::from_secs(60),
max_disk_usage_bytes: 1024 * 1024 * 1024,
s3_bucket: None,
service_name: None,
cpu_profile_enabled: false,
cpu_sample_hz: 99,
schedule_profile_enabled: false,
}
}
}

/// Reasons we might reject an environment configuration.
#[derive(Debug)]
enum Dial9EnvError {
Parse {
var: &'static str,
value: String,
reason: String,
},
Conflict(String),
}

impl std::fmt::Display for Dial9EnvError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Self::Parse { var, value, reason } => write!(f, "invalid {var}={value:?}: {reason}"),
Self::Conflict(msg) => f.write_str(msg),
}
}
}

impl std::error::Error for Dial9EnvError {}

impl Dial9Opts {
/// Parse environment variables into a [`Dial9Opts`].
///
/// Pure — no side effects, no panics, no logging.
fn from_env() -> Result<Self, Dial9EnvError> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sheer volume of parsing code makes this example more noisy than it needs to be. In practice I would probably use Clap with its envsupport for this... Pretty sure even the linux thing can be supported inline with default_value_t, else a simple enough option + unwrap_or.

WDYT about cutting over to stay focused on the business logic and opinions?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah that seems reasonable. Or we can just "leave it as an exercise to the reader"

I'm thinking this would actually work nicely as a _guide module instead of an example

let s3_bucket = non_empty_env("DIAL9_S3_BUCKET");
let service_name = non_empty_env("DIAL9_SERVICE_NAME");
if s3_bucket.is_some() && service_name.is_none() {
return Err(Dial9EnvError::Conflict(
"DIAL9_S3_BUCKET requires DIAL9_SERVICE_NAME to be set".into(),
));
}

// CPU profiling and scheduler events are Linux-only in the underlying
// crate — default to enabled there, disabled elsewhere, so an operator
// does not need a per-OS environment.
let linux = cfg!(target_os = "linux");

Ok(Self {
enabled: parse_env("DIAL9_ENABLED", false)?,
trace_dir: std::env::var("DIAL9_TRACE_DIR")
.unwrap_or_else(|_| "/tmp/dial9-traces".into()),
rotation: Duration::from_secs(parse_env("DIAL9_ROTATION_SECS", 60)?),
max_disk_usage_bytes: parse_env::<u64>("DIAL9_MAX_DISK_USAGE_MB", 1024)?
.saturating_mul(1024 * 1024),
s3_bucket,
service_name,
cpu_profile_enabled: parse_env("DIAL9_CPU_PROFILE_ENABLED", linux)?,
cpu_sample_hz: parse_env("DIAL9_CPU_SAMPLE_HZ", 99)?,
schedule_profile_enabled: parse_env("DIAL9_SCHEDULE_PROFILE_ENABLED", linux)?,
})
}

/// Parse the environment or fall back to a disabled configuration.
///
/// Debug builds panic on invalid input so mistakes are obvious; release
/// builds log a warning and fall back to disabled — a bad trace config
/// must never take down a running service.
fn from_env_or_fallback() -> Self {
match Self::from_env() {
Ok(opts) => opts,
Err(e) if cfg!(debug_assertions) => {
panic!("invalid dial9 env configuration: {e}")
}
Err(e) => {
eprintln!("warning: invalid dial9 env configuration: {e}; disabling telemetry");
Self::default()
}
}
}
}

/// `Some(value)` only when the variable is set *and* non-empty. `FOO=` in a
/// template file then behaves the same as `FOO` being absent.
fn non_empty_env(var: &str) -> Option<String> {
std::env::var(var).ok().filter(|v| !v.is_empty())
}

/// Parse an environment variable via [`FromStr`]. Missing or empty = default.
fn parse_env<T>(var: &'static str, default: T) -> Result<T, Dial9EnvError>
where
T: std::str::FromStr,
T::Err: std::fmt::Display,
{
match std::env::var(var) {
Err(_) => Ok(default),
Ok(raw) if raw.is_empty() => Ok(default),
Ok(raw) => raw.parse::<T>().map_err(|e| Dial9EnvError::Parse {
var,
value: raw,
reason: e.to_string(),
}),
}
}

/// Translate parsed options into a [`Dial9Config`] the `#[main]` macro can consume.
fn configure_dial9(opts: &Dial9Opts) -> Dial9Config {
if !opts.enabled {
// Plain tokio runtime, no telemetry threads or hooks.
return Dial9ConfigBuilder::disabled().build();
}

// Make sure the trace directory exists; the writer errors out otherwise.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed just because build() will panic on failure currently?

if let Err(e) = std::fs::create_dir_all(&opts.trace_dir) {
eprintln!("warning: could not create {}: {e}", opts.trace_dir);
}

let base_path = format!("{}/trace.bin", opts.trace_dir.trim_end_matches('/'));

let max_file_size = (opts.max_disk_usage_bytes / 4).max(16 * 1024 * 1024);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking forward to getting rid of this


let builder = Dial9ConfigBuilder::new(base_path, max_file_size, opts.max_disk_usage_bytes)
.rotation_period(opts.rotation)
.with_runtime(|r| r.with_task_tracking(true));

apply_s3(apply_cpu_profiling(builder, opts), opts).build()
}

/// CPU profiling is only compiled in with `--features cpu-profiling`. Without
/// it, the relevant env vars are ignored.
#[cfg(feature = "cpu-profiling")]
fn apply_cpu_profiling(builder: Dial9ConfigBuilder, opts: &Dial9Opts) -> Dial9ConfigBuilder {
use dial9_tokio_telemetry::telemetry::cpu_profile::{CpuProfilingConfig, SchedEventConfig};
let (cpu, sched, hz) = (
opts.cpu_profile_enabled,
opts.schedule_profile_enabled,
opts.cpu_sample_hz,
);
builder.with_runtime(move |mut r| {
if cpu {
r = r.with_cpu_profiling(CpuProfilingConfig::default().frequency_hz(hz));
}
if sched {
r = r.with_sched_events(SchedEventConfig::default());
}
r
})
}

#[cfg(not(feature = "cpu-profiling"))]
fn apply_cpu_profiling(builder: Dial9ConfigBuilder, opts: &Dial9Opts) -> Dial9ConfigBuilder {
if opts.cpu_profile_enabled || opts.schedule_profile_enabled {
eprintln!(
"warning: cpu/schedule profiling requested but --features cpu-profiling not enabled; ignoring"
);
}
builder
}

#[cfg(feature = "worker-s3")]
fn apply_s3(builder: Dial9ConfigBuilder, opts: &Dial9Opts) -> Dial9ConfigBuilder {
use dial9_tokio_telemetry::background_task::s3::S3Config;
let (Some(bucket), Some(service_name)) = (opts.s3_bucket.clone(), opts.service_name.clone())
else {
return builder;
};
let s3 = S3Config::builder()
.bucket(bucket)
.service_name(service_name)
.build();
builder.with_runtime(|r| r.with_s3_uploader(s3))
}

#[cfg(not(feature = "worker-s3"))]
fn apply_s3(builder: Dial9ConfigBuilder, opts: &Dial9Opts) -> Dial9ConfigBuilder {
if opts.s3_bucket.is_some() {
eprintln!(
"warning: DIAL9_S3_BUCKET set but --features worker-s3 not enabled; traces only on local disk"
);
}
builder
}

fn my_config() -> Dial9Config {
let opts = Dial9Opts::from_env_or_fallback();
eprintln!(
"dial9 telemetry: {}",
if opts.enabled {
format!("enabled, writing to {}", opts.trace_dir)
} else {
"disabled (set DIAL9_ENABLED=true to enable)".into()
}
);
configure_dial9(&opts)
}

async fn workload_task(id: usize) {
for i in 0..5 {
tokio::time::sleep(Duration::from_millis(5)).await;
if i == 0 && id.is_multiple_of(25) {
println!("task {id} working");
}
}
}

#[dial9_tokio_telemetry::main(config = my_config)]
async fn main() {
// `TelemetryHandle::try_current` returns `None` when telemetry is disabled,
// so the same code path works in both modes.
let tasks: Vec<_> = (0..100)
.map(|i| match TelemetryHandle::try_current() {
Some(handle) => handle.spawn(workload_task(i)),
None => tokio::spawn(workload_task(i)),
})
.collect();
for task in tasks {
let _ = task.await;
}
println!("workload finished");
}
2 changes: 1 addition & 1 deletion dial9-viewer/build.rs
Original file line number Diff line number Diff line change
Expand Up @@ -130,7 +130,7 @@ const SETUP_SECTIONS: &[&str] = &[
"Quick start",
"The root future is not instrumented",
"Tracing span events (opt-in)",
"Wake event tracking",
"Wake event tracking (opt-in)",
];

/// Extract sections from the crate README and write them as `setup.md` in the
Expand Down
Loading