Skip to content

feat!: Include CPU id in CPU profile samples#338

Merged
rcoh merged 3 commits into
mainfrom
cpu-id-in-samples
May 2, 2026
Merged

feat!: Include CPU id in CPU profile samples#338
rcoh merged 3 commits into
mainfrom
cpu-id-in-samples

Conversation

@rcoh
Copy link
Copy Markdown
Contributor

@rcoh rcoh commented May 1, 2026

Summary

Adds the CPU id that a CPU profile sample was collected on to CpuSampleEvent, so the viewer can show which physical CPU each sample came from.

Breaking changes

This is a breaking change to perf-self-profile which will be bumped to 0.4 in the next release

Changes

perf-self-profile (breaking):

  • Sample.cpu: u32Sample.cpu: Option<u32>. The perf backend already gets Option<u32> from the kernel via PERF_SAMPLE_CPU; previously we were unwrap_or(0)-ing away the missing case. The ctimer fallback now returns None when SYS_getcpu fails instead of silently reporting CPU 0 (ambiguous with real CPU 0).
  • Internal SampleData/DrainedSample/SlotWriter thread Option<u32> through the lock-free signal-handler-safe ring buffer.

dial9-tokio-telemetry:

  • CpuSampleData.cpu: Option<u32> in memory.
  • CpuSampleEvent.cpu: Option<u64> on the wire — widened to u64 so it encodes as OptionalVarint: 1 byte when absent, 2 bytes total for typical small CPU ids. Narrowed back to Option<u32> on decode.
  • TelemetryEvent::CpuSample gains cpu: Option<u32>. Older traces recorded before this field existed decode cleanly as cpu: None.
  • RawCpuSample.cpu threaded through CpuProfiler::drain and SchedProfiler::drain.

JS parser: CpuSample objects now expose cpu: number | null.

Tests

  • perf-self-profile/tests/multithread.rs: asserts every perf sample carries Some(cpu) (Linux-only).
  • dial9-tokio-telemetry/src/telemetry/buffer.rs: two round-trip tests for CpuSample — one with cpu: Some(7), one with cpu: None — encode through ThreadLocalBuffer and decode via decode_events.

All 422 tests pass locally, 2 stress iterations clean. cargo fmt --check and cargo clippy --all-targets --all-features clean.

Drive-by

Narrow cfg(test) on SchedStat.fd to cfg(all(test, target_os = "linux")) to match its only reader, fixing a pre-existing dead_code warning on non-Linux builds.

Follow-ups

  • Demo trace (dial9-viewer/ui/demo-trace.bin) not yet regenerated — will do in a follow-up commit on this branch before merge.
  • A separate PR will capture CPU id on WorkerUnpark so we can see scheduler migrations between park and unpark.

rcoh added 2 commits May 1, 2026 11:47
perf-self-profile (breaking change):
- Sample.cpu is now Option<u32> rather than u32. Perf sampling always
  returns Some(cpu) via PERF_SAMPLE_CPU; the ctimer fallback returns
  None when SYS_getcpu fails instead of silently reporting cpu 0.
- SampleData/DrainedSample/SlotWriter::write in the lock-free ring
  buffer thread Option<u32> through the signal-handler-safe path.

dial9-tokio-telemetry:
- CpuSampleData.cpu: Option<u32> (in-memory).
- CpuSampleEvent.cpu: Option<u64> on the wire. Widened to u64 so the
  field encodes as OptionalVarint: 1 byte when absent, 2 bytes total
  for typical small CPU ids. Narrowed back to Option<u32> on decode.
- TelemetryEvent::CpuSample gains cpu: Option<u32>. Older traces
  without the field decode as None (forward-compatible).
- RawCpuSample.cpu threaded through CpuProfiler::drain and
  SchedProfiler::drain.

JS parser: CpuSample objects now expose cpu: number|null.

Tests:
- perf-self-profile: multithread.rs asserts every perf sample carries
  Some(cpu) (Linux-only).
- dial9-tokio-telemetry: two round-trip tests in buffer.rs exercising
  encode+decode with cpu=Some(7) and cpu=None through ThreadLocalBuffer.

Drive-by: narrow cfg(test) on SchedStat.fd to cfg(all(test, target_os
= "linux")) to match its only reader, fixing a dead_code warning on
non-Linux builds.

Demo trace not yet regenerated (pending Linux environment).
Verified 7411 / 7411 CpuSample events carry a cpu id (10 distinct CPU
ids observed: 0-9). Generated via scripts/regenerate_demo_trace_docker.sh
on the aarch64 docker builder.
@rcoh rcoh changed the title feat: Include CPU id in CPU profile samples feat!: Include CPU id in CPU profile samples May 1, 2026
@rcoh rcoh requested a review from jlizen May 1, 2026 21:40
Copy link
Copy Markdown
Member

@jlizen jlizen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ to the sentinel -> Option, good breaking change

pub callchain: InternedStackFrames,
/// CPU the sample was taken on, if the backend could determine it.
///
/// Widened to `u64` on the wire so the field encodes as `OptionalVarint`:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good approach

.expect("stack pool entry must exist for CpuSample callchain")
.to_vec(),
// CPU id is varint-encoded as u64 on the wire; real CPU ids fit in u32.
cpu: e.cpu.map(|v| v as u32),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: better to be clearer about bad wire with u32::try_from()::ok

@rcoh rcoh enabled auto-merge May 2, 2026 00:42
@rcoh rcoh added this pull request to the merge queue May 2, 2026
Merged via the queue into main with commit b956998 May 2, 2026
18 of 20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants