Skip to content

Console doesn't detect self wakes from tokio::task::yield_now() after Tokio 1.23 #512

Open
@hds

Description

@hds

(edit(2024-01-24): Updated title and description to clarfiy that it is only self wakes coming from tokio::task::yield_now() that are affected)

What is the issue?

When observing an instrumented program built with Tokio 1.23 or later, Tokio Console isn't able to detect self wakes coming from tokio::task::yield_now().

This is due to tokio-rs/tokio#5223, where the self wake was changed to be deferred so that self waking tasks wouldn't starve the resource drivers.

How can the bug be reproduced?

The following code should show a task with a single self wake:

    _ = tokio::spawn(async {
        tokio::task::yield_now().await;
    })
    .await;

If build with Tokio 1.22.0 then we see the wake counted as a self wake in Tokio Console. We can confirm this using ari-subscriber where we get the output (truncated):

2024-01-23T16:24:05.167259Z TRACE runtime.spawn[1]{kind=task, task.name=, task.id=18, loc.file="src/main.rs", loc.line=15, loc.col=9} new
2024-01-23T16:24:05.167786Z TRACE runtime.spawn[1]{kind=task, task.name=, task.id=18, loc.file="src/main.rs", loc.line=15, loc.col=9} enter
2024-01-23T16:24:05.167911Z TRACE runtime.spawn[1]{kind=task, task.name=, task.id=18, loc.file="src/main.rs", loc.line=15, loc.col=9} tokio::task::waker: op="waker.wake_by_ref", task.id=1
2024-01-23T16:24:05.168007Z TRACE runtime.spawn[1]{kind=task, task.name=, task.id=18, loc.file="src/main.rs", loc.line=15, loc.col=9} exit

Specifically there is a wake operation (op="waker.wake_by_ref") inside the runtime.spawn span.

When updating to Tokio 1.23.0, Tokio Console no longer counts the single wake as a self wake. Again, we can confirm using ari-subscriber (truncated output):

2024-01-23T16:33:26.834206Z TRACE runtime.spawn[1]{kind=task, task.name=, task.id=18, loc.file="src/main.rs", loc.line=15, loc.col=9} new
2024-01-23T16:33:26.834805Z TRACE runtime.spawn[1]{kind=task, task.name=, task.id=18, loc.file="src/main.rs", loc.line=15, loc.col=9} enter
2024-01-23T16:33:26.834966Z TRACE runtime.spawn[1]{kind=task, task.name=, task.id=18, loc.file="src/main.rs", loc.line=15, loc.col=9} tokio::task::waker: op="waker.clone", task.id=1
2024-01-23T16:33:26.835048Z TRACE runtime.spawn[1]{kind=task, task.name=, task.id=18, loc.file="src/main.rs", loc.line=15, loc.col=9} exit
2024-01-23T16:33:26.835145Z TRACE tokio::task::waker: op="waker.wake", task.id=1

Here we see that the waker is cloned (op="waker.clone") within the runtime.spanw span for the task, but the wake itself happens by value after the span has exited (op="waker.wake").

Note: The task.id on the tokio::task::waker traces corresponds to the tracing span Id, which is shown for traces in brackets after the name, e.g. runtime.spawn[1].

Logs, error output, etc

No response

Versions

Tokio 1.23

Possible solution

We may need to modify the instrumentation in Tokio itself to be able to detect a self wake. Since Tokio knows that it is performing a self wake (the deferred waking code is only used from tokio::task::yield_now()), it may be possible to explicitly include in the instrumentation when a wake is a self wake.

It's not necessarily straight forward though, this code passes through the std library Waker object.

Additional context

No response

Would you like to work on fixing this bug?

yes

Metadata

Metadata

Assignees

No one assigned

    Labels

    S-bugSeverity: bug

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions