[DRAFT] "Stack" Tracing #355

Open
@jswrenn

What problem are you trying to solve?

What is my program busy doing right now? For synchronous applications, this question can be answered by capturing stack traces from the running program. For asynchronous applications, this approach is unsatisfactory, since tasks that are idle — waiting to be rescheduled — will not appear.

How should the problem be solved?

Tokio Console could answer this question by displaying, in the task detail view, the causal tree of Spans that stems from each task. The precise user interface would draw inspiration from existing tooling for displaying process trees and span trees. At minimum, the UI should display an htop-like tree view of Span names, targets, and locations. My primary goal with this issue is to settle on an implementation strategy.

Presently, the tracing family of crates does not provide any mechanism for querying the consequences of a span. I've implemented this functionality in the tracing-causality library, which provides a tracing Layer that extends each Span in a Registry with a Set encoding that span's direct and indirect consequences. This Layer can then be queried by Span ID to get the full causal tree of consequences that stemmed from that span, plus a stream supplying subsequent, incremental updates to that causal tree.

I propose beginning by modifying the console-api crate to define a Consequences RPC service that permits clients to:

  • query the server for the consequences of a span, represented as an adjacency list of span IDs
  • receive incremental updates on subsequent changes to those consequences
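To make the shape of this service concrete, here is a minimal sketch of the data it might exchange: an adjacency list keyed by span ID, plus an update type applied incrementally. All names here (`SpanId`, `Consequences`, `Update`) are hypothetical and are not taken from console-api or tracing-causality; span IDs are modeled as plain `u64` values.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a span ID (an assumption, not the `tracing` type).
pub type SpanId = u64;

/// Consequences as an adjacency list: each span ID maps to the IDs of its
/// direct consequences.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Consequences {
    pub direct: BTreeMap<SpanId, BTreeSet<SpanId>>,
}

/// Hypothetical incremental update streamed after the initial response.
#[derive(Debug, Clone, PartialEq)]
pub enum Update {
    /// `consequence` was created as a direct consequence of `cause`.
    Added { cause: SpanId, consequence: SpanId },
    /// `span` closed and should be dropped from the tree.
    Removed { span: SpanId },
}

impl Consequences {
    /// Apply one incremental update to the local copy of the tree.
    pub fn apply(&mut self, update: Update) {
        match update {
            Update::Added { cause, consequence } => {
                self.direct.entry(cause).or_default().insert(consequence);
                // Make sure the new span has its own (initially empty) entry.
                self.direct.entry(consequence).or_default();
            }
            Update::Removed { span } => {
                // Only the span itself is removed; this sketch assumes its
                // own consequences were removed by earlier updates.
                self.direct.remove(&span);
                for children in self.direct.values_mut() {
                    children.remove(&span);
                }
            }
        }
    }

    /// All direct and indirect consequences of `root`, in depth-first order.
    pub fn all_consequences(&self, root: SpanId) -> Vec<SpanId> {
        let mut out = Vec::new();
        let mut stack = self.children_rev(root);
        while let Some(id) = stack.pop() {
            out.push(id);
            stack.extend(self.children_rev(id));
        }
        out
    }

    /// Direct consequences of `id`, reversed so that popping from a stack
    /// visits them in ascending ID order.
    fn children_rev(&self, id: SpanId) -> Vec<SpanId> {
        self.direct
            .get(&id)
            .map(|c| c.iter().rev().copied().collect())
            .unwrap_or_default()
    }
}
```

A client would build a `Consequences` value from the initial query response and then call `apply` for each update received on the stream.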

Then, in console-subscriber:

  • install the tracing-causality layer
  • implement this service, using that layer.

Then, console:

  • upon entering the task detail view, request and display the causal tree of that task
  • listen for subsequent updates to the causal tree, and apply them
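As a rough illustration of the display step, the sketch below renders an adjacency list of span IDs as an htop-like tree of text lines. The function names and the `label` callback are hypothetical; the callback stands in for whatever eventually supplies span names, targets, and locations.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a span ID (an assumption, not the `tracing` type).
pub type SpanId = u64;

/// Render the tree rooted at `root` as one line of text per span.
/// `label` turns a span ID into the text shown for that span.
pub fn render_tree(
    root: SpanId,
    children: &BTreeMap<SpanId, Vec<SpanId>>,
    label: &dyn Fn(SpanId) -> String,
) -> Vec<String> {
    let mut lines = vec![label(root)];
    render_children(root, children, label, "", &mut lines);
    lines
}

fn render_children(
    parent: SpanId,
    children: &BTreeMap<SpanId, Vec<SpanId>>,
    label: &dyn Fn(SpanId) -> String,
    prefix: &str,
    lines: &mut Vec<String>,
) {
    let kids = children.get(&parent).map(Vec::as_slice).unwrap_or(&[]);
    for (i, &kid) in kids.iter().enumerate() {
        let last = i + 1 == kids.len();
        // The last child gets a corner; earlier children get a tee.
        let branch = if last { "└─ " } else { "├─ " };
        lines.push(format!("{prefix}{branch}{}", label(kid)));
        // Descendants of a non-last child keep a vertical guide line.
        let next = format!("{prefix}{}", if last { "   " } else { "│  " });
        render_children(kid, children, label, &next, lines);
    }
}
```

Re-rendering after each update is the simplest way to keep the view current; a real UI would likely diff or cache instead.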

Unresolved questions:

  • The UI should display span names, targets, and locations, but the Consequences RPC service only provides Span IDs. Where should this additional metadata come from?

Any alternatives you've considered?

In the above proposal, console-subscriber does the work of keeping just enough metadata around that causality trees can be requested on demand for any Span. This responsibility could instead fall on console itself.

In this alternative approach, console-api would provide a service that streams most of the tracing Layer's on_* events to clients. (Some of this functionality is already described in console-api, but not yet implemented in console-subscriber.) The console application would then listen to this stream and maintain just enough data to reconstruct causality trees on demand.

We should seriously consider this option, as it:

  1. solves the aforementioned unresolved question of tracking span names, targets and locations
  2. would lay the groundwork for additional functionality, like:
    • distinguishing between idle/busy spans in the causal tree
    • showing the stream of events occurring within a task
  3. lays the groundwork for general-purpose remote subscribers
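To illustrate this alternative, here is a minimal sketch of the client-side bookkeeping, assuming a hypothetical `LayerEvent` wire type that mirrors a subset of the Layer callbacks (`on_new_span`, `on_enter`, `on_exit`, `on_close`). None of these types exist in console-api today. Busy state is simplified to a single flag per span.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a span ID (an assumption, not the `tracing` type).
pub type SpanId = u64;

/// Hypothetical streamed events, one variant per mirrored Layer callback.
#[derive(Debug, Clone)]
pub enum LayerEvent {
    NewSpan {
        id: SpanId,
        parent: Option<SpanId>,
        name: String,
        target: String,
        location: String,
    },
    Enter { id: SpanId },
    Exit { id: SpanId },
    Close { id: SpanId },
}

/// What the client keeps per live span.
#[derive(Debug, Clone, PartialEq)]
pub struct SpanInfo {
    pub parent: Option<SpanId>,
    pub name: String,
    pub target: String,
    pub location: String,
    /// True between an Enter and the matching Exit (simplified).
    pub busy: bool,
}

/// Client-side store, updated from the event stream.
#[derive(Debug, Default)]
pub struct SpanStore {
    spans: BTreeMap<SpanId, SpanInfo>,
}

impl SpanStore {
    /// Fold one streamed event into the store.
    pub fn record(&mut self, event: LayerEvent) {
        match event {
            LayerEvent::NewSpan { id, parent, name, target, location } => {
                let info = SpanInfo { parent, name, target, location, busy: false };
                self.spans.insert(id, info);
            }
            LayerEvent::Enter { id } => self.set_busy(id, true),
            LayerEvent::Exit { id } => self.set_busy(id, false),
            LayerEvent::Close { id } => {
                self.spans.remove(&id);
            }
        }
    }

    fn set_busy(&mut self, id: SpanId, busy: bool) {
        if let Some(info) = self.spans.get_mut(&id) {
            info.busy = busy;
        }
    }

    /// Direct consequences of `id`, reconstructed from parent links.
    pub fn children_of(&self, id: SpanId) -> Vec<SpanId> {
        self.spans
            .iter()
            .filter(|(_, info)| info.parent == Some(id))
            .map(|(&child, _)| child)
            .collect()
    }

    /// Whether `id` is currently entered, or `None` if it is not tracked.
    pub fn is_busy(&self, id: SpanId) -> Option<bool> {
        self.spans.get(&id).map(|info| info.busy)
    }
}
```

Because each `NewSpan` event carries the name, target, and location, this store can also supply the span metadata for display, and the `busy` flag covers the idle/busy distinction.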

How would users interact with this feature?

This feature shall be implemented as a component within the task detail view. The precise user interface would draw inspiration from existing tooling for displaying process trees and span trees. At minimum, the UI should display an htop-like tree view of Span names, targets, and locations.

Would you like to work on this feature?

yes

Metadata

Labels

S-feature (Severity: feature. This is adding a new feature.)
