
Trim event duration based on tree hierarchy in CallGraph lib #274

Closed
hjli-creator wants to merge 1 commit into facebookresearch:main from hjli-creator:export-D78594843

Conversation

@hjli-creator

Summary:

What?

This commit improves the CallStackGraph lib in the HTA package to handle an edge case in Kineto trace data. In traces with Python call stack info, a child Python function is sometimes recorded as ending after its parent. This breaks the time ordering of events that call stack graph construction relies on. Because a child function cannot end after its parent, such a record is wrong, and we can preprocess the data by truncating the child event's duration. The Python event arguments "Python id" and "Python parent id" identify the parent-child relationships; the fix is then a BFS tree traversal that ensures each child's end time does not exceed its parent's.
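To illustrate why an overlong child breaks the assumed ordering, here is a minimal sketch of a generic stack-based nesting pass. It is not HTA's actual CallStackGraph code; the `infer_parents` helper, the tuple layout, and the event names and timestamps are all made up for the example. It assumes events on one thread nest properly, which is exactly the assumption the bad trace violates.

```python
from typing import Dict, List, Optional, Tuple

# Each event is (name, start, duration); all names and values are made up.
Event = Tuple[str, int, int]


def infer_parents(events: List[Event]) -> Dict[str, Optional[str]]:
    """Infer each event's parent from time containment, using a stack.

    Generic sketch (hypothetical helper): assumes a child starts and ends
    within its parent's interval.
    """
    parents: Dict[str, Optional[str]] = {}
    stack: List[Tuple[str, int]] = []  # (name, end time) of still-open events
    for name, start, dur in sorted(events, key=lambda e: e[1]):
        # Close every open event that ended before this one starts.
        while stack and stack[-1][1] <= start:
            stack.pop()
        parents[name] = stack[-1][0] if stack else None
        stack.append((name, start + dur))
    return parents


# "child" is recorded as ending at t=60, after its parent "f" ends at t=50.
bad = [("root", 0, 100), ("f", 10, 40), ("child", 20, 40), ("g", 55, 10)]
# The same trace with "child" truncated so that it ends with "f" at t=50.
fixed = [("root", 0, 100), ("f", 10, 40), ("child", 20, 30), ("g", 55, 10)]

bad_parents = infer_parents(bad)
fixed_parents = infer_parents(fixed)
```

In the bad trace, `g` starts while `child` still appears open, so it is nested under `child` instead of `root`; after truncation it lands under `root` as expected.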

Why?

An incorrect call stack tree structure affects the consumers of CallStackGraph. For example, the accuracy of Python overhead detection suffers because it depends on the leaf event intervals; we have seen a few traces whose unexpectedly high reported overhead turned out to be incorrect. CPA (Critical Path Analysis) also depends on CallStackGraph.

How?

  1. First, we add "Python id" and "Python parent id" to the trace DataFrame.
  2. Then we introduce a boolean flag, "preprocess_trace_data". If it is true, we run the truncation algorithm before constructing CallStackGraph.
  3. The algorithm is a plain BFS: first scan the DataFrame to create parent-child edges, then traverse them breadth-first and truncate each child event's duration if needed. Note that if a parent function invokes a child in a different thread, the child's duration doesn't need to be truncated, so the child is treated as a root event.
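The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: it works on a plain list of records instead of the trace DataFrame, and the `Event` class, its field names (`python_id`, `python_parent_id`, `tid`, `ts`, `dur`), and the `truncate_to_parents` function are hypothetical names chosen for the example.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Event:
    # Hypothetical record layout; the real trace DataFrame may differ.
    python_id: int
    python_parent_id: Optional[int]  # None when the event has no parent
    tid: int  # thread id
    ts: int   # start time
    dur: int  # duration


def truncate_to_parents(events: List[Event]) -> None:
    """Clamp each child's end time to its parent's end time, in place."""
    by_id: Dict[int, Event] = {e.python_id: e for e in events}
    children: Dict[int, List[Event]] = {}
    roots: List[Event] = []

    # Pass 1: build parent -> children edges. An event whose parent is
    # missing or runs on a different thread is treated as a root.
    for e in events:
        parent = None
        if e.python_parent_id is not None:
            parent = by_id.get(e.python_parent_id)
        if parent is None or parent.tid != e.tid:
            roots.append(e)
        else:
            children.setdefault(parent.python_id, []).append(e)

    # Pass 2: BFS from the roots. A parent is always finalized before its
    # children are visited, so a truncation propagates down the tree.
    queue = deque(roots)
    while queue:
        parent = queue.popleft()
        parent_end = parent.ts + parent.dur
        for child in children.get(parent.python_id, []):
            if child.ts + child.dur > parent_end:
                child.dur = max(0, parent_end - child.ts)
            queue.append(child)


events = [
    Event(1, None, tid=1, ts=0, dur=100),
    Event(2, 1, tid=1, ts=10, dur=40),    # ends at 50, inside its parent
    Event(3, 2, tid=1, ts=20, dur=40),    # ends at 60, after parent 2
    Event(4, 2, tid=2, ts=30, dur=500),   # other thread: treated as a root
    Event(5, 3, tid=1, ts=25, dur=100),   # child of the truncated event 3
]
truncate_to_parents(events)
```

After the call, event 3 is cut back to end with event 2, event 5 is in turn cut back to end with the truncated event 3, and the cross-thread event 4 keeps its original duration.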

Differential Revision: D78594843

@meta-cla meta-cla bot added the CLA Signed label Jul 18, 2025
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D78594843

hjli-creator added a commit to hjli-creator/HolisticTraceAnalysis that referenced this pull request Jul 19, 2025
…kresearch#274)

@facebook-github-bot

This pull request has been merged in 3011e1b.

DaylonSrinivasan pushed a commit to DaylonSrinivasan/HolisticTraceAnalysis that referenced this pull request Jul 29, 2025
Summary:

Reverts facebookresearch#274, which caused a missing key error: "index::python_id"

Reviewed By: hjli-creator

Differential Revision: D79185506
facebook-github-bot pushed a commit that referenced this pull request Jul 29, 2025
Summary:
Pull Request resolved: #287

Reverts #274, which caused a missing key error: "index::python_id"

Reviewed By: hjli-creator

Differential Revision: D79185506

fbshipit-source-id: eba1a18123ff033dff864fef6aa5c26c8022fca0

Labels

CLA Signed, fb-exported, Merged
