Skip to content

Conversation

erieaton-amd
Copy link
Contributor

This patch changes GPU to use ZoneEvents, to reduce code duplication and allow consistent feature support between GPU and CPU. The CPU is given a context similar to how GPUs have a context.

The following features now work with GPU events:
Trace Comparison
Zone Search
Zone Histogram
Flame Graph

Fixes #1109

I'm posting this for some feedback on the concept. There are still some major to do items for this:

  • Testing with different platforms and build configurations.
  • It should be possible to have the extra bytes in ZoneExtra only for GPU contexts. But this will need a whole lot of additional refactoring.
  • File format reverse compatibility is broken, but can probably be fixed.

Also, this rework could allow more than one CPU context, which could be used to support cluster tracing.

This patch changes GPU to use ZoneEvents, to reduce code duplication and
allow consistent feature support between GPU and CPU. The CPU is given a
context similar to how GPUs have a context.

The following features now work with GPU events:
Trace Comparison
Zone Search
Zone Histogram
Flame Graph

Fixes wolfpld#1109
@wolfpld
Copy link
Owner

wolfpld commented Sep 16, 2025

It would be helpful if you'd describe changes made in TracyWorker and TracyEvent (i.e., how the data storage is changed, why, how it's better; step by step).

@erieaton-amd
Copy link
Contributor Author

erieaton-amd commented Sep 16, 2025

The GpuEvent is removed and replaced with ZoneEvent. The extra fields of GpuEvent (the two extra timestamps, and query_id) are added to ZoneExtra. (This of course uses up extra bytes with CPU events, but that could be fixed I think). Replacing the GpuEvent with ZoneEvent allows a bunch of nearly identical functions to be consolidated.

TracyWorker.cpp contains some parallel data structures for CPU and GPU, like sourceLocationZones and gpuSourceLocationZones. These data structures are moved to a new ZoneContext type, which is a base class for CPU and GPU. GpuCtxData is made a subclass of ZoneContext, and a new context type CPUZoneContext is added for CPU. Worker creates a default CPU context automatically. GpuSourceLocationZones and GpuZoneThreadData are redundant with non-gpu and are removed. The ZoneThreadData and SourceLocationZones are moved to TracyContext.hpp. Both contexts use the same ThreadData type now, but CPU has a subclass to store extra information for sampling data. The m_threadMap field serves the role that GpuCtxData::threadData does for CPU, so that got moved to the ZoneContext, along with m_data.threads, threadCtx, threadCtxData.

Zone child data is still held by Worker. There is an assumption that the same allocation of ZoneEvent would never be in two contexts, so it should work. Also, all contexts share the original zoneExtra data storage in Worker. If that data structure were moved to contexts, the CPU version could avoid the extra GPU fields.

Most of the remaining changes are refactoring to deal with the above changes.

The UI is changed in some places to have a Combo box to select a context, and display the context with search results.

There is no change to the protocol, but there is to the file format.

@erieaton-amd
Copy link
Contributor Author

I'm looking into creating a revision that only replaces GpuEvent with ZoneEvent and doesn't introduce a CPU context. I don't know yet if it will work.

@erieaton-amd
Copy link
Contributor Author

I'm looking into creating a revision that only replaces GpuEvent with ZoneEvent and doesn't introduce a CPU context. I don't know yet if it will work.

In order to to finish this, I have to engineer a bunch of alternate solutions to things I already solved, not sure if it's worth it.

@slomp
Copy link
Contributor

slomp commented Sep 29, 2025

There's a lot to digest here, so let me see if I understand.

The core idea here is to make GPU zones/events behave more like CPU zones/events (more like a full unification), and as such unlock zone utilities (search, compare, histograms, etc) for GPU zones? If so, I think this should be introduced/merged more incrementally (for example, leave changes around ZoneContextType out for now).

(Also: I think some of the work here can help with this issue/request: #1154)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CPU and GPU zone inconsistency

3 participants