@ericcano ericcano commented Dec 9, 2025

PR description:

Splits the NVProfilerService into a generic framework-activity annotation base and an NVTX-specific customization. A ROCm customization is added. Another one for VTune should be easy to derive.
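
As a rough illustration of the split described above (not the code in this PR), the generic base can own the decision of when to annotate, while a derived class supplies how. All class and method names below are hypothetical, and the derived class only records calls where a real backend would forward them to NVTX or ROCTx:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical backend-independent base: it decides WHEN to annotate
// and leaves HOW to the derived class.
class ProfilerServiceBase {
public:
  virtual ~ProfilerServiceBase() = default;

  // Illustrative framework-facing callbacks (names are assumptions).
  void preModuleEvent(std::string const& label) { rangeStart(label); }
  void postModuleEvent(std::string const& label) { rangeEnd(label); }

protected:
  // Backend-specific hooks, implemented once per profiling toolkit.
  virtual void rangeStart(std::string const& label) = 0;
  virtual void rangeEnd(std::string const& label) = 0;
};

// Stand-in backend that records calls in memory; a real customization
// would call into the profiling library here instead.
class RecordingProfilerService : public ProfilerServiceBase {
public:
  std::vector<std::string> log;

protected:
  void rangeStart(std::string const& label) override { log.push_back("start:" + label); }
  void rangeEnd(std::string const& label) override { log.push_back("end:" + label); }
};
```

With this layout, adding another backend would only require a new derived class implementing the two hooks.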

This commit also adds thread safety with spinlocks to avoid thread scheduling effects. Some ranges experience a double start or a double end. This is now indicated with a mark, instead of failing on an assertion.
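
A minimal sketch of that idea, assuming a spinlock built on `std::atomic_flag` and a per-range "active" flag. The `SpinLock` and `GuardedRange` names are hypothetical and the events are recorded as strings rather than sent to a profiler, so this is not the PR's implementation:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Busy-waiting lock: the thread spins instead of being descheduled.
class SpinLock {
public:
  void lock() noexcept {
    while (flag_.test_and_set(std::memory_order_acquire)) {
    }
  }
  void unlock() noexcept { flag_.clear(std::memory_order_release); }

private:
  std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical range state: a repeated start or end is reported as a
// mark rather than tripping an assertion.
class GuardedRange {
public:
  void start() {
    std::lock_guard<SpinLock> guard(lock_);
    if (active_) {
      events_.push_back("mark:double start");
      return;
    }
    active_ = true;
    events_.push_back("start");
  }

  void end() {
    std::lock_guard<SpinLock> guard(lock_);
    if (!active_) {
      events_.push_back("mark:double end");
      return;
    }
    active_ = false;
    events_.push_back("end");
  }

  // Returns a copy of the recorded events, taken under the lock.
  std::vector<std::string> events() const {
    std::lock_guard<SpinLock> guard(lock_);
    return events_;
  }

private:
  mutable SpinLock lock_;
  bool active_ = false;
  std::vector<std::string> events_;
};
```

Calling `start()` twice and then `end()` twice on one `GuardedRange` records a start, a double-start mark, an end, and a double-end mark, in that order.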

Support is added for ES module execution, path ranges, and source and event cleanup (which are the most important contributions to the inter-event gap).

This could still be polished considerably, for example by adding configurable content or messages, coloring by event or EDM stream, etc.

The support for EDM events is also not complete and could still be expanded.

PR validation:

The two services have been tested on NVIDIA and AMD GPUs and showed expected results:

Here with the PyTorch test:
image

image

This was also tested with HLT configurations.

cmsbuild commented Dec 9, 2025

cms-bot internal usage

cmsbuild commented Dec 9, 2025

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-49580/47110

ERROR: Build errors found during clang-tidy run.

src/HeterogeneousCore/ROCmServices/plugins/ROCmProfilerService.cc:14:10: error: 'rocprofiler-sdk-roctx/roctx.h' file not found [clang-diagnostic-error]
   14 | #include <rocprofiler-sdk-roctx/roctx.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Suppressed 326 warnings (326 in non-user code).
--
gmake: *** [config/SCRAM/GMake/Makefile.coderules:129: code-checks] Error 2
gmake: *** [There are compilation/build errors. Please see the detail log above.] Error 2

cmsbuild commented Dec 9, 2025

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-49580/47111

ERROR: Build errors found during clang-tidy run.

src/HeterogeneousCore/ROCmServices/plugins/ROCmProfilerService.cc:14:10: error: 'rocprofiler-sdk-roctx/roctx.h' file not found [clang-diagnostic-error]
   14 | #include <rocprofiler-sdk-roctx/roctx.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Suppressed 326 warnings (326 in non-user code).
--
gmake: *** [config/SCRAM/GMake/Makefile.coderules:129: code-checks] Error 2
gmake: *** [There are compilation/build errors. Please see the detail log above.] Error 2

@ericcano

This PR is dependent on cms-sw/cmsdist#10238

@cmsbuild

Pull request #49580 was updated.

@cmsbuild

Milestone for this pull request has been moved to CMSSW_16_1_X. Please open a backport if it should also go in to CMSSW_16_0_X.

@cmsbuild cmsbuild modified the milestones: CMSSW_16_0_X, CMSSW_16_1_X Dec 18, 2025