Skip to content

Add tests for rocprofiler-compute#2300

Merged
jbonnell-amd merged 90 commits intomainfrom
users/jbonnell-amd/rocprofiler-compute-tests
Feb 27, 2026
Merged

Add tests for rocprofiler-compute#2300
jbonnell-amd merged 90 commits intomainfrom
users/jbonnell-amd/rocprofiler-compute-tests

Conversation

@jbonnell-amd
Copy link
Copy Markdown
Contributor

@jbonnell-amd jbonnell-amd commented Nov 25, 2025

Motivation

Add test coverage for rocprofiler-compute component

Technical Details

  • Updated CMakeLists.txt for rocprofiler-compute to build and install tests
  • Updated artifact-rocprofiler-sdk.toml to include necessary test requirements for rocprofiler-compute under the test artifact for sdk
  • Updated fetch_test_configurations.py to include rocprofiler-compute, added requirements-files parameter to pass along to test_component.yml to install pip reqs (compute requires additional pip requirements to run)
  • Updated test_component.yml to run a setup step before the test step that will run install_requirements.py to install additional requirements.txt files as needed for tests
  • Updated cmake/therock_amdgpu_targets.cmake to exclude rocprofiler-compute for unsupported devices
  • Add new filetest_rocprofiler_compute.py to run ctest on compute tests

Test Plan

  • Ensure tests pass locally when running in the no_rocm Docker image and using the generated artifacts from this branch
  • Ensure tests pass in CI pipeline

JIRA ID

ROCM-324

Test Result

  • Tests pass locally and in the pipeline
    • Smoke tests take around 10 minutes to complete. I tried adding running them in parallel but that led to some flakiness on the profiling tests.
    • Full test suite run take 30-40 minutes
  • The test_profile_pc_sampling test has been excluded from running due to its constant failing right now due to a rocr issue (see Jira mentioned in comment in test_rocprofiler_compute.py), this can be added back once it is confirmed to be fixed

Submission Checklist

@ScottTodd ScottTodd removed their request for review February 24, 2026 19:09
@ScottTodd
Copy link
Copy Markdown
Member

This has enough reviewers other than me. I've already given some input.

@jbonnell-amd
Copy link
Copy Markdown
Contributor Author

jbonnell-amd commented Feb 24, 2026

There is currently an issue with test_num_xcds_cli_output which has been consistently failing recently, both in pipeline/local environments via TheRock and also from our own CI on rocm-systems.

I could exclude this test before merging if desired, otherwise it will be failing until the following PR gets merged and TheRock submodule gets updated to include the fix:

ROCm/rocm-systems#3507

@jbonnell-amd
Copy link
Copy Markdown
Contributor Author

Fix for test_num_xcds_cli_output has been merged to rocm-systems, so tests should be fine now once the submodule is updated.

@geomin12 @jayhawk-commits @ScottTodd we'd like to get this merged ASAP so we can disable PSDB runs for rocprofiler-compute which are blocking development time. Please let me know if there are any additional changes that should be made on this PR.

Copy link
Copy Markdown
Contributor

@geomin12 geomin12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good overall, few housekeeping comments

Comment thread build_tools/install_additional_requirements.py Outdated
Comment thread .github/workflows/test_component.yml Outdated
Comment thread .github/workflows/test_component.yml
Comment thread build_tools/install_additional_requirements.py
Comment thread build_tools/github_actions/fetch_test_configurations.py Outdated
Comment thread build_tools/github_actions/fetch_test_configurations.py Outdated
Comment thread profiler/CMakeLists.txt Outdated
Comment thread .github/workflows/test_component.yml Outdated
Copy link
Copy Markdown
Contributor

@rahulc-gh rahulc-gh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If required tests are passing, looks good to me

Comment thread build_tools/install_additional_requirements.py
Comment thread build_tools/install_additional_requirements.py
Comment thread .github/workflows/test_component.yml
Copy link
Copy Markdown
Contributor

@geomin12 geomin12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for comments! i know there are some housekeeping things, but we can address in a later PR

lgtm assuming it passes in CI

@jbonnell-amd
Copy link
Copy Markdown
Contributor Author

CI is passing, the only failing test is test_num_xcds_cli_output which was already fixed in ROCm/rocm-systems#3507 but has yet to make it in the rocm-systems submodule in TheRock.

I will merge this for now but expect this one test to fail until the next rocm-systems submodule bump occurs in TheRock. I'm going to make a small follow-up housekeeping PR tomorrow anyways, so can follow up there as needed for any changes.

Thanks again!

@jbonnell-amd jbonnell-amd merged commit 5f60db7 into main Feb 27, 2026
95 of 97 checks passed
@jbonnell-amd jbonnell-amd deleted the users/jbonnell-amd/rocprofiler-compute-tests branch February 27, 2026 01:12
@github-project-automation github-project-automation Bot moved this from TODO to Done in TheRock Triage Feb 27, 2026
chiranjeevipattigidi added a commit that referenced this pull request Mar 2, 2026
jbonnell-amd added a commit that referenced this pull request Mar 2, 2026
## Motivation

<!-- Explain the purpose of this PR and the goals it aims to achieve.
-->
gfx103X builds appear to be failing since #2300 got merged. This should
address these build errors and unblock gfx103X releases.

## Technical Details

<!-- Explain the changes along with any relevant GitHub links. -->
- rocprofiler-compute does not support gfx103X architectures, so we
added the project to the `EXCLUDE_TARGET_PROJECTS` list for the gfx103X
family in `therock_amdgpu_targets.cmake`
- It appears that `EXCLUDE_TARGET_PROJECTS` entries for
rocprofiler-compute gfx1031, gfx1033, and gfx1034 were missing
- This was due to #1629 being merged two weeks ago, and the changes
related to `therock_amdgpu_targets.cmake` in the #2300 PR happening
before this new inclusion
- Failures were not found before merging due to PR not running gfx103X
builds

## Test Plan

<!-- Explain any relevant testing done to verify this PR. -->
Ensure that gfx103X builds are able to pass without any errors from
rocprofiler-compute

## Test Result

<!-- Briefly summarize test outcomes. -->
- Ran a manual gfx103X workflow with these changes:
https://github.com/ROCm/TheRock/actions/runs/22497338111/job/65175257319

## Submission Checklist

- [X] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

6 participants