Add support for ROCm 6.3 #28220

Merged
jabraham17 merged 5 commits into chapel-lang:main from jabraham17:rocm63
Feb 18, 2026

@jabraham17 (Member) commented Dec 16, 2025

Adds support for using Chapel with ROCm 6.3.

This PR does not resolve the halting issue described in #26934; instead, it treats the current behavior as an acceptable error message.

  • start_test test/gpu/native with CHPL_GPU=amd
  • start_test test/gpu/native with CHPL_GPU=amd and CHPL_COMM=ofi

Resolves #26934

I have opened #28415 to capture the desire for better crash behavior with ROCm.

[Reviewed by @e-kayrakli]

 MIN_ROCM6_REQ_VERSION = "6.0"
-MAX_ROCM6_REQ_VERSION = "6.3" # upper bound non-inclusive
-MAX_ROCM6_REQ_VERSION_NICE = "6.2.x"
+MAX_ROCM6_REQ_VERSION = "6.4" # upper bound non-inclusive
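As a rough illustration of what a non-inclusive upper bound means for the constants in this hunk, here is a minimal sketch. The helper names `parse_version` and `in_supported_range` are hypothetical and are not taken from the Chapel sources; only the two constants come from the diff. Under these assumptions, any 6.3.x release is accepted and 6.4.0 is not.

```python
# Only these two constants come from the diff; everything else is illustrative.
MIN_ROCM6_REQ_VERSION = "6.0"
MAX_ROCM6_REQ_VERSION = "6.4"  # upper bound non-inclusive


def parse_version(text):
    """Turn a dotted version such as "6.3.1" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def in_supported_range(version, lo=MIN_ROCM6_REQ_VERSION, hi=MAX_ROCM6_REQ_VERSION):
    """Return True when lo <= version < hi, comparing numeric components."""
    return parse_version(lo) <= parse_version(version) < parse_version(hi)
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "6.10" as smaller than "6.4".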
Contributor

Semi-serious question: Should we drop upper bounds here and let people try whatever version they might want to? Looking at the diff, it looks like the only real "fix" here is this bump. Are we unnecessarily creating work for ourselves by having to bump this for each ROCm version?

Member Author

I would not be that opposed to removing the upper bound. I think it is a nicer user experience when users can only use Chapel with known-to-work versions, rather than getting build/runtime errors. However, it's a maintenance burden and can also frustrate users.

Maybe a good compromise is to turn it into a warning?

Member Author

I've changed the error into a warning. Let me know what you think.
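To make the compromise concrete, a version check that warns instead of failing could look roughly like the sketch below. This is not the code merged in this PR: the function name `check_rocm_version`, the message text, and the choice to keep the lower bound as a hard error are all assumptions made for illustration. Only the two constants come from the diff.

```python
import warnings

# Constants as shown in the diff; the rest of this block is hypothetical.
MIN_ROCM6_REQ_VERSION = "6.0"
MAX_ROCM6_REQ_VERSION = "6.4"  # upper bound non-inclusive


def _as_tuple(text):
    """Convert a dotted version string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def check_rocm_version(version):
    """Return True if version is in the tested range.

    Assumption: versions below the minimum still raise an error, while
    versions at or above the maximum only emit a warning and return False.
    """
    parsed = _as_tuple(version)
    if parsed < _as_tuple(MIN_ROCM6_REQ_VERSION):
        raise ValueError(
            "ROCm %s is older than the minimum %s" % (version, MIN_ROCM6_REQ_VERSION)
        )
    if parsed >= _as_tuple(MAX_ROCM6_REQ_VERSION):
        warnings.warn(
            "ROCm %s is untested; tested versions are below %s"
            % (version, MAX_ROCM6_REQ_VERSION)
        )
        return False
    return True
```

With this shape, a newer ROCm release produces a visible warning but the build is allowed to continue, which is the trade-off discussed in this thread.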

Signed-off-by: Jade Abraham <jade.abraham@hpe.com>
@e-kayrakli (Contributor) left a comment

Any particular reason to add testing for 6.3 specifically instead of bumping the version of the existing config? I am not against it at all, just curious.

@jabraham17 (Member Author) commented

I am doing that because of the "degraded" state ROCm 6.3 is in, with bad halts, and because I wanted to maintain testing for the old version.

@jabraham17 merged commit 1ef06e5 into chapel-lang:main on Feb 18, 2026
10 checks passed
@jabraham17 deleted the rocm63 branch on February 18, 2026, 18:10
@bradcray (Member) commented

Woohoo!

Development

Successfully merging this pull request may close these issues.

ROCm 6.3 support
