Shift nightly builds from gfx103X-dgpu to gfx103X-all #3763

Merged: harkgill-amd merged 6 commits into main from users/harkgill/enable-gfx103X-all on Apr 21, 2026

Conversation

@harkgill-amd
Contributor

@harkgill-amd harkgill-amd commented Mar 4, 2026

Motivation

Addresses #3404

Our current gfx103X-dgpu family does not include gfx1033, gfx1035, or gfx1036. This PR shifts the nightly builds to gfx103X-all, which covers those architectures.

Test Plan

Build TheRock with -DTHEROCK_AMDGPU_FAMILIES=gfx103X-all on Linux and Windows

Test Result

  • Windows = Build Successful
  • Linux = Build Successful

Submission Checklist

@LuXuxue

LuXuxue commented Mar 27, 2026

Is there any progress recently? ROCm/rocm-libraries#5141 has been approved but has not been merged.

@harkgill-amd
Contributor Author

We need ROCm/rocm-libraries#5141 to be merged before we can move forward with this change. That PR is blocked by unrelated CI failures, which the CK team is working hard to resolve. Hopefully we can get it in by the end of this week.

illsilin added a commit to ROCm/rocm-libraries that referenced this pull request Apr 3, 2026
## Motivation

Resolves PyTorch build failures when enabling builds for the gfx103X-all
family in TheRock (ROCm/TheRock#3763). `gfx1033`
is the only failing architecture in the family, and the failures point to
missing support in CK.

## Technical Details

The PyTorch build fails with this repeated error message:
```
/__w/TheRock/TheRock/external-builds/pytorch/pytorch/aten/src/ATen/../../../third_party/composable_kernel/include/ck/utility/amd_buffer_addressing_builtins.hpp:33:48: error: use of undeclared identifier 'CK_BUFFER_RESOURCE_3RD_DWORD'
   33 |     wave_buffer_resource.config(Number<3>{}) = CK_BUFFER_RESOURCE_3RD_DWORD;
      |                                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
`gfx1033` is missing from the `__gfx103__` group, so
`CK_BUFFER_RESOURCE_3RD_DWORD` is never defined for it. Adding
`gfx1033` to the files that omit it should be the minimum fix to allow
torch builds to pass.
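
The failure mode described above can be reproduced in isolation. The following is a minimal host-only sketch, not CK's actual code: every `SKETCH_*` macro is a hypothetical stand-in, the architecture lists are abbreviated for illustration, and only the value `0x31014000` is taken from the test plan in this message.

```cpp
#include <cstdint>

// All SKETCH_* names are hypothetical stand-ins, not CK's real macros.

// Pretend the compiler is building device code for gfx1033. A real
// compiler predefines its own per-target macro instead.
#define SKETCH_TARGET_GFX1033 1

// Group guard before the fix: gfx1033 is absent from the list
// (list abbreviated for illustration).
#if defined(SKETCH_TARGET_GFX1030) || defined(SKETCH_TARGET_GFX1031)
#define SKETCH_GROUP_GFX103_BEFORE 1
#endif

// Group guard after the fix: gfx1033 is listed.
#if defined(SKETCH_TARGET_GFX1030) || defined(SKETCH_TARGET_GFX1031) || \
    defined(SKETCH_TARGET_GFX1033)
#define SKETCH_GROUP_GFX103_AFTER 1
#endif

// The constant exists only when the group macro does, so under the
// "before" guard any use of it would be an undeclared identifier.
#if defined(SKETCH_GROUP_GFX103_AFTER)
#define SKETCH_BUFFER_RESOURCE_3RD_DWORD 0x31014000
#endif

#if defined(SKETCH_GROUP_GFX103_BEFORE)
constexpr bool kGroupDefinedBeforeFix = true;
#else
constexpr bool kGroupDefinedBeforeFix = false;
#endif

#if defined(SKETCH_GROUP_GFX103_AFTER)
constexpr bool kGroupDefinedAfterFix = true;
#else
constexpr bool kGroupDefinedAfterFix = false;
#endif

// Compiles only because the "after" guard defined the constant.
constexpr std::uint32_t kThirdDword = SKETCH_BUFFER_RESOURCE_3RD_DWORD;

static_assert(!kGroupDefinedBeforeFix, "gfx1033 missing from old group");
static_assert(kGroupDefinedAfterFix, "gfx1033 present in fixed group");
static_assert(kThirdDword == 0x31014000u, "value from the test plan");
```

If the constant were gated on `SKETCH_GROUP_GFX103_BEFORE` instead, the `kThirdDword` line would fail to compile, which mirrors the `undeclared identifier` error in the build log.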

## Test Plan

Compile a sample test file targeting gfx1033:
```
...
#ifdef __HIP_DEVICE_COMPILE__
static_assert(CK_BUFFER_RESOURCE_3RD_DWORD == 0x31014000, "wrong device value");
#else
static_assert(CK_BUFFER_RESOURCE_3RD_DWORD == -1, "wrong host value");
#endif
```

## Test Result

Prior to applying the patch, compilation fails with `error: use of
undeclared identifier 'CK_BUFFER_RESOURCE_3RD_DWORD'`.

After applying the patch, the test file compiles successfully.

## Submission Checklist

- [X] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
assistant-librarian Bot pushed a commit to ROCm/composable_kernel that referenced this pull request Apr 3, 2026
Add missing gfx1033 to gfx103 group definition in ck

@CarlGao4

CarlGao4 commented Apr 4, 2026

Now it is merged!

@harkgill-amd harkgill-amd marked this pull request as ready for review April 6, 2026 14:05
@harkgill-amd harkgill-amd requested a review from geomin12 April 6, 2026 14:05
Contributor

@geomin12 geomin12 left a comment

lgtm but let's wait on CI checks

@LuXuxue

LuXuxue commented Apr 9, 2026

> Now it is merged!

Maybe we have to wait for a rocm-libraries bump in this repository before this PR, or we will get endless errors.

@harkgill-amd
Contributor Author

> Maybe we have to wait for a rocm-libraries bump in this repository before this PR, or we will get endless errors.

The PRs below got the changes into the release/2.X branches that TheRock builds torch from.

Just waiting on ROCm/pytorch#3144 for release/2.11 as that was recently enabled as well. Once this is in, we shouldn't see any of the gfx1033/CK failures.

vidyasagar-amd pushed a commit to ROCm/rocm-libraries that referenced this pull request Apr 9, 2026
hyoon1 pushed a commit to hyoon1/composable_kernel that referenced this pull request Apr 12, 2026
lucbruni-amd added a commit that referenced this pull request Apr 13, 2026
## Motivation

Update `ROADMAP.md` to reflect recently added support.

## Technical Details

`gfx103X-all` builds passing for Linux/Windows:
#3763 (PyTorch failing until
ROCm/rocm-libraries#5141 lands)

`gfx900` builds passing: #3564

`gfx90c` builds awaiting ROCm/rocm-libraries#5282 to go green

## Test Plan

`gfx90c` builds to be tested
(#3818)

## Test Result

N/A

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
AaronStGeorge pushed a commit to AaronStGeorge/rocm-libraries that referenced this pull request Apr 16, 2026
Contributor

@geomin12 geomin12 left a comment

lgtm, builds pass for both linux + windows

@harkgill-amd harkgill-amd merged commit 2722ebf into main Apr 21, 2026
130 of 138 checks passed
@harkgill-amd harkgill-amd deleted the users/harkgill/enable-gfx103X-all branch April 21, 2026 15:22
@github-project-automation github-project-automation Bot moved this from TODO to Done in TheRock Triage Apr 21, 2026
Status: Done

4 participants