Merged
29 commits
8ce2194
Fix JAX plugin wheel build
charleshofer Aug 4, 2025
57fdf68
Add AWS CLI install
charleshofer Aug 7, 2025
133efbf
Change AWS command path
charleshofer Aug 7, 2025
2cd0b5c
Fix wheelhouse path
charleshofer Aug 7, 2025
35283b3
Fix pip
charleshofer Aug 11, 2025
35b3b93
Add ensure pip step
charleshofer Aug 12, 2025
3840f6d
Add pip install via package manager
charleshofer Aug 19, 2025
5862e25
Switch to apt
charleshofer Aug 19, 2025
49853c7
Use python3 instead of python
charleshofer Aug 20, 2025
be4584e
Add setup step
charleshofer Aug 28, 2025
7393636
Fix amdgpu_family
charleshofer Aug 28, 2025
116b748
Fix needs
charleshofer Aug 28, 2025
cbe5af3
Add JAX build to nightly
charleshofer Sep 4, 2025
097aa8d
Fix nightly tarball path and address review comments:
charleshofer Sep 4, 2025
556be66
Fix python version
charleshofer Sep 4, 2025
1e7f0fb
Add suffix for tarball URL
charleshofer Sep 4, 2025
f0b81a6
Use AWS install script
charleshofer Sep 4, 2025
8a633c2
Fix typo in release workflow
charleshofer Sep 4, 2025
eee39f0
Remove region from URL
charleshofer Sep 4, 2025
d44c3f2
Fix ROCm version string for JAX build
charleshofer Sep 5, 2025
256a470
Update .github/workflows/release_portable_linux_packages.yml
charleshofer Sep 9, 2025
bc1911a
Encode URL for tar file
charleshofer Sep 11, 2025
d95caf7
Fix tar URL assignment
charleshofer Sep 16, 2025
6896c59
Fix url quoting
charleshofer Sep 18, 2025
b1fbd2c
Fix outputs JSON
charleshofer Sep 23, 2025
a6e2f3e
Remove quotes
charleshofer Sep 24, 2025
8aeb56a
Remove extra comma
charleshofer Sep 24, 2025
2b13c45
Fix tar url
charleshofer Sep 25, 2025
666bfd9
Use staging directory
charleshofer Sep 26, 2025
38 changes: 29 additions & 9 deletions .github/workflows/build_linux_jax_wheels.yml
@@ -26,12 +26,18 @@ on:
workflow_dispatch:
inputs:
amdgpu_family:
required: true
type: string
type: choice
options:
- gfx110X-dgpu
- gfx1151
- gfx120X-all
- gfx94X-dcgpu
- gfx950-dcgpu
default: gfx94X-dcgpu
python_versions:
required: true
type: string
default:
default: "3.12"
release_type:
description: The type of release to build ("nightly", or "dev")
required: true
@@ -60,7 +66,7 @@ jobs:
name: Build Linux JAX Wheels | ${{ inputs.amdgpu_family }} | Python ${{ inputs.python_version }}
runs-on: ${{ github.repository_owner == 'ROCm' && 'azure-linux-scale-rocm' || 'ubuntu-24.04' }}
env:
PACKAGE_DIST_DIR: ${{ github.workspace }}/jax_rocm_plugin/wheelhouse
PACKAGE_DIST_DIR: ${{ github.workspace }}/jax/jax_rocm_plugin/wheelhouse
S3_BUCKET_PY: "therock-${{ inputs.release_type }}-python"
steps:
- name: Checkout TheRock
@@ -71,23 +77,35 @@ jobs:
with:
path: jax
repository: rocm/rocm-jax
ref: ${{ inputs.jax_ref }}
ref: ${{ matrix.jax_ref }}
Comment thread
geomin12 marked this conversation as resolved.

- name: Configure Git Identity
run: |
git config --global user.name "therockbot"
git config --global user.email "therockbot@amd.com"

- name: "Setting up Python"
uses: actions/setup-python@42375524e23c412d93fb67b49958b491fce71c38 # v5.4.0
with:
python-version: ${{ inputs.python_versions }}

- name: Build JAX Wheels
env:
ROCM_VERSION: ${{ inputs.rocm_version }}
run: |
pushd rocm-jax
ls -lah
pushd jax
python3 build/ci_build \
--compiler=clang \
--python-versions="${{ inputs.python_versions }}" \
--rocm-version="${{ inputs.rocm_version }}" \
--rocm-version="${ROCM_VERSION:0:5}" \
Comment on lines -87 to +101
Member

Why truncate here?

Contributor Author

@charleshofer charleshofer Sep 9, 2025

In a lot of places, the build scripts and our ROCm setup scripts assume that the build number is always going to be of the form X.X.X. I tried this with the X.X.X+dev... version and the build failed.
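
To illustrate the truncation discussed here: `${ROCM_VERSION:0:5}` keeps the first five characters, which yields `X.X.X` only while every version component is a single digit. A minimal sketch of a pattern-based alternative follows, assuming versions always begin with three dot-separated integers. The helper name `base_version` and the sample version strings are hypothetical and not part of this PR.

```python
import re


def base_version(version: str) -> str:
    """Return the leading X.Y.Z part of a version string (hypothetical helper)."""
    match = re.match(r"\d+\.\d+\.\d+", version)
    if match is None:
        raise ValueError(f"no X.Y.Z prefix in {version!r}")
    return match.group(0)


# Sample strings are made up for illustration.
# Single-digit components: same result as slicing the first five characters.
print(base_version("7.0.0+dev20250909"))
# Two-digit minor version: a fixed five-character slice would stop at "7.10."
print(base_version("7.10.0+dev20250909"))
```

Whether the build scripts accept anything beyond `X.X.X` is not established in this thread, so the sketch only changes how the prefix is extracted, not what is passed on.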

--therock-path="${{ inputs.tar_url }}" \
dist_wheels

- name: Install AWS CLI
if: always()
run: bash ./dockerfiles/install_awscli.sh

- name: Configure AWS Credentials
if: always()
uses: aws-actions/configure-aws-credentials@7474bc4690e29a8392af63c5b98e7449536d5c3a # v4.3.1
@@ -104,5 +122,7 @@ jobs:
- name: (Re-)Generate Python package release index
if: ${{ github.repository_owner == 'ROCm' }}
run: |
pip install boto3 packaging
python ./build_tools/third_party/s3_management/manage.py ${{ inputs.s3_subdir }}/${{ inputs.amdgpu_family }}
python3 -m venv .venv
source .venv/bin/activate
pip3 install boto3 packaging
Comment thread
geomin12 marked this conversation as resolved.
python3 ./build_tools/third_party/s3_management/manage.py ${{ inputs.s3_subdir }}/${{ inputs.amdgpu_family }}
18 changes: 18 additions & 0 deletions .github/workflows/release_portable_linux_packages.yml
@@ -248,6 +248,24 @@ jobs:
"rocm_version": "${{ needs.setup_metadata.outputs.version }}"
}

- name: URL-encode .tar URL
id: url-encode-tar
run: python -c "from urllib.parse import quote; print('tar_url=https://therock-${{ env.RELEASE_TYPE }}-tarball.s3.amazonaws.com/' + quote('therock-dist-linux-${{ matrix.target_bundle.amdgpu_family }}${{ inputs.package_suffix }}-${{ needs.setup_metadata.outputs.version }}.tar.gz'))" >> ${GITHUB_OUTPUT}

- name: Trigger build JAX wheels
Contributor Author

@marbre I think this is the right place to stick the trigger for the nightly build. Is there a way to get the URL of the latest TheRock .tar release though? I didn't see a simple way to do it through this workflow.

Member

The base URL right now is always https://therock-nightly-tarball.s3.us-east-2.amazonaws.com/ (no CloudFront distribution planned). The ROCm version is available as ${{ needs.setup_metadata.outputs.version }} and the GPU family as ${{ matrix.target_bundle.amdgpu_family }}. Thus the URL should be

https://therock-nightly-tarball.s3.us-east-2.amazonaws.com/therock-dist-linux-${{ matrix.target_bundle.amdgpu_family }}-${{ needs.setup_metadata.outputs.version }}.tar.gz

Member

See https://github.com/ROCm/TheRock/blob/bb4372b9b502177915ade4d0d9f97397283212fd/.github/workflows/release_portable_linux_packages.yml#L117C53-L117C193 and maybe even better use

therock-dist-linux-${{ matrix.target_bundle.amdgpu_family }}${{ inputs.package_suffix }}-${{ needs.setup_metadata.outputs.version }}.tar.gz

for the filename to make it more robust.

Contributor Author

If I understand correctly, that's the tarball that will sit on the worker's local filesystem, correct? I'm not super familiar with that specific workflow dispatch action, but is it guaranteed to run the wheel build workflow on the same runner as the workflow that called it? If not, we should keep using the URL.

Member

@marbre marbre Sep 4, 2025

Oh, I was only referring to where the variable is composed. The tarball behind it is uploaded to S3 here: https://github.com/ROCm/TheRock/blob/bb4372b9b502177915ade4d0d9f97397283212fd/.github/workflows/release_portable_linux_packages.yml#L210

Wanted to point out that the file name can have the package_suffix in it if set. This does not apply to scheduled builds by default but can be the case for manually triggered builds.

Contributor Author

@charleshofer charleshofer Sep 4, 2025

Ahh, gotcha. Also, right now this points to

https://therock-nightly-tarball.s3.us-east-2.amazonaws.com/therock-dist-linux-${{ matrix.target_bundle.amdgpu_family }}${{ inputs.package_suffix }}-${{ needs.setup_metadata.outputs.version }}.tar.gz

Could we also have

https://therock-dev-tarball.s3.us-east-2.amazonaws.com/therock-dist-linux-${{ matrix.target_bundle.amdgpu_family }}${{ inputs.package_suffix }}-${{ needs.setup_metadata.outputs.version }}.tar.gz

for dev builds? (Notice the change in the first part of the URL)

Member

I used to encode this as a build/release type variable so that either URL can be created.
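
The URL composition discussed in this thread can be sketched in Python. This is an illustration under stated assumptions, not the workflow's code: `tarball_url` is a hypothetical helper, the sample GPU family and version strings are made up, and the bucket host follows the `therock-<release_type>-tarball.s3.amazonaws.com` form used in the "URL-encode .tar URL" step.

```python
from urllib.parse import quote


def tarball_url(release_type: str, amdgpu_family: str, version: str,
                package_suffix: str = "") -> str:
    """Build the S3 URL of a TheRock tarball (hypothetical helper)."""
    filename = (
        f"therock-dist-linux-{amdgpu_family}{package_suffix}-{version}.tar.gz"
    )
    # quote() percent-encodes characters such as "+" in the file name,
    # while leaving letters, digits, "-", "." and "_" untouched.
    return (
        f"https://therock-{release_type}-tarball.s3.amazonaws.com/"
        f"{quote(filename)}"
    )


# Sample values are made up for illustration.
print(tarball_url("nightly", "gfx94X-dcgpu", "7.0.0rc20250904"))
print(tarball_url("dev", "gfx94X-dcgpu", "7.0.0+dev1"))
```

Passing the release type as a parameter selects the nightly or dev bucket from a single template, and the optional suffix covers manually triggered builds that set `package_suffix`.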

if: ${{ github.repository_owner == 'ROCm' }}
uses: benc-uk/workflow-dispatch@e2e5e9a103e331dad343f381a29e654aea3cf8fc # v1.2.4
with:
workflow: build_linux_jax_wheels.yml
inputs: |
{ "amdgpu_family": "${{ matrix.target_bundle.amdgpu_family }}",
"python_versions": "3.12",
Member

Fixed Python version here? Should this use a matrix across versions somehow?

Contributor Author

The latest release of the JAX plugin supports Python 3.10, 3.11, and 3.12 for the wheel build, but we only use 3.10 and 3.12 in our Ubuntu docker image builds. 3.10 support is going to be dropped next release, so I just left it as 3.12.

"release_type": "${{ env.RELEASE_TYPE }}",
"s3_subdir": "${{ env.S3_STAGING_SUBDIR }}",
"rocm_version": "${{ needs.setup_metadata.outputs.version }}",
"tar_url": "${{ steps.url-encode-tar.outputs.tar_url }}"
}
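
Several commits in this PR fix hand-written JSON for this step ("Fix outputs JSON", "Remove quotes", "Remove extra comma"). As a sketch of one way to avoid that class of error, the inputs object could be serialized with `json.dumps` instead of being typed by hand. The helper name `dispatch_inputs` and all sample values below are hypothetical, and this is not how the workflow currently builds its inputs.

```python
import json


def dispatch_inputs(amdgpu_family: str, release_type: str, s3_subdir: str,
                    rocm_version: str, tar_url: str,
                    python_versions: str = "3.12") -> str:
    """Serialize the dispatch inputs as a JSON object (hypothetical helper)."""
    payload = {
        "amdgpu_family": amdgpu_family,
        "python_versions": python_versions,
        "release_type": release_type,
        "s3_subdir": s3_subdir,
        "rocm_version": rocm_version,
        "tar_url": tar_url,
    }
    # json.dumps handles quoting and separators, so there is no trailing
    # comma or unbalanced quote to fix afterwards.
    return json.dumps(payload)


# Sample values are made up for illustration.
print(dispatch_inputs(
    amdgpu_family="gfx94X-dcgpu",
    release_type="nightly",
    s3_subdir="example-subdir",
    rocm_version="7.0.0rc20250904",
    tar_url="https://example.invalid/therock.tar.gz",
))
```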

- name: Save cache
uses: actions/cache/save@0400d5f644dc74513175e3cd8d07132dd4860809 # v4.2.4
if: always()