
feat: upgrade to ROCm 7.2.2 / PyTorch 2.10.0 with native RDNA4 support #6

Open
brosequist wants to merge 2 commits into corundex:main from brosequist:rocm-7.2.2-pytorch-2.10.0

@brosequist

Summary

Upgrades the base image from ROCm 6.x to ROCm 7.2.2 / PyTorch 2.10.0, adding native RDNA4 support and resolving several dependency issues.

This supersedes PR #5 (thank you @Griznah for the ROCm 7.x groundwork!), extending it further to 7.2.2 + PyTorch 2.10.0.

Changes

Base image

  • rocm/pytorch:rocm7.2.2_ubuntu24.04_py3.12_pytorch_release_2.10.0
  • Python 3.12, PyTorch 2.10.0, ROCm 7.2.2

torchaudio — now works out of the box

  • torchaudio 2.10.0+rocm7.2.2 ships pre-built in the AMD base image
  • No separate install, index-url workaround, or try/except guard needed
  • Custom nodes that use torchaudio (audio/video pipelines) just work

onnxruntime

  • AMD renamed onnxruntime_rocm to onnxruntime_migraphx starting in ROCm 7.x
  • Install via official AMD wheel from repo.radeon.com
  • Provides the same onnxruntime Python namespace (drop-in replacement)
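
A minimal sketch of how the drop-in claim could be checked inside the container. It assumes the wheel registers the provider under the name MIGraphXExecutionProvider (ONNX Runtime's usual name for the MIGraphX backend); the helper name migraphx_available is hypothetical and not part of this PR.

```python
def migraphx_available():
    """Return True/False for MIGraphX provider presence, or None if onnxruntime is absent."""
    try:
        # The renamed wheel is expected to expose the same 'onnxruntime' namespace.
        import onnxruntime as ort
    except ImportError:
        return None
    # Assumption: the provider is registered under this name.
    return "MIGraphXExecutionProvider" in ort.get_available_providers()


if __name__ == "__main__":
    print("MIGraphX provider available:", migraphx_available())
```

A result of None means the package is not importable at all, which is a different failure from the provider being absent.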

RDNA4 native support (RX 9000 series)

  • ROCm 7.2.2 adds compiled HIP kernels for gfx1200 (RX 9060 XT) and gfx1201 (RX 9070 XT)
  • HSA_OVERRIDE_GFX_VERSION workaround is no longer needed
  • Tested hardware: RX 9060 XT (gfx1200), RX 9070 XT (gfx1201), RX 7900 XTX (gfx1100)
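
One way to confirm that a card runs natively rather than through the override is to compare the reported architecture with the environment. The sketch below is illustrative only: it assumes the architecture string comes from torch.cuda.get_device_properties(0).gcnArchName on a ROCm build, that it may carry feature suffixes after a colon, and the helper names are hypothetical.

```python
import os

# Architectures this PR describes as natively supported RDNA4 targets.
RDNA4_ARCHES = {"gfx1200", "gfx1201"}


def base_arch(gcn_arch_name):
    """Strip optional feature suffixes, e.g. 'gfx1201:xnack-' -> 'gfx1201'."""
    return gcn_arch_name.split(":", 1)[0]


def runs_natively(gcn_arch_name, env=None):
    """True if the arch is an RDNA4 target and no GFX override is set."""
    env = os.environ if env is None else env
    return (
        base_arch(gcn_arch_name) in RDNA4_ARCHES
        and "HSA_OVERRIDE_GFX_VERSION" not in env
    )
```

In the container this could be called as runs_natively(torch.cuda.get_device_properties(0).gcnArchName).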

numpy

  • PyTorch 2.10.0 is numpy 2.x compatible — removed the numpy<2 pin

requirements_rocm.txt

  • Added requests (needed by ComfyUI and several custom nodes)
  • Added piexif, soundfile
  • Pinned transformers==5.6.1, tokenizers==0.22.2 for ComfyUI compatibility
  • Reorganized with clearer section comments

ComfyUI version

  • Pinned to v0.19.3

Test plan

  • Docker build completes successfully on ROCm 7.2.2 base image
  • torch.cuda.is_available() returns True in the built container
  • torchaudio.__version__ reports 2.10.0+rocm7.2.2.git5047768f without any install step
  • torchvision imports cleanly
  • ComfyUI starts and serves on port 8188
  • Tested on AMD RX 9070 XT (gfx1201/RDNA4) — no HSA_OVERRIDE_GFX_VERSION needed
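
The import and GPU checks in this test plan could be scripted roughly as follows. This is a sketch, not the procedure actually used for the PR; probe_modules and gpu_visible are hypothetical helper names.

```python
import importlib


def probe_modules(names):
    """Import each module; map name -> version string, or None if the import fails."""
    results = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:  # ImportError, or a failure while loading native libraries
            results[name] = None
        else:
            results[name] = str(getattr(module, "__version__", "unknown"))
    return results


def gpu_visible():
    """True when PyTorch reports a usable GPU; False if torch is missing or sees none."""
    try:
        import torch
        return bool(torch.cuda.is_available())
    except Exception:
        return False


if __name__ == "__main__":
    for name, version in probe_modules(["torch", "torchaudio", "torchvision"]).items():
        print(f"{name}: {version if version is not None else 'IMPORT FAILED'}")
    print("GPU visible:", gpu_visible())
```

Running it with the container's Python interpreter should print one line per package plus the GPU status.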

🤖 Generated with Claude Code

- Base image: rocm/pytorch:rocm7.2.2_ubuntu24.04_py3.12_pytorch_release_2.10.0
- ComfyUI: v0.19.3
- torchaudio 2.10.0+rocm7.2.2 ships pre-built in base image — no separate
  install or workaround needed
- Replace onnxruntime_rocm with onnxruntime_migraphx (AMD renamed package in
  ROCm 7.x, same 'onnxruntime' Python namespace, provides MIGraphX backend)
- RDNA4 (gfx1200/gfx1201) supported natively — no HSA_OVERRIDE_GFX_VERSION
  workaround required
- numpy 2.x compatible with PyTorch 2.10.0 — removed numpy<2 pin
- Update requirements: add requests, piexif, soundfile; pin transformers==5.6.1
  and tokenizers==0.22.2 for ComfyUI compatibility
- Update build.sh detailed tag format, README version table, hardware table,
  and troubleshooting guide
- Credit Griznah (Ole-Magnus Sæther) for ROCm 7.x update groundwork (PR corundex#5)

Supersedes PR corundex#5 — extends Griznah's ROCm 7.1.1 base to 7.2.2 + PyTorch 2.10.0
@brosequist
Author

Additional findings from production deployment (k3s + dual RDNA4 GPUs)

Sharing a few things discovered while deploying this image in a k3s cluster with both RX 9070 XT (gfx1201) + RX 9060 XT (gfx1200):

Dual-GPU setup

Set HIP_VISIBLE_DEVICES=0,1 to expose both cards. Useful for GPU-to-GPU offloading when VRAM is tight — some custom nodes (e.g. WanVideoWrapper) let you explicitly pick cuda:1 as the offload device.

Kubernetes resource claim: amd.com/gpu: "2" with the amdgpu-device-plugin.
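
To check that both cards are actually exposed to PyTorch inside the pod, a short enumeration like the one below may help. It is a sketch under the assumption that torch is importable; list_gpus is a hypothetical helper name.

```python
def list_gpus():
    """Return [(device_string, device_name), ...] for every GPU PyTorch can see."""
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    # ROCm builds of PyTorch expose AMD GPUs through the torch.cuda API.
    return [
        (f"cuda:{index}", torch.cuda.get_device_name(index))
        for index in range(torch.cuda.device_count())
    ]


if __name__ == "__main__":
    for device, name in list_gpus():
        print(device, name)
```

With HIP_VISIBLE_DEVICES=0,1 and two cards present, two entries would be expected, cuda:0 and cuda:1.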

WanVideoWrapper multitalk import ordering patch

With torchaudio shipping pre-built in the base image, ComfyUI-WanVideoWrapper's nodes_sampler.py hits an import ordering bug. The line from .multitalk.multitalk_loop import multitalk_loop appears at module level, before the try/except MULTITALK_AVAILABLE block, so it raises an import error on startup instead of being handled by that block.

Workaround applied in our startup script:

SAMPLER=custom_nodes/ComfyUI-WanVideoWrapper/nodes_sampler.py
if [ -f "$SAMPLER" ] && ! grep -q MULTITALK_AVAILABLE "$SAMPLER"; then
  sed -i 's/^        MULTITALK_AVAILABLE = True$/    from .multitalk.multitalk_loop import multitalk_loop\n    MULTITALK_AVAILABLE = True/' "$SAMPLER"
  sed -i '/^from \.multitalk\.multitalk_loop import multitalk_loop$/d' "$SAMPLER"
fi

This moves the import inside the try block where it belongs. A fix upstream in WanVideoWrapper would be the proper solution.
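
For readers unfamiliar with the pattern, the intended end state is an optional import guarded by a try block that sets an availability flag. The sketch below illustrates the general pattern only; it is not WanVideoWrapper's actual code, and optional_import is a hypothetical helper.

```python
import importlib


def optional_import(module_name):
    """Return (module, True) if the import succeeds, else (None, False)."""
    try:
        # The import lives inside the try block, so a missing or broken
        # optional dependency sets the flag instead of aborting startup.
        return importlib.import_module(module_name), True
    except ImportError:
        return None, False


# Hypothetical usage mirroring the flag described above.
multitalk, MULTITALK_AVAILABLE = optional_import("multitalk_placeholder_module")
```

Code that depends on the optional feature can then branch on MULTITALK_AVAILABLE rather than failing at import time.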

ComfyUI version string patch

The release/vX.Y.Z branches ship with a stale version string in comfyui_version.py and pyproject.toml (points to an older version). ComfyUI-Manager compares this to the latest release and shows "upgrade available" on every restart even when you're on the latest.

If you check out the latest release branch at startup (rather than pinning in the image), patch the version strings to match:

VER=$(echo "$LATEST_REL" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+')
sed -i "s/__version__ = \".*\"/__version__ = \"${VER}\"/" comfyui_version.py
sed -i "s/^version = \".*\"/version = \"${VER}\"/" pyproject.toml
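
The same substitutions can be expressed in Python, which makes them easier to test before they touch real files. This is a sketch that mirrors the regular expressions in the shell snippet; the function names are hypothetical, and the functions operate on file contents passed in as strings.

```python
import re


def extract_version(tag):
    """Pull the first X.Y.Z version out of a branch or tag name, or return None."""
    match = re.search(r"[0-9]+\.[0-9]+\.[0-9]+", tag)
    return match.group(0) if match else None


def patch_version_py(text, version):
    """Rewrite the __version__ assignment in comfyui_version.py contents."""
    return re.sub(r'__version__ = ".*"', lambda _: f'__version__ = "{version}"', text)


def patch_pyproject(text, version):
    """Rewrite the top-level 'version = ...' line in pyproject.toml contents."""
    return re.sub(
        r'^version = ".*"',
        lambda _: f'version = "{version}"',
        text,
        flags=re.MULTILINE,
    )
```

For example, extract_version("release/v0.19.3") yields "0.19.3", which can then be passed to both patch functions.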

onnxruntime_migraphx — gfx1200/gfx1201 works

Confirmed onnxruntime_migraphx works correctly on RDNA4 hardware without any additional configuration.

--lowvram recommended for stability

This applies especially to Wan 2.2 video generation on RDNA4: without --lowvram we saw occasional HIP segfaults. The flag significantly improves stability with large models.

@SWZ128

SWZ128 commented May 4, 2026

Hey, thanks for the PR — I gave this a try and ran into a startup issue.
Right after building and docker compose up, the container keeps crashing with:

ModuleNotFoundError: No module named 'comfy_aimdo'

docker compose up log
comfyui-rocm  | WARNING:root:WARNING: blake3 package not installed
comfyui-rocm  | Traceback (most recent call last):
comfyui-rocm  |   File "/workspace/ComfyUI/main.py", line 34, in <module>
comfyui-rocm  |     import comfy_aimdo.control
comfyui-rocm  | ModuleNotFoundError: No module named 'comfy_aimdo'

Tracked it down to requirements_rocm.txt — it's missing a few packages that ComfyUI v0.19.3 now imports at module level in main.py:

import comfy_aimdo.control

https://github.com/Comfy-Org/ComfyUI/blob/3086026401180c9216bcb6ace442a4e3587d2c66/main.py#L34

https://github.com/Comfy-Org/ComfyUI/blob/3086026401180c9216bcb6ace442a4e3587d2c66/requirements.txt#L26

The upstream requirements.txt includes comfy-aimdo>=0.2.12 (and comfy-kitchen and blake3), but they didn't make it into the ROCm version of the file. Adding them back fixes the crash:

 pydantic~=2.0
 pydantic-settings~=2.0
 
+comfy-aimdo>=0.2.12
+comfy-kitchen>=0.2.8
+simpleeval>=1.0.0
+blake3
+filelock
+
 # Image / video / audio
 kornia>=0.7.1

After adding those, another warning shows up: PyOpenGL and glfw are also missing, which breaks comfy_extras/nodes_glsl.py. The upstream requirements.txt lists these too, under "non essential dependencies".

After first fix — OpenGL warning
[+] up 1/1
 ✔ Container comfyui-rocm Recreated  0.9s
Attaching to comfyui-rocm
comfyui-rocm  | [ComfyUI] Model downloader starting (MODEL_DOWNLOAD=default)
comfyui-rocm  | [ComfyUI] Downloading default models (1 total)...
comfyui-rocm  | [ComfyUI] Stable Diffusion 1.5 already exists, skipping
comfyui-rocm  | [ComfyUI] default downloads completed: 1/1 successful
comfyui-rocm  | [ComfyUI] Model downloader completed successfully
comfyui-rocm  | [ComfyUI] Starting ComfyUI on port 8188...
comfyui-rocm  | Found comfy_kitchen backend eager: {'available': True, 'disabled': False, 'unavailable_reason': None, 'capabilities': ['apply_rope', 'apply_rope1', 'dequantize_mxfp8', 'dequantize_nvfp4', 'dequantize_per_tensor_fp8', 'quantize_mxfp8', 'quantize_nvfp4', 'quantize_per_tensor_fp8', 'scaled_mm_mxfp8', 'scaled_mm_nvfp4']}
comfyui-rocm  | Found comfy_kitchen backend triton: {'available': True, 'disabled': True, 'unavailable_reason': None, 'capabilities': ['apply_rope', 'apply_rope1', 'dequantize_nvfp4', 'dequantize_per_tensor_fp8', 'quantize_mxfp8', 'quantize_nvfp4', 'quantize_per_tensor_fp8']}
comfyui-rocm  | Found comfy_kitchen backend cuda: {'available': True, 'disabled': True, 'unavailable_reason': None, 'capabilities': ['apply_rope', 'apply_rope1', 'dequantize_nvfp4', 'dequantize_per_tensor_fp8', 'quantize_mxfp8', 'quantize_nvfp4', 'quantize_per_tensor_fp8']}
comfyui-rocm  | Checkpoint files will always be loaded safely.
comfyui-rocm  | Total VRAM 16368 MB, total RAM 31680 MB
comfyui-rocm  | pytorch version: 2.10.0+rocm7.2.2.git40d237bf
comfyui-rocm  | Set: torch.backends.cudnn.enabled = False for better AMD performance.
comfyui-rocm  | AMD arch: gfx1101
comfyui-rocm  | ROCm version: (7, 2)
comfyui-rocm  | Set vram state to: NORMAL_VRAM
comfyui-rocm  | Device: cuda:0 AMD Radeon RX 7800 XT : native
comfyui-rocm  | Using async weight offloading with 2 streams
comfyui-rocm  | Enabled pinned memory 28511.0
comfyui-rocm  | Using pytorch attention
comfyui-rocm  | Python version: 3.12.3 (main, Mar  3 2026, 12:15:18) [GCC 13.3.0]
comfyui-rocm  | ComfyUI version: 0.19.3
comfyui-rocm  | comfy-aimdo version: 0.3.0
comfyui-rocm  | comfy-kitchen version: 0.2.8
comfyui-rocm  | ComfyUI frontend version: 1.43.1
comfyui-rocm  | [Prompt Server] web root: /opt/venv/lib/python3.12/site-packages/comfyui_frontend_package/static
comfyui-rocm  | Asset seeder disabled
comfyui-rocm  | Traceback (most recent call last):
comfyui-rocm  |   File "/workspace/ComfyUI/nodes.py", line 2227, in load_custom_node
comfyui-rocm  |     module_spec.loader.exec_module(module)
comfyui-rocm  |   File "<frozen importlib._bootstrap_external>", line 995, in exec_module
comfyui-rocm  |   File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
comfyui-rocm  |   File "/workspace/ComfyUI/comfy_extras/nodes_glsl.py", line 64, in <module>
comfyui-rocm  |     _check_opengl_availability()
comfyui-rocm  |   File "/workspace/ComfyUI/comfy_extras/nodes_glsl.py", line 35, in _check_opengl_availability
comfyui-rocm  |     raise RuntimeError(
comfyui-rocm  | RuntimeError: OpenGL dependencies not available.
comfyui-rocm  | Please install the updated requirements.txt file by running:
comfyui-rocm  | /opt/venv/bin/python -m pip install -r /workspace/ComfyUI/requirements.txt
comfyui-rocm  | If you are on the portable package you can run: update\update_comfyui.bat to solve this problem.
comfyui-rocm  | 
comfyui-rocm  | 
comfyui-rocm  | Cannot import /workspace/ComfyUI/comfy_extras/nodes_glsl.py module for custom nodes: OpenGL dependencies not available.
comfyui-rocm  | Please install the updated requirements.txt file by running:
comfyui-rocm  | /opt/venv/bin/python -m pip install -r /workspace/ComfyUI/requirements.txt
comfyui-rocm  | If you are on the portable package you can run: update\update_comfyui.bat to solve this problem.
comfyui-rocm  | 
comfyui-rocm  | WARNING: some comfy_extras/ nodes did not import correctly. This may be because they are missing some dependencies.
comfyui-rocm  | 
comfyui-rocm  | IMPORT FAILED: nodes_glsl.py
comfyui-rocm  | 
comfyui-rocm  | This issue might be caused by new missing dependencies added the last time you updated ComfyUI.
comfyui-rocm  | Please do a: pip install -r requirements.txt
comfyui-rocm  | 
comfyui-rocm  | Context impl SQLiteImpl.
comfyui-rocm  | Will assume non-transactional DDL.
comfyui-rocm  | Context impl SQLiteImpl.
comfyui-rocm  | Will assume non-transactional DDL.
comfyui-rocm  | Running upgrade  -> 0001_assets, Initial assets schema
comfyui-rocm  | Revision ID: 0001_assets
comfyui-rocm  | Revises: None
comfyui-rocm  | Create Date: 2025-12-10 00:00:00
comfyui-rocm  | Running upgrade 0001_assets -> 0002_merge_to_asset_references, Merge AssetInfo and AssetCacheState into unified asset_references table.
comfyui-rocm  | Running upgrade 0002_merge_to_asset_references -> 0003_add_metadata_job_id, Add system_metadata and job_id columns to asset_references.
comfyui-rocm  | Change preview_id FK from assets.id to asset_references.id.
comfyui-rocm  | Database upgraded from None to 0003_add_metadata_job_id
comfyui-rocm  | Starting server
comfyui-rocm  | 
comfyui-rocm  | To see the GUI go to: http://0.0.0.0:8188

Here's the complete fix for requirements_rocm.txt:

+comfy-aimdo>=0.2.12
+comfy-kitchen>=0.2.8
+simpleeval>=1.0.0
+blake3
+filelock
+
+# OpenGL
+PyOpenGL
+glfw
+
 # Image / video / audio
 kornia>=0.7.1

With these additions everything starts cleanly. Tested on RX 7800 XT (gfx1101).
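
Drift like this between the upstream requirements.txt and requirements_rocm.txt could be caught with a small comparison script. The sketch below is an assumption-laden illustration: it handles only simple one-requirement-per-line files, the function names are hypothetical, and the ignore list is for packages the base image already provides (for example torch).

```python
import re


def normalize(name):
    """Normalize a package name the way PEP 503 does (case and -_. runs)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_names(text):
    """Collect normalized package names from requirements-file contents."""
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(normalize(match.group(0)))
    return names


def missing_requirements(upstream_text, local_text, ignore=()):
    """Packages named upstream but absent locally, minus the ignore list."""
    ignored = {normalize(name) for name in ignore}
    return sorted(requirement_names(upstream_text) - requirement_names(local_text) - ignored)
```

Feeding it the contents of both files prints nothing useful by itself, but a non-empty return value is a cheap signal to fail the image build before a container crashes at startup.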

ComfyUI v0.19.3 imports `comfy_aimdo` and friends at module level in
main.py. The upstream requirements.txt lists these, but they were
missing from the ROCm-specific requirements_rocm.txt — so containers
built from this PR crash on startup with:

    ModuleNotFoundError: No module named 'comfy_aimdo'

Add the missing packages (comfy-aimdo, comfy-kitchen, simpleeval,
blake3, filelock) to match upstream requirements.txt. Reported by
@SWZ128 in PR corundex#6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@brosequist
Author

Thanks for catching this @SWZ128 — pushed 3d2ea55 adding comfy-aimdo, comfy-kitchen, simpleeval, blake3, and filelock to requirements_rocm.txt to match upstream. Should fix the startup crash. Appreciate the diagnosis with the line links.
