[Issue]: generate.py: error: argument --target_gpus: expected at least one argument if only unsupported GPUs are selected #169

@ScottTodd

Problem Description

Summary

We're seeing build errors like https://github.com/ROCm/TheRock/actions/runs/25201744067/job/73894021744:

-- [AOTriton] Skipping triton due to AOTRITON_NOIMAGE_MODE
CMAKE_SOURCE_DIR /__w/TheRock/TheRock/external-builds/pytorch/pytorch/build/aotriton/src/aotriton_runtime
CMAKE_CURRENT_SOURCE_DIR /__w/TheRock/TheRock/external-builds/pytorch/pytorch/build/aotriton/src/aotriton_runtime/v3src
CMAKE_CURRENT_SOURCE_PARENT_DIR /__w/TheRock/TheRock/external-builds/pytorch/pytorch/build/aotriton/src/aotriton_runtime
CMAKE_CURRENT_LIST_DIR /__w/TheRock/TheRock/external-builds/pytorch/pytorch/build/aotriton/src/aotriton_runtime/v3src
CMAKE_CURRENT_BINARY_DIR /__w/TheRock/TheRock/external-builds/pytorch/pytorch/build/aotriton/src/aotriton_runtime-build/v3src
-- AOTRITON_TARGET_ARCH gfx900
-- AOTRITON_OVERRIDE_TARGET_GPUS 
-- EFFECTIVE_TARGET_GPUS 
AOTRITON_COMPILER /__w/TheRock/TheRock/external-builds/pytorch/pytorch/build/aotriton/src/aotriton_runtime/v3python/compile.py
usage: generate.py [-h]
'/opt/_internal/cpython-3.12.10/lib/python3.12/site-packages/cmake/data/bin/cmake' '-E' 'env' 'VIRTUAL_ENV=/__w/TheRock/TheRock/external-builds/pytorch/pytorch/build/aotriton/src/aotriton_runtime-build/venv' 'AOTRITON_ENABLE_FP32=1' '/__w/TheRock/TheRock/external-builds/pytorch/pytorch/build/aotriton/src/aotriton_runtime-build/venv/bin/python' '-X' 'utf8' '-m' 'v3python.generate' '--target_gpus' '--build_dir' '/__w/TheRock/TheRock/external-builds/pytorch/pytorch/build/aotriton/src/aotriton_runtime-build/v3src' '--noimage_mode'
                   [--target_gpus {gfx90a_mod0,gfx942_mod0,gfx950_mod0,gfx1100_mod0,gfx1101_mod0,gfx1102_mod0,gfx1151_mod0,gfx1150_mod0,gfx1201_mod0,gfx1200_mod0,gfx1250_mod0} [{gfx90a_mod0,gfx942_mod0,gfx950_mod0,gfx1100_mod0,gfx1101_mod0,gfx1102_mod0,gfx1151_mod0,gfx1150_mod0,gfx1201_mod0,gfx1200_mod0,gfx1250_mod0} ...]]
                   [--build_dir BUILD_DIR] [--root_dir ROOT_DIR]
                   [--archive_only] [--library_suffix LIBRARY_SUFFIX]
                   [--noimage_mode] [--build_for_tuning]
                   [--build_for_tuning_second_pass]
                   [--build_for_tuning_but_skip_kernel [BUILD_FOR_TUNING_BUT_SKIP_KERNEL ...]]
                   [--verbose] [--lut_sanity_check]
generate.py: error: argument --target_gpus: expected at least one argument
CMake Error at v3src/CMakeLists.txt:71 (execute_process):
  execute_process failed command indexes:

    1: "Child return code: 2"
-- Configuring incomplete, errors occurred!

Context

In https://github.com/ROCm/TheRock we build pytorch using https://github.com/ROCm/TheRock/blob/main/external-builds/pytorch/build_prod_wheels.py in a few configurations:

  • Per-family releases: separate builds for PYTORCH_ROCM_ARCH=gfx900, PYTORCH_ROCM_ARCH=gfx1151, PYTORCH_ROCM_ARCH=gfx942, etc.
  • Multi-arch releases (new): one single build for PYTORCH_ROCM_ARCH=gfx1100;gfx1101;gfx1102;gfx1103;gfx1151;gfx1200;gfx1201;gfx900;gfx906;gfx908;gfx90a;gfx1010;gfx1011;gfx1012;gfx1030;gfx1031;gfx1032;gfx1033;gfx1034;gfx1035;gfx1036;gfx1150;gfx1152;gfx1153

See also:

Analysis and remediation

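If the list contains at least one supported architecture, target filtering and fallback behavior appear to work. If it contains only unsupported architectures, we hit the above error: in the log, AOTRITON_TARGET_ARCH is gfx900, EFFECTIVE_TARGET_GPUS is empty, and generate.py is invoked with --target_gpus followed by no values.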

For per-family releases in TheRock we solved this by disabling flash attention and aotriton entirely if any unsupported architecture was included. For multi-arch releases I'm going to invert this and disable flash attention and aotriton only if all architectures are unsupported, now that we see how the filtering and fallback code paths work.
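To make the two policies concrete, here is a minimal sketch of the downstream check. It is not the code in build_prod_wheels.py: `AOTRITON_SUPPORTED_ARCHS`, `split_archs`, and `should_enable_aotriton` are hypothetical names, and the supported set is copied from the `--target_gpus` choices in the usage message above (with the `_mod0` suffix dropped), so it will drift as aotriton adds targets.

```python
# Hypothetical sketch of the downstream filter; names are illustrative only.
# Assumption: the supported set mirrors the --target_gpus choices printed in
# the usage message from the failing build, minus the "_mod0" suffix.
AOTRITON_SUPPORTED_ARCHS = frozenset({
    "gfx90a", "gfx942", "gfx950",
    "gfx1100", "gfx1101", "gfx1102",
    "gfx1150", "gfx1151",
    "gfx1200", "gfx1201", "gfx1250",
})


def split_archs(pytorch_rocm_arch: str) -> list[str]:
    """Split a semicolon-separated PYTORCH_ROCM_ARCH value into arch names."""
    return [a.strip() for a in pytorch_rocm_arch.split(";") if a.strip()]


def should_enable_aotriton(pytorch_rocm_arch: str, *, require_all: bool = False) -> bool:
    """Decide whether to build with flash attention and aotriton.

    require_all=True  -> per-family policy: disable if ANY arch is unsupported.
    require_all=False -> multi-arch policy: disable only if ALL are unsupported.
    """
    archs = split_archs(pytorch_rocm_arch)
    if not archs:
        return False
    supported = [a in AOTRITON_SUPPORTED_ARCHS for a in archs]
    return all(supported) if require_all else any(supported)
```

With this sketch, `should_enable_aotriton("gfx900")` is false under both policies, while a mixed list such as `"gfx900;gfx1151"` is enabled under the multi-arch policy and disabled under the per-family one.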

If aotriton is patched so that generate.py does not error for target lists containing only unsupported architectures (and instead produces a library where check_gpu() always returns failure), then we can remove our downstream filtering in TheRock entirely.
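The usage text `--target_gpus {…} [{…} ...]` and the message "expected at least one argument" match what Python's argparse prints for an option declared with `nargs="+"`. Assuming that is how generate.py declares it (I have not confirmed this against the aotriton source), the following standalone sketch reproduces the exit code 2 and shows that `nargs="*"` would accept the empty list. The parser here is a cut-down stand-in, not the real one, and `TARGETS` is shortened.

```python
import argparse

# Shortened stand-in for the real choices list; illustrative only.
TARGETS = ["gfx90a_mod0", "gfx942_mod0", "gfx1151_mod0"]

# Same shape as the failing invocation: --target_gpus with no values.
ARGV = ["--target_gpus", "--build_dir", "v3src", "--noimage_mode"]


def make_parser(nargs: str) -> argparse.ArgumentParser:
    """Build a minimal parser resembling generate.py's CLI (assumed shape)."""
    p = argparse.ArgumentParser(prog="generate.py")
    p.add_argument("--target_gpus", nargs=nargs, choices=TARGETS, default=[])
    p.add_argument("--build_dir")
    p.add_argument("--noimage_mode", action="store_true")
    return p


def exit_code(nargs: str) -> int:
    """Return the exit code argparse would produce for ARGV."""
    try:
        make_parser(nargs).parse_args(ARGV)
    except SystemExit as e:  # argparse calls sys.exit(2) on usage errors
        return int(e.code)
    return 0


if __name__ == "__main__":
    print("nargs='+':", exit_code("+"))
    print("nargs='*':", exit_code("*"))
    print("parsed:", make_parser("*").parse_args(ARGV).target_gpus)
```

Accepting an empty list only moves the problem downstream: the rest of generate.py and the runtime would still need to handle "no targets" gracefully, which is the `check_gpu()` behavior requested above.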

Operating System

Linux and Windows

Steps to Reproduce

Build pytorch with USE_FLASH_ATTENTION=ON and PYTORCH_ROCM_ARCH=gfx900 (or any set of archs with no aotriton support).
