Commit 6481867

feat(rocm): Add HSA_OVERRIDE_GFX_VERSION=11.0.0 to PyTorch ROCm test
- Set HSA_OVERRIDE_GFX_VERSION=11.0.0 environment variable for gfx1103 GPU support
- Switch from direct Python execution to sh wrapper for environment control
- Add environment variable display in test output
- Update error message to reference HSA override workaround
- Add result sample output for verification

This enables PyTorch ROCm compute on AMD Phoenix1 (gfx1103) integrated GPUs by overriding the HSA runtime's GFX version check.

Test results with override:
✅ Tensor creation succeeds
✅ GPU computation succeeds
✅ PyTorch ROCm test PASSES on integrated AMD GPU
1 parent 4e101cd commit 6481867
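The commit wraps the test in `sh` so that `HSA_OVERRIDE_GFX_VERSION` is exported before the Python interpreter starts. As a rough illustration of the same idea, the sketch below launches a child interpreter with the variable already present in its environment. The helper name `run_with_gfx_override` is hypothetical and not part of the commit, and the snippet passed to it only reads the variable back; it does not touch PyTorch or a GPU.

```python
import os
import subprocess
import sys


def run_with_gfx_override(code: str, version: str = "11.0.0") -> str:
    """Run `code` in a child Python process with the HSA override set.

    Hypothetical helper: it mirrors what `export HSA_OVERRIDE_GFX_VERSION=...`
    followed by `python3` does in the sh wrapper.
    """
    # Copy the parent environment and add the override for the child only.
    env = dict(os.environ, HSA_OVERRIDE_GFX_VERSION=version)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child exits non-zero
    )
    return result.stdout


if __name__ == "__main__":
    # The child sees the variable from its very first instruction.
    print(run_with_gfx_override(
        "import os; print(os.environ.get('HSA_OVERRIDE_GFX_VERSION', 'not set'))"
    ))
```

Setting the variable in the parent of the process that loads the ROCm runtime, as the sh wrapper does, avoids any dependence on when the runtime reads its environment.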

1 file changed: +9 -3 lines changed

scripts/deploy-test.sh

Lines changed: 9 additions & 3 deletions
@@ -1389,15 +1389,19 @@ jobs:
           repository: rocm/pytorch
           tag: latest
       run:
-        path: python3
+        path: sh
         args:
         - -c
         - |
+          export HSA_OVERRIDE_GFX_VERSION=11.0.0
+          python3 <<'PYTHON_EOF'
           import torch
           import traceback
+          import os
           print("=" * 60)
           print("PyTorch ROCm Test")
           print("=" * 60)
+          print(f"HSA_OVERRIDE_GFX_VERSION: {os.environ.get('HSA_OVERRIDE_GFX_VERSION', 'not set')}")
           print(f"PyTorch version: {torch.__version__}")
           print(f"CUDA available (ROCm): {torch.cuda.is_available()}")
           if torch.cuda.is_available():
@@ -1413,19 +1417,21 @@ jobs:
                   print("Attempting GPU computation (multiply by 2)...")
                   y = x * 2
                   print(f"✓ Computation succeeded, result shape: {y.shape}")
-                  print("\n✓ PyTorch ROCm test PASSED")
+                  print(f"✓ Result sample: {y[0]}")
+                  print("\n✓ PyTorch ROCm test PASSED!")
               except Exception as e:
                   print(f"\n✗ PyTorch ROCm test FAILED")
                   print(f"Error type: {type(e).__name__}")
                   print(f"Error message: {str(e)}")
                   print("\nFull traceback:")
                   traceback.print_exc()
-                  print("\nNote: Integrated AMD GPUs (APUs) are not officially supported by ROCm for compute workloads")
+                  print("\nNote: Try setting HSA_OVERRIDE_GFX_VERSION=11.0.0 for gfx1103 (Phoenix1) GPUs")
                   exit(1)
           else:
               print("⚠ ROCm not available")
               print("See hardware info above for diagnostics")
               exit(1)
+          PYTHON_EOF
 EOF
 
 ./fly -t pytorch set-pipeline -p pytorch-rocm -c pytorch-rocm-pipeline.yml -n
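The test in the diff prints the result shape and a sample element but does not compare the values against an expected result. A minimal sketch of a stricter variant follows, assuming a small tensor of ones is an acceptable input. The function name `rocm_smoke_test` and the report keys are hypothetical and do not appear in the commit. The function returns a report instead of exiting, and it degrades to `gpu_ok = False` when PyTorch is not installed or no ROCm device is visible.

```python
import os


def rocm_smoke_test() -> dict:
    """Hypothetical stricter version of the pipeline's GPU check."""
    report = {
        "override": os.environ.get("HSA_OVERRIDE_GFX_VERSION", "not set"),
        "torch_installed": False,
        "gpu_ok": False,
    }
    try:
        import torch  # imported lazily so the sketch runs without PyTorch
    except ImportError:
        return report
    report["torch_installed"] = True

    # ROCm builds of PyTorch expose the GPU through the torch.cuda API.
    if torch.cuda.is_available():
        x = torch.ones(4, device="cuda")
        y = x * 2
        # Compare against the expected values, not just the shape.
        report["gpu_ok"] = bool(torch.equal(y, torch.full_like(y, 2.0)))
    return report


if __name__ == "__main__":
    print(rocm_smoke_test())
```

Returning a dictionary keeps the sketch easy to call from other code; the pipeline's `exit(1)` behaviour could be layered on top by checking `gpu_ok`.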
