|
1 | 1 | --- |
2 | 2 | name: rocm-doctor |
3 | 3 | description: >- |
4 | | - Diagnoses why ROCm, the HIP SDK, PyTorch, or llama.cpp fails on an AMD |
5 | | - GPU by matching symptoms against a closed catalog of known |
6 | | - misconfigurations on Linux and Windows, then either applies a low-risk |
7 | | - fix with consent or hands back the exact next step. Also routes |
8 | | - Lemonade, LM Studio, and Ollama users to the right upstream channel. |
9 | | - Use when the user says "ROCm/HIP isn't working", "torch.cuda.is_available() |
10 | | - is False on Radeon/Ryzen AI", "rocminfo can't find my GPU", |
11 | | - "hipInfo.exe can't see my Radeon", "amdhip64_6.dll could not be found", |
12 | | - "vcruntime140_1.dll missing", "HIP SDK installer left things broken", |
13 | | - "Adrenalin driver too old for the HIP SDK", "hipErrorNoBinaryForGpu", |
14 | | - "HSA_STATUS_ERROR_INVALID_ISA", "invalid device function", |
15 | | - "Unable to open /dev/kfd", "ROCk module is NOT loaded", |
16 | | - "libamdhip64.so cannot open shared object file", "ROCm wheel doesn't |
17 | | - see my gfx1151/gfx1150/gfx1103 (Strix Halo, Phoenix)", "iGPU/dGPU |
18 | | - collision", "multi-GPU hang on AMD"; or mentions HSA_OVERRIDE_GFX_VERSION, |
19 | | - HIP_VISIBLE_DEVICES, HIP_PATH, PYTORCH_ROCM_ARCH, render-group / /dev/kfd |
20 | | - permissions, amdgpu blacklist, Secure Boot, or asks where to file a |
21 | | - Lemonade / LM Studio / Ollama issue. Do NOT use for non-AMD GPUs, |
22 | | - fresh installs, performance tuning, or ROCm-on-WSL2. |
| 4 | + Diagnoses why ROCm, the HIP SDK, PyTorch, or llama.cpp is broken on an |
| 5 | + AMD GPU on Linux or Windows, and either applies a low-risk fix with |
| 6 | + consent or hands back the exact next step. Also routes Lemonade, LM |
| 7 | + Studio, and Ollama issues to the right upstream channel. Use when the |
| 8 | + user reports that ROCm or HIP isn't working, torch.cuda.is_available() |
| 9 | + is False on Ryzen AI, rocminfo or hipInfo can't see the GPU,
| 10 | + or hits hipErrorNoBinaryForGpu, |
| 11 | + HSA_STATUS_ERROR_INVALID_ISA, invalid device function, missing |
| 12 | + amdhip64_6.dll, vcruntime140_1.dll, or libamdhip64.so, cannot open |
| 13 | + /dev/kfd, ROCk module not loaded, an Adrenalin driver too old for the |
| 14 | + HIP SDK, or a ROCm wheel that doesn't recognize gfx1151, gfx1150, or |
| 15 | + gfx1103; or mentions HSA_OVERRIDE_GFX_VERSION, |
| 16 | + HIP_VISIBLE_DEVICES, PYTORCH_ROCM_ARCH, render-group permissions, |
| 17 | + amdgpu blacklist, Secure Boot, iGPU/dGPU collisions, or multi-GPU |
| 18 | + hangs. Do not use for non-AMD GPUs, performance |
| 19 | + tuning, or ROCm-on-WSL2. |
23 | 20 | --- |
24 | 21 |
|
25 | 22 | # ROCm Doctor |
@@ -62,7 +59,7 @@ Out of scope: |
62 | 59 | Adrenalin Pro plus the WSL kernel update on the Windows host -- those |
63 | 60 | failure modes are not in this catalog. `examine.py` detects WSL via |
64 | 61 | `/proc/version` and exits 2 with a route-out message; if the user wants |
65 | | - WSL specifically, point them at <https://rocm.docs.amd.com/projects/install-on-wsl/en/latest/>. |
| 62 | + WSL specifically, point them at <https://rocm.docs.amd.com/projects/radeon-ryzen/en/latest/docs/install/installryz/wsl/howto_wsl.html>. |
66 | 63 |
|
67 | 64 | ## Prerequisites |
68 | 65 |
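The out-of-scope hunk above says `examine.py` detects WSL via `/proc/version` and exits 2 with a route-out message. The following is a minimal sketch of what such a check could look like, not the actual contents of `examine.py`. The function names `is_wsl` and `wsl_exit_code` are hypothetical, and the sketch assumes that matching the substring `microsoft` in the kernel version string is enough to identify WSL.

```python
import sys
from pathlib import Path

# URL taken from the updated line in the diff above.
WSL_DOCS = (
    "https://rocm.docs.amd.com/projects/radeon-ryzen/en/latest/"
    "docs/install/installryz/wsl/howto_wsl.html"
)


def is_wsl(version_text: str) -> bool:
    """Return True if a /proc/version string looks like a WSL kernel.

    Assumption: WSL kernels carry "microsoft" in their version string.
    """
    return "microsoft" in version_text.lower()


def wsl_exit_code(version_path: str = "/proc/version") -> int:
    """Return 2 (route out) on WSL, 0 otherwise.

    An unreadable or missing file is treated as "not WSL", so the
    check is a no-op on native Windows.
    """
    try:
        text = Path(version_path).read_text(encoding="utf-8", errors="replace")
    except OSError:
        return 0
    if is_wsl(text):
        print(
            f"ROCm-on-WSL2 is out of scope for this catalog; see {WSL_DOCS}",
            file=sys.stderr,
        )
        return 2
    return 0
```

A caller would typically run `sys.exit(wsl_exit_code())` before any other probing, so that a WSL host is routed out before the catalog is consulted.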