Misc. bug: ROCm has new DXG connector for WSL2, gives weird free memory amounts (probably not llama.cpp bug, but worth documenting) #23999

### Name and Version

Windows test case:
b9439 (22cadc194) (official GitHub release binary)
Adrenalin 26.5.2 and ROCm 7.2.4 on Windows 11 25H2 26200.7457

Linux test case:
b9439 (22cadc194) (compiled myself)
ROCm 7.2.4 on Debian 13, with librocdxg 8dd7ed

### Operating systems

Windows

### Which llama.cpp modules do you know to be affected?

llama-server, llama-cli

### Command line

```shell
llama-cli -hf <any model> -ngl all --fit on -lv 4
```

### Problem description & steps to reproduce

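ROCm reports different free memory amounts for the same GPU depending on whether llama.cpp runs natively on Windows or inside a WSL2 Linux VM that uses the new librocdxg (ROCDXG) connector.

To reproduce the Linux side, build llama.cpp against the new librocdxg: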
1. [Install ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/install-methods/package-manager-index.html) in your WSL2 Linux VM, following the linked instructions. For Debian/Ubuntu, this means adding the GPG key and apt repo, then `apt install rocm`; you do not need `amdgpu-dkms`.

2. Follow the [ROCm post-install instructions](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/post-install.html). For Debian/Ubuntu, you only need step 1: populate `/etc/ld.so.conf.d/rocm.conf`.

3. Install the [Windows SDK](https://learn.microsoft.com/en-us/windows/apps/windows-sdk/downloads) on the Windows host, to its default path.

4. To build [librocdxg](https://github.com/ROCm/librocdxg/), run:
```sh
git clone https://github.com/ROCm/librocdxg.git
cd librocdxg

# Adjust the version directory to match the Windows SDK you installed.
export win_sdk='/mnt/c/Program Files (x86)/Windows Kits/10/Include/10.0.28000.0'
mkdir -p build
cd build
cmake .. -DWIN_SDK="${win_sdk}/shared"
make
sudo make install
```
5. Until ROCm 7.13 comes out, `HSA_ENABLE_DXG_DETECTION=1` must be in your environment, so just `export` it now. Remember to `export` this in any new terminal, or add it to your `~/.bashrc`.

6. `rocminfo | grep Mar` (this matches the `Marketing Name` lines) should list your GPU(s).

7. Now build llama.cpp itself; set `-DGPU_TARGETS` for your GPU (`gfx1100` matches the 7900 XTX used here):
```sh
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
    cmake -S . -B build -DGGML_HIP=ON -DGPU_TARGETS=gfx1100 -DCMAKE_BUILD_TYPE=Release \
    && cmake --build build --config Release -- -j 16
```

8. Test llama.cpp with any model; it just needs to run:
```sh
llama-cli -hf unsloth/Qwen3.6-27B-MTP-GGUF:Q5_K_XL -ngl all --fit on -lv 4
```
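
For step 5, one way to avoid forgetting the variable in a new terminal is to append the `export` line to `~/.bashrc` only when it is not already there. This is a minimal sketch, not something from the ROCm docs: the helper name `persist_export` is made up for illustration, and it assumes your interactive shells read `~/.bashrc`.

```sh
# Hypothetical helper: append a line to a file unless an identical line exists.
# Arguments: file line
persist_export() {
    file="$1"
    line="$2"
    touch "$file"
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (commented out so pasting the block does not edit your files):
# persist_export "$HOME/.bashrc" 'export HSA_ENABLE_DXG_DETECTION=1'
```

Running it twice leaves a single copy of the line, so it should be safe to re-run after a rebuild.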

On my 7900 XTX, native Windows reports (total = free, in MiB):
`ROCm0 (RX 7900 XTX) | 24560 = 24136`
But ROCDXG'd Linux reports:
`ROCm0 (RX 7900 XTX) | 24517 = 21191`
That is 2945 MiB (~3 GB) less free memory.
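
To put a number on the gap, the `free` column can be pulled out of the two breakdown lines and subtracted. This is a small sketch that assumes the `total = free` layout shown above; `free_mib` is a made-up helper name, not part of llama.cpp.

```sh
# Hypothetical helper: print the "free" value (MiB) from a memory breakdown line,
# i.e. the number that follows "| <total> = ".
free_mib() {
    printf '%s\n' "$1" | sed -n 's/.*| *[0-9][0-9]* *= *\([0-9][0-9]*\).*/\1/p'
}

windows_line='ROCm0 (RX 7900 XTX) | 24560 = 24136'
linux_line='ROCm0 (RX 7900 XTX) | 24517 = 21191'

win_free="$(free_mib "$windows_line")"
lin_free="$(free_mib "$linux_line")"
echo "free memory gap: $((win_free - lin_free)) MiB"
```

With the two lines above this prints `free memory gap: 2945 MiB`.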

### First Bad Commit

N/A

### Relevant log output

<details>
<summary>Logs</summary>

Native Windows:

```console
0.01.999.610 I common_params_fit_impl: getting device memory data for initial parameters:
0.02.264.328 I common_memory_breakdown_print: | memory breakdown [MiB]  | total    free     self   model   context   compute    unaccounted |
0.02.264.333 I common_memory_breakdown_print: |   - ROCm0 (RX 7900 XTX) | 24560 = 24136 + (35602 = 18563 +   16533 +     505) +      -35179 |
0.02.264.333 I common_memory_breakdown_print: |   - Host                |                   1109 =   833 +       0 +     276                |
0.02.304.398 I common_params_fit_impl: projected to use 35602 MiB of device memory vs. 24136 MiB of free device memory
0.02.304.403 I common_params_fit_impl: cannot meet free memory target of 2160 MiB, need to reduce device memory by 13625 MiB
0.02.550.717 I common_memory_breakdown_print: | memory breakdown [MiB]  | total    free     self   model   context   compute    unaccounted |
0.02.550.721 I common_memory_breakdown_print: |   - ROCm0 (RX 7900 XTX) | 24560 = 24136 + (19478 = 18563 +     405 +     509) +      -19055 |
0.02.550.721 I common_memory_breakdown_print: |   - Host                |                    857 =   833 +       0 +      24                |
0.02.589.188 I common_params_fit_impl: context size reduced from 262144 to 44032 -> need 13628 MiB less memory in total
```

ROCDXG'd Linux:

```console
0.04.851.240 I common_params_fit_impl: getting device memory data for initial parameters:
0.05.403.254 I common_memory_breakdown_print: | memory breakdown [MiB]  | total    free     self   model   context   compute    unaccounted |
0.05.403.275 I common_memory_breakdown_print: |   - ROCm0 (RX 7900 XTX) | 24517 = 21191 + (35602 = 18563 +   16533 +     505) +      -32276 |
0.05.403.275 I common_memory_breakdown_print: |   - Host                |                   1109 =   833 +       0 +     276                |
0.05.433.198 I common_params_fit_impl: projected to use 35602 MiB of device memory vs. 21191 MiB of free device memory
0.05.433.219 I common_params_fit_impl: cannot meet free memory target of 2160 MiB, need to reduce device memory by 16571 MiB
0.05.966.005 I common_memory_breakdown_print: | memory breakdown [MiB]  | total    free     self   model   context   compute    unaccounted |
0.05.966.025 I common_memory_breakdown_print: |   - ROCm0 (RX 7900 XTX) | 24517 = 21191 + (19478 = 18563 +     405 +     509) +      -16152 |
0.05.966.026 I common_memory_breakdown_print: |   - Host                |                    857 =   833 +       0 +      24                |
0.05.996.940 I common_params_fit_impl: context size reduced from 262144 to 4096 -> need 16123 MiB less memory in total
```
</details>
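
As a rough cross-check of the `--fit` numbers in the logs, the requested reduction looks like projected use minus (free memory minus the free-memory target). This formula is inferred from the log lines, not taken from the llama.cpp source, so treat it as an assumption; `needed_reduction` is a made-up helper name.

```sh
# Hypothetical helper: MiB to shed so that (free - target) covers the projected use.
# Arguments: projected_mib free_mib target_mib
needed_reduction() {
    echo $(( $1 - ($2 - $3) ))
}

needed_reduction 35602 24136 2160   # native Windows
needed_reduction 35602 21191 2160   # ROCDXG'd Linux
```

The Linux result (16571) matches the logged value. The Windows result (13626) is 1 MiB above the logged 13625, presumably because the log prints values rounded to whole MiB.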
