Misc. bug: ggml-cuda: restore prop.integrated for HIP builds (#16308 hardcode breaks iGPU classification and supports_buft for AMD APUs) #23977

### Name and Version

version: b9453

### Operating systems

Windows, Linux

### Which llama.cpp modules do you know to be affected?

CUDA/HIP module

### Command line

N/A

### Problem description & steps to reproduce

## Problem

  PR #16308 hardcoded `info.devices[id].integrated = false` for all CUDA/HIP devices to work around corrupted output on Nvidia Jetson Orin Nano (#15034). This is correct for CUDA builds, but it has two unintended side effects for HIP/ROCm builds:

  ### 1. `supports_buft()` reads stale `false` for AMD APU iGPUs (systems that also have a dGPU)

  `ggml_backend_cuda_device_supports_buft()` (line 5437) reads from `ggml_cuda_info().devices[dev_ctx->device].integrated`, which is always `false` due to the hardcode from PR #16308. This means AMD APU iGPUs never report host buffer support, forcing the discrete-GPU allocation path even on UMA hardware.

  PR #23007 fixed `get_type()` by querying `prop.integrated` directly from `hipGetDeviceProperties()`. This bypassed the cached field, so device classification is now correct. But `supports_buft()` was not updated and still reads `false`. This leaves two sources of truth in the same file that can contradict each other for the same device.
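  To make the contradiction concrete, here is a minimal sketch using mock types. Everything prefixed `Mock`, along with `init_cached_info`, `supports_host_buft`, and `sources_disagree`, is a hypothetical stand-in and not the real ggml-cuda or HIP API. The host-buffer check is also simplified to the single flag discussed here.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not the real ggml or HIP types.
struct MockDeviceProp {
    int integrated; // 1 for an APU iGPU, 0 for a discrete GPU
};

struct MockCachedInfo {
    bool integrated; // filled once at init time
};

enum class MockDeviceType { GPU, IGPU };

// Init-time cache as it behaves after PR #16308: the runtime value is ignored.
inline MockCachedInfo init_cached_info(const MockDeviceProp & /*prop*/) {
    return MockCachedInfo{false};
}

// Classification that queries the property directly (the PR #23007 approach).
inline MockDeviceType get_type(const MockDeviceProp & prop) {
    return prop.integrated ? MockDeviceType::IGPU : MockDeviceType::GPU;
}

// Host-buffer decision that still reads the cached field (simplified).
inline bool supports_host_buft(const MockCachedInfo & info) {
    return info.integrated;
}

// True when the two code paths disagree about the same device.
inline bool sources_disagree(const MockDeviceProp & prop) {
    const MockCachedInfo info = init_cached_info(prop);
    return (get_type(prop) == MockDeviceType::IGPU) != supports_host_buft(info);
}

// Small self-check: only the iGPU case exposes the mismatch.
inline void demo() {
    assert(sources_disagree(MockDeviceProp{1}));  // iGPU: classified IGPU, no host buffers
    assert(!sources_disagree(MockDeviceProp{0})); // dGPU: both paths agree
}
```

  In this model the discrete GPU is unaffected. Only a device that reports `integrated = 1` ends up classified one way and allocated another.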

  ### 2. Impact on APU-only systems (no dGPU)

  On a system with only an AMD APU (no discrete GPU), the iGPU is the only compute device. After PR #23007 it is correctly classified as `GGML_BACKEND_DEVICE_TYPE_IGPU` and added to the device list. But `supports_buft()` still returns `false` for host buffers, forcing an incorrect allocation strategy on a UMA device.

  ## Root Cause & Proposed Change

  `hipDeviceProp_t` has an `integrated` field (`int integrated; ///< APU vs dGPU`) that is correctly set to `1` for AMD APU iGPUs. The Jetson Orin Nano corruption (#15034) is CUDA-specific: it stems from a bug in the UMA host-buffer allocation path on that device, and that path is guarded by the `integrated` flag. I am not aware of any evidence that the same corruption affects HIP/ROCm builds.

Restore `prop.integrated` for HIP builds only:

```cpp
// ggml-cuda.cu line 249
#if defined(GGML_USE_HIP)
        info.devices[id].integrated = prop.integrated;
#else
        info.devices[id].integrated = false; // Temporarily disabled due to issues with corrupted output (e.g. #15034)
#endif
```
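As a rough illustration of the intended outcome, the sketch below models the proposed `#if` as a runtime switch so that both build flavors can be compared side by side. `MockBuild`, `MockProp`, `cached_integrated`, and `accepts_host_buffer` are hypothetical names invented for this example. The real `supports_buft()` also checks the buffer type and owning device, which this sketch leaves out.

```cpp
#include <cassert>

// Hypothetical names for illustration; not the real ggml-cuda types.
enum class MockBuild { CUDA, HIP };

struct MockProp {
    int integrated; // mirrors the `integrated` field reported by the runtime
};

// Mirrors the proposed #if: HIP builds trust the runtime value,
// CUDA builds keep the workaround and always store false.
constexpr bool cached_integrated(MockBuild build, const MockProp & prop) {
    return build == MockBuild::HIP ? prop.integrated != 0 : false;
}

// Simplified: host buffers are accepted only when the cached flag is set.
constexpr bool accepts_host_buffer(MockBuild build, const MockProp & prop) {
    return cached_integrated(build, prop);
}

// Expected behavior matrix under the proposed change.
inline void check_matrix() {
    assert(accepts_host_buffer(MockBuild::HIP, MockProp{1}));   // AMD APU iGPU: restored
    assert(!accepts_host_buffer(MockBuild::HIP, MockProp{0}));  // AMD dGPU: unchanged
    assert(!accepts_host_buffer(MockBuild::CUDA, MockProp{1})); // Jetson workaround kept
    assert(!accepts_host_buffer(MockBuild::CUDA, MockProp{0})); // Nvidia dGPU: unchanged
}
```

Under these assumptions, the only cell that changes is the HIP build on an integrated device. The CUDA workaround for #15034 stays in place.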

If more information or clarification is needed, or if this change would not work, please let me know. For reference, I built llama.cpp from source with the proposed change above, and the fix worked.

### First Bad Commit

PR #16308

### Relevant log output

The crash manifests as a segfault during warmup on a system with an AMD APU and a discrete GPU (iGPU + dGPU): both devices get classified as discrete GPUs, llama.cpp splits the KV cache across them via pipeline parallelism, and it crashes. For the symptoms, see the linked issues (lemonade-sdk/llamacpp-rocm#96 and ROCm/ROCm#6227).


