Feat 1413 AMD gtt memory usage added with backup for non ROCm #1512

Open
CowboyTim wants to merge 5 commits into aristocratos:main from aardbeiplantje:feat-1413-amd-gtt-mem-usage
Conversation

@CowboyTim

This adds GTT memory usage for AMD GPUs that support it. It uses ROCm, since that implementation was already there, with a sysfs fallback that possibly also implements #956 (I can't test this).

I've tested both with and without ROCm. I've also tested on a regular Intel i7-7567U that has no AMD/NVIDIA GPU, only integrated graphics.
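
For readers unfamiliar with the sysfs route, here is a minimal sketch of what a non-ROCm fallback could look like. It is not the code from this PR. It assumes the amdgpu kernel driver exposes `mem_info_gtt_used` and `mem_info_gtt_total` (byte counts) in the card's device directory; the `GttUsage` struct and the `read_u64` and `read_gtt_usage` helpers are hypothetical names made up for illustration.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <optional>

namespace fs = std::filesystem;

// Hypothetical result type for this sketch: GTT usage in bytes.
struct GttUsage {
    std::uint64_t used_bytes;
    std::uint64_t total_bytes;
};

// Read a single unsigned integer from a sysfs-style text file.
// Returns std::nullopt if the file is missing or does not start with a number.
static std::optional<std::uint64_t> read_u64(const fs::path& file) {
    std::ifstream in(file);
    std::uint64_t value = 0;
    if (in >> value) return value;
    return std::nullopt;
}

// Read GTT usage from a device directory such as /sys/class/drm/card0/device.
// Assumption: the amdgpu driver provides mem_info_gtt_used and
// mem_info_gtt_total there. On other drivers (Intel, NVIDIA) the files are
// absent and this returns std::nullopt, so the caller can skip the GTT display.
std::optional<GttUsage> read_gtt_usage(const fs::path& device_dir) {
    const auto used = read_u64(device_dir / "mem_info_gtt_used");
    const auto total = read_u64(device_dir / "mem_info_gtt_total");
    if (!used || !total) return std::nullopt;
    return GttUsage{*used, *total};
}
```

A caller might try `read_gtt_usage("/sys/class/drm/card0/device")` only when ROCm is unavailable; the card index is an assumption here and would normally be discovered by iterating over `/sys/class/drm`.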

@deckstose deckstose added the gpu Issues or pull requests related to GPU functionality label Jan 26, 2026
@TheSovietPancakes
Contributor

I feel like these features should be separate (sysfs, gtt), since they feel like solutions to majorly different problems, but otherwise your implementation of GTT with ROCm looks perfect. I will have to test the sysfs later on my RX 9070 XT, but the GTT matches a prototype I made a while ago. 👍

@CowboyTim
Author

Note that in the meantime I found out that on an AMD Strix Halo with a UMA architecture, you can also "disable" the GTT memory usage, and then everything becomes dynamic. The memory used by llama.cpp is then counted with the regular process memory, and it's just the Free/Cached/Available figures of the total memory that change. Total is also the maximum at that point.

Initially I assumed this was because the MES was disabled, but apparently it's not only in that case; there are more flags/features involved. It's also annoying to test, as the whole setup is unstable. It's basically a giant puzzle until things don't crash that often anymore.
