build: gate amdgpu plugin on amdgpu dev support#2970
build: gate amdgpu plugin on amdgpu dev support#2970cheese-cakee wants to merge 2 commits intocheckpoint-restore:criu-devfrom
Conversation
Check for libdrm amdgpu headers/libraries before enabling CONFIG_AMDGPU. This avoids build failures on systems where libdrm is installed without amdgpu support. Fixes: checkpoint-restore#1900 Signed-off-by: Farzan Aman Khan <[email protected]>
There was a problem hiding this comment.
Pull request overview
This PR fixes build failures when libdrm is installed without amdgpu development support (issue #1900). Previously, CONFIG_AMDGPU was gated on a generic pkg-config-check,libdrm, which succeeded even without amdgpu headers/libraries.
Changes:
- Added a
FEATURE_TEST_LIBDRM_AMDGPUcompile test that verifies amdgpu headers and linkability - Replaced the
pkg-config-check,libdrmgate with atry-ccfeature test using-ldrm -ldrm_amdgpu - Updated the help message to mention "amdgpu support"
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| scripts/feature-tests.mak | New FEATURE_TEST_LIBDRM_AMDGPU compile test checking amdgpu headers and API |
| Makefile.config | Gate CONFIG_AMDGPU on the new feature test instead of generic libdrm pkg-config |
You can also share your feedback on Copilot code review. Take the survey.
Include pkg-config libdrm cflags in FEATURE_TEST_LIBDRM_AMDGPU detection so xf86drm/drm.h include paths are resolved correctly on distro layouts that require -I/usr/include/libdrm. Signed-off-by: Farzan Aman Khan <[email protected]>
|
@cheese-cakee To build the AMD GPU plugin, we currently require
Would you be able to provide more information on how to reproduce this error? |
|
Thanks for the question @rst0git . Repro is the one documented in #1900 : |
@cheese-cakee This issue was opened in 2022, and the code has changed significantly since then. Were you able to reproduce the error yourself? |
|
I reproduced this on current code in a clean Ubuntu 24.04 container. What I tested:
Observed results:
So with libdrm detected but missing AMDGPU dev support, current logic still enables AMDGPU, while this patch gates it correctly. |
|
for clarity: docker run --rm -it \
-v "$PWD:/src:ro" ubuntu:24.04 bash -lc '
set -eux
apt-get update -qq
DEBIAN_FRONTEND=noninteractive apt-get install -y -qq build-essential pkg-config libdrm-dev
cp -a /src /work
# Simulate: libdrm present, AMDGPU dev bits missing
mv /usr/include/libdrm/amdgpu.h /usr/include/libdrm/amdgpu.h.bak || true
mv /usr/lib/x86_64-linux-gnu/libdrm_amdgpu.so /usr/lib/x86_64-linux-gnu/libdrm_amdgpu.so.bak || true
# Current criu-dev behavior: generic libdrm check still enables AMDGPU
(cd /work && make -pn 2>/dev/null | grep -m1 CONFIG_AMDGPU || true)
# This then fails in amdgpu plugin compilation
(gcc $(pkg-config --cflags libdrm) -c /work/plugins/amdgpu/amdgpu_plugin.c -o /tmp/amdgpu.o) || true
' |
These lines look like intentionally breaking parts of the AMD GPU userspace stack :) |
|
@rst0git I reran this with a package-native setup (no manual deletion of distro files):
Results:
So this reproduces the exact mismatch using a real package build configuration, without mutating installed distro package contents. |
|
Adding exact code context that this PR changes: Old gate (current ifeq ($(call pkg-config-check,libdrm),y)
export CONFIG_AMDGPU := y
endifThis only checks generic New gate ( LIBDRM_CFLAGS := $(shell $(PKG_CONFIG) --cflags libdrm 2>/dev/null)
ifeq ($(call try-cc,$(FEATURE_TEST_LIBDRM_AMDGPU),$(LIBDRM_CFLAGS) -ldrm -ldrm_amdgpu),true)
export CONFIG_AMDGPU := y
endifFeature test ( #include <xf86drm.h>
#include <libdrm/amdgpu.h>
...
return amdgpu_device_initialize(-1, &major, &minor, &dev);So we now require both headers and linkability to AMDGPU userspace before enabling the plugin. |
|
Hi @fdavid-amd, following up on the review request from @avagin. The change is a surgical build-detection fix: adds a compile-time check for libdrm/amdgpu.h and ldrm_amdgpu before enabling CONFIG_AMDGPU. Happy to answer any questions. |
Problem
CONFIG_AMDGPU was enabled whenever libdrm was detected, even on systems where libdrm is present without amdgpu development support (libdrm/amdgpu.h / -ldrm_amdgpu).
This causes build failures in plugins/amdgpu/amdgpu_plugin.c.
Change
Notes
This is a surgical build-detection fix with no runtime logic changes.
Fixes: #1900