Summary
Inside the cloudy-pad container, /dev/nvidia-modeset is created with mode 0644, so the unprivileged cloudy user can only open it O_RDONLY. NVIDIA's X11 Vulkan WSI requires O_RDWR, so any X11 Vulkan swapchain creation fails. This breaks every Vulkan-on-X11 game (DX12 via vkd3d-proton, native Vulkan, vkcube), while OpenGL/GLX is unaffected because it doesn't go through the modeset uAPI.
Symptoms
- DX12 game window opens with audio but renders black; vkd3d-proton logs "Presenter: Failed to query present modes: -13" (i.e. VK_ERROR_UNKNOWN from vkGetPhysicalDeviceSurfacePresentModesKHR).
- vkcube aborts immediately: cube.c:1225: demo_prepare_buffers: Assertion '!err' failed.
- vkcube --gpu_number 1 (llvmpipe) works.
- vulkaninfo's Presentable Surfaces: section is empty for the NVIDIA GPU.
- glxinfo / glxgears work fine on the T4.
Root cause
$ ls -l /dev/nvidia*
crw-rw-rw- 1 root root 195, 0 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 /dev/nvidiactl
crw-r--r-- 1 root root 195, 254 /dev/nvidia-modeset # <-- 0644
crw-rw-rw- 1 root root 234, 0 /dev/nvidia-uvm
crw-rw-rw- 1 root root 234, 1 /dev/nvidia-uvm-tools
$ python3 -c "import os; os.open('/dev/nvidia-modeset', os.O_RDWR)"
PermissionError: [Errno 13] Permission denied
The kernel module's own default is 0666 (/proc/driver/nvidia/params reports DeviceFileMode: 438, and decimal 438 is octal 0666), so the 0644 is set by libnvidia-container when it mknods the node into the container's tmpfs /dev. It treats nvidia-modeset as a "control-only" device and doesn't anticipate an unprivileged in-container user needing O_RDWR for Vulkan WSI.
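The comparison between the driver's default and the node's actual mode can be scripted. The following is a minimal sketch, assuming the params file contains a line of the form "DeviceFileMode: 438" as shown above; the helper names parse_device_file_mode and compare_modes are hypothetical and not part of any NVIDIA tooling.

```python
import os
import re
import stat


def parse_device_file_mode(params_text):
    """Return the DeviceFileMode value from params text as an int, or None if absent.

    The value is printed in decimal (438 == 0o666), so no octal parsing is needed.
    """
    match = re.search(r"^DeviceFileMode:\s*(\d+)\s*$", params_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def compare_modes(params_path="/proc/driver/nvidia/params",
                  node_path="/dev/nvidia-modeset"):
    """Return (driver_default, node_mode) as ints.

    driver_default is None when the params file has no DeviceFileMode line.
    node_mode holds only the permission bits of the node (e.g. 0o644).
    """
    with open(params_path) as f:
        expected = parse_device_file_mode(f.read())
    actual = stat.S_IMODE(os.stat(node_path).st_mode)
    return expected, actual
```

Calling compare_modes() inside the affected container should return two different values (0o666 and 0o644 for the setup described here), and matching values once the node's mode has been corrected.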
Verification of the fix
$ sudo chmod 0666 /dev/nvidia-modeset
$ vkcube # runs and renders
$ vulkaninfo | grep -A3 'Presentable Surfaces'
Presentable Surfaces:
GPU id : 0 (Tesla T4):
VK_KHR_xcb_surface
VK_KHR_xlib_surface
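To confirm that no other device node has the same problem, the permission bits of every /dev/nvidia* node can be checked without opening the devices. This is a rough sketch, assuming the unprivileged user is neither the owner nor in the group of the nodes (they are root:root in the listing above), so only the "other" bits matter; the function names are hypothetical.

```python
import glob
import os
import stat


def others_can_rdwr(mode):
    """True if the 'other' permission bits allow both read and write."""
    wanted = stat.S_IROTH | stat.S_IWOTH
    return mode & wanted == wanted


def nodes_missing_rdwr(pattern="/dev/nvidia*", char_devices_only=True):
    """Return (path, mode) pairs for nodes that 'other' users cannot open O_RDWR.

    With char_devices_only=True, anything that is not a character device
    (for example a directory matching the pattern) is skipped.
    """
    missing = []
    for path in sorted(glob.glob(pattern)):
        st = os.stat(path)
        if char_devices_only and not stat.S_ISCHR(st.st_mode):
            continue
        mode = stat.S_IMODE(st.st_mode)
        if not others_can_rdwr(mode):
            missing.append((path, mode))
    return missing
```

On the container described here, nodes_missing_rdwr() would be expected to list only /dev/nvidia-modeset before the chmod and to return an empty list afterwards.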
Suggested fix
Add a one-liner to cloudy/bin/setup-container-post-start.sh (which already runs as root via supervisord at container start):
chmod 0666 /dev/nvidia-modeset
Environment
- Host: AWS EC2 g4dn (Tesla T4), Ubuntu cloud image, kernel 6.17.0-1012-aws
- NVIDIA driver: 590.48.01 (open kernel module), userspace 590.48.01 inside container — versions match
- Container runtime: NVIDIA Container Toolkit (libnvidia-container, evidenced by bind-mounts of /usr/lib/x86_64-linux-gnu/libnvidia-* and /usr/bin/nvidia-*)
- X server: display :42, NVIDIA proprietary DDX, virtual 1920x1080 via Option "ConnectedMonitor" "DP-0" + MetaModes
- Vulkan loader: 1.3.275, ICD /etc/vulkan/icd.d/nvidia_icd.json