@charludo
Contributor

@charludo charludo commented Nov 5, 2025

#1896 (comment) was surprisingly simple to implement. From my testing, all GPUs known to the pod and the corresponding NVIDIA libs are handed to any container in the pod with

- env:
  - name: NVIDIA_VISIBLE_DEVICES
    value: all

set.

We could potentially expand on this by parsing the actual value of that envvar and using it to only hand through specific GPUs, but

  • that would (afaict) require specifying the VFIO num from the host CDI config, so not super portable,
  • and there's probably not really a use case for this (?)
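
For illustration only, parsing that value could look roughly like the sketch below. This is not code from this PR: `gpuRequest` and `parseVisibleDevices` are hypothetical names, and the special values ("all", "none", "void", empty) are assumed to follow the NVIDIA Container Toolkit conventions. The sketch only splits the value into identifiers; mapping those identifiers to the VFIO devices from the host CDI config (the non-portable part mentioned above) is not covered.

```go
package main

import (
	"fmt"
	"strings"
)

// gpuRequest is a hypothetical type describing what a container asked for
// via NVIDIA_VISIBLE_DEVICES. It is not part of Contrast or Kata.
type gpuRequest struct {
	All     bool     // true when the value is "all"
	Devices []string // explicit GPU identifiers (indices or UUIDs), if listed
}

// parseVisibleDevices interprets the value of NVIDIA_VISIBLE_DEVICES.
// Assumption: "all" selects every GPU, while "", "void" and "none" select
// no GPU. Anything else is treated as a comma-separated identifier list.
func parseVisibleDevices(value string) gpuRequest {
	switch strings.TrimSpace(value) {
	case "all":
		return gpuRequest{All: true}
	case "", "void", "none":
		return gpuRequest{}
	}
	var devices []string
	for _, d := range strings.Split(value, ",") {
		// Skip empty entries such as those produced by a trailing comma.
		if d = strings.TrimSpace(d); d != "" {
			devices = append(devices, d)
		}
	}
	return gpuRequest{Devices: devices}
}

func main() {
	for _, v := range []string{"all", "0,1", "none", ""} {
		fmt.Printf("%q -> %+v\n", v, parseVisibleDevices(v))
	}
}
```

With a helper like this, the current behavior corresponds to only acting on `All == true`; the `Devices` branch is where per-GPU passthrough would have to hook in.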

I am also, unfortunately, pretty sure that this is a "forever-patch" and we will not get this merged into upstream kata.

@charludo added the "no changelog" label (PRs not listed in the release notes) Nov 5, 2025
@charludo requested a review from thomasten November 5, 2025 14:16
@github-actions

github-actions bot commented Nov 5, 2025

Pre-release artifacts on a298ee6

The pre-release artifacts for this commit are available at the following link:

https://contrast-public.s3.eu-central-1.amazonaws.com/pr-artifacts/1762355874/

Created by @thomasten in pr_release_artifacts workflow.

@katexochen added the "bug fix" label (Fixing a user facing bug) and removed the "no changelog" label (PRs not listed in the release notes) Nov 5, 2025
@katexochen added this to the v1.15.0 milestone Nov 5, 2025
Member

@thomasten thomasten left a comment

LGTM and works with Privatemode :)

WithImage("ghcr.io/edgelesssys/contrast/ubuntu:24.04@sha256:0f9e2b7901aa01cf394f9e1af69387e2fd4ee256fd8a95fb9ce3ae87375a31e6").
WithCommand("/bin/sh", "-c", "sleep inf").
WithEnv(EnvVar().
WithName("NVIDIA_VISIBLE_DEVICES").WithValue("all"),
Member

Can we add a third container that has the env not set and check that it has no access to the GPU?

Labels

bug fix Fixing a user facing bug

4 participants