
Always provide service to generate CDI specs #507


Merged

Conversation

arnaldo2792
Contributor

Description of changes:

The service that generates the CDI specs is required in both ECS and Kubernetes. For the latter, the CDI specs generated by the service will be used by "privileged" containers that must have access to all devices.
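
For context, a CDI-generation unit of this kind typically wraps `nvidia-ctk cdi generate` from the NVIDIA Container Toolkit. The sketch below is illustrative only; the output path, dependencies, and flags are assumptions, not taken from this PR:

```ini
# Hypothetical sketch of a CDI-generation unit; paths and flags are assumptions.
[Unit]
Description=Generate CDI specifications
After=systemd-modules-load.service

[Service]
# oneshot + RemainAfterExit matches the "active (exited)" state seen in the
# systemctl output below: the unit runs once at boot and stays "active".
Type=oneshot
RemainAfterExit=true
# Illustrative output path; the real unit may write elsewhere.
ExecStart=/usr/bin/nvidia-ctk cdi generate --output=/var/run/cdi/nvidia.yaml

[Install]
WantedBy=multi-user.target
```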

Testing done:

Verified the service is installed in k8s variants and that it is functional:

bash-5.1# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "0968c0610-dirty",
    "pretty_name": "Bottlerocket OS 1.39.0 (aws-k8s-1.31-nvidia)",
    "variant_id": "aws-k8s-1.31-nvidia",
    "version_id": "1.39.0"
  }
}
bash-5.1# systemctl status generate-cdi-specs.service
● generate-cdi-specs.service - Generate CDI specifications
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/generate-cdi-specs.service; enabled; preset: enabled)
    Drop-In: /x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/service.d
             └─00-aws-config.conf
     Active: active (exited) since Wed 2025-05-14 17:47:13 UTC; 53s ago
   Main PID: 1823 (code=exited, status=0/SUCCESS)
        CPU: 64ms

May 14 17:47:13 ip-172-31-20-71.us-west-2.compute.internal nvidia-ctk[1823]: time="2025-05-14T17:47:13Z" level=info msg="Selecting /lib/firmware/nvidia/570.133.20…a10x.bin"
May 14 17:47:13 ip-172-31-20-71.us-west-2.compute.internal nvidia-ctk[1823]: time="2025-05-14T17:47:13Z" level=info msg="Selecting /lib/firmware/nvidia/570.133.20…u10x.bin"
May 14 17:47:13 ip-172-31-20-71.us-west-2.compute.internal nvidia-ctk[1823]: time="2025-05-14T17:47:13Z" level=info msg="Selecting /usr/bin/nvidia-smi as /usr/bin…idia-smi"
May 14 17:47:13 ip-172-31-20-71.us-west-2.compute.internal nvidia-ctk[1823]: time="2025-05-14T17:47:13Z" level=info msg="Selecting /usr/bin/nvidia-debugdump as /u…ebugdump"
May 14 17:47:13 ip-172-31-20-71.us-west-2.compute.internal nvidia-ctk[1823]: time="2025-05-14T17:47:13Z" level=info msg="Selecting /usr/bin/nvidia-persistenced as…istenced"
May 14 17:47:13 ip-172-31-20-71.us-west-2.compute.internal nvidia-ctk[1823]: time="2025-05-14T17:47:13Z" level=warning msg="Could not locate nvidia-cuda-mps-contr…ot found"
May 14 17:47:13 ip-172-31-20-71.us-west-2.compute.internal nvidia-ctk[1823]: time="2025-05-14T17:47:13Z" level=warning msg="Could not locate nvidia-cuda-mps-serve…ot found"
May 14 17:47:13 ip-172-31-20-71.us-west-2.compute.internal nvidia-ctk[1823]: time="2025-05-14T17:47:13Z" level=warning msg="Could not locate nvidia_drv.so: patter…ot found"
May 14 17:47:13 ip-172-31-20-71.us-west-2.compute.internal nvidia-ctk[1823]: time="2025-05-14T17:47:13Z" level=warning msg="Could not locate libglxserver_nvidia.s…ot found"
May 14 17:47:13 ip-172-31-20-71.us-west-2.compute.internal nvidia-ctk[1823]: time="2025-05-14T17:47:13Z" level=info msg="Generated CDI spec with version 0.8.0"
Hint: Some lines were ellipsized, use -l to show in full.
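
For reference, the generated spec (version 0.8.0 per the last log line) follows the CDI JSON/YAML schema: a `kind`, per-device entries, and container edits. A minimal sketch of that shape, with hypothetical device names and paths (not copied from the spec this service actually generates), parsed with Python's stdlib:

```python
import json

# Hypothetical minimal CDI spec; device names and node paths are illustrative.
spec_text = """
{
  "cdiVersion": "0.8.0",
  "kind": "nvidia.com/gpu",
  "devices": [
    {
      "name": "0",
      "containerEdits": {
        "deviceNodes": [{"path": "/dev/nvidia0"}]
      }
    }
  ],
  "containerEdits": {
    "deviceNodes": [{"path": "/dev/nvidiactl"}]
  }
}
"""

spec = json.loads(spec_text)
# A CDI-aware runtime resolves a request such as "nvidia.com/gpu=0"
# against the spec's kind plus each device's name.
qualified = [f"{spec['kind']}={d['name']}" for d in spec["devices"]]
print(qualified)  # → ['nvidia.com/gpu=0']
```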

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.


Signed-off-by: Arnaldo Garcia Rincon <[email protected]>
@arnaldo2792 arnaldo2792 requested review from ytsssun and rpkelly May 14, 2025 17:53
@arnaldo2792 arnaldo2792 merged commit 02b5cda into bottlerocket-os:develop May 14, 2025
2 checks passed