NVIDIA Tuned Package

A NodeWright package that extends the base tuned package with NVIDIA-specific performance profiles for GPU and DGX systems.

Overview

This package inherits from the base tuned package and adds pre-configured tuned profiles optimized for NVIDIA hardware. The profiles are organized by:

Common base profiles: Foundational settings deployed to /usr/lib/tuned/
OS-specific workload profiles: Profiles that may vary by OS version
Service profiles: Service-specific settings (e.g., eks, aks)

The configmap uses an intent-based model where you specify what you want (intent + accelerator) rather than a specific profile name. The profile name nvidia-{accelerator}-{intent} is constructed automatically. When accelerator=generic, the self-contained nvidia-generic profile is used instead, providing safe baseline tuning for any NVIDIA GPU without requiring accelerator-specific or intent-specific configuration.

Supported Operating Systems

This package requires tuned >= 2.19. The following operating systems are supported:

OS	Version	Status	Notes
Ubuntu	22.04 (Jammy)	✅ Tested	Uses a mix of OS-specific and common profiles
Ubuntu	24.04 (Noble)	✅ Tested	Uses common profiles
Debian	11 (Bullseye)	❌	Default tuned version is too old (2.15)
Debian	12 (Bookworm)	⚠️ verified tuned package version but not fully tested	Uses common profiles
RHEL	9	⚠️ verified tuned package version but not fully tested	Uses common profiles
Other	Any	⚠️ Fallback	Falls back to `os/common/` profiles (untested, requires tuned >= 2.19)

Notes

Tested OS versions: These have been validated with the package and use OS-specific profile configurations
Fallback behavior: For untested OS versions, the package will automatically fall back to the os/common/ profiles. This fallback is untested and requires the system to have tuned >= 2.19 installed
Tuned version requirement: All systems must have tuned version 2.19 or later. Check your system's tuned version with tuned --version
OS detection: The package automatically detects the OS from /etc/os-release and selects the appropriate profiles
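
The version requirement and the OS fallback described in the notes above can be sketched in shell. This is an illustration only: the function names `version_ge` and `select_os_profile_dir` are hypothetical and are not taken from the package's prepare scripts, and the directory layout is assumed to match the Directory Structure section.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not part of the package.

# Return 0 if version $1 is >= version $2 (relies on `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Print the OS profile directory for the given os-release file,
# falling back to os/common when no OS-specific directory exists.
select_os_profile_dir() {
    local os_release="$1" profiles_root="$2"
    local id version_id candidate
    # Source the file in a subshell so the caller's variables are untouched.
    id="$( . "$os_release" && printf '%s' "${ID:-}" )"
    version_id="$( . "$os_release" && printf '%s' "${VERSION_ID:-}" )"
    candidate="$profiles_root/os/$id/$version_id"
    if [ -n "$id" ] && [ -n "$version_id" ] && [ -d "$candidate" ]; then
        printf '%s\n' "$candidate"
    else
        printf '%s\n' "$profiles_root/os/common"
    fi
}
```

On a node this might be called as `select_os_profile_dir /etc/os-release ./profiles`. For the version check, something like `version_ge "$(tuned --version | awk '{print $NF}')" 2.19` could work, assuming the version number is the last field of the `tuned --version` output.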

Directory Structure

profiles/
├── common/                  # Base profiles → /usr/lib/tuned/
│   ├── nvidia-base/
│   └── nvidia-acs-disable/
├── os/
│   ├── common/              # Default workload profiles (fallback for untested OS)
│   │   ├── nvidia-generic/             # Self-contained baseline (accelerator=generic)
│   │   ├── nvidia-h100-performance/
│   │   ├── nvidia-h100-inference/
│   │   ├── nvidia-h100-multiNodeTraining/
│   │   ├── nvidia-gb200-performance/
│   │   ├── nvidia-gb200-inference/
│   │   └── nvidia-gb200-multiNodeTraining/
│   ├── ubuntu/
│   │   ├── 22.04/          # Mix of symlinks and OS-specific overrides
│   │   └── 24.04/          # Symlinks to os/common/ (override when needed)
│   ├── debian/
│   │   ├── 11/             # Mix of symlinks and OS-specific overrides
│   │   └── 12/             # Symlinks to os/common/ (override when needed)
│   └── rhel/
│       └── 9/              # Symlinks to os/common/ (override when needed)
└── service/
    ├── common/                  # Shared helpers copied into every service's final profile dir
    │   ├── mac-address-policy.sh
    │   └── bootloader.sh
    ├── eks/
    │   ├── tuned.conf.template  # Service template (include= added dynamically)
    │   ├── script.sh            # Sources common/mac-address-policy.sh, invokes common/bootloader.sh
    │   ├── nvidia-h100-inference.conf   # AWS-compatible inference override
    │   └── nvidia-gb200-inference.conf
    └── aks/
        ├── tuned.conf.template
        ├── script.sh            # Sources common/mac-address-policy.sh, invokes common/bootloader.sh
        └── nvidia-h100-inference.conf   # AKS-compatible inference override (drops kernel-6.8 EEVDF sysctls)

Note: Profiles are stored in profiles/ (not root_dir/) to avoid polluting the host filesystem during package extraction. The prepare scripts explicitly copy profiles to the appropriate tuned directories.

How It Works

Prepare stage: prepare_nvidia_profiles.sh runs:
- Reads intent and accelerator from the configmap
- Constructs the profile name as nvidia-{accelerator}-{intent}
- Deploys common base profiles to /usr/lib/tuned/
- Detects OS from /etc/os-release
- Copies the appropriate OS-specific workload profiles to /etc/tuned/
- If a service is specified, creates a service profile with a dynamic include= line pointing to the workload profile
Config stage: The inherited tuned package applies the configured profile
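
The dynamic include= step of the prepare stage can be illustrated with a small shell sketch. The function name `render_service_profile` is hypothetical, and the shape of tuned.conf.template is an assumption: the sketch inserts the include= line after an existing `[main]` header, or prepends a `[main]` section if the template has none. The actual prepare_nvidia_profiles.sh may do this differently.

```shell
#!/usr/bin/env bash
# Sketch only: not the package's actual implementation.

# Render a service tuned.conf on stdout from a template file ($1),
# adding include=<workload profile> ($2) to the [main] section.
render_service_profile() {
    local template="$1" workload="$2"
    if grep -q '^\[main\]' "$template"; then
        # Print every line; after the first [main] header, add the include.
        awk -v inc="include=$workload" '
            { print }
            /^\[main\]/ && !seen { print inc; seen = 1 }
        ' "$template"
    else
        # Assumed case: the template has no [main] section of its own.
        printf '[main]\ninclude=%s\n\n' "$workload"
        cat "$template"
    fi
}
```

For the EKS example, `render_service_profile profiles/service/eks/tuned.conf.template nvidia-h100-inference` would print a profile whose `[main]` section includes the workload profile. The output path under /etc/tuned/ is left to the caller.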

Profile Name Construction

The profile name is built from the configmap fields:

nvidia-{accelerator}-{intent}

Examples:

`accelerator`	`intent`	Constructed Profile
`generic`	(ignored)	`nvidia-generic`
`h100`	`performance`	`nvidia-h100-performance`
`h100`	`inference`	`nvidia-h100-inference`
`h100`	`multiNodeTraining`	`nvidia-h100-multiNodeTraining`
`gb200`	`performance`	`nvidia-gb200-performance`
`gb200`	`inference`	`nvidia-gb200-inference`
`gb200`	`multiNodeTraining`	`nvidia-gb200-multiNodeTraining`

When accelerator=generic, the nvidia-generic profile is selected directly. The intent and service fields are ignored. This profile is self-contained (no include chain) and provides universally safe GPU tuning suitable for any NVIDIA GPU.
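
As a minimal sketch of the naming rule above, the following shell function builds the profile name from the two configmap fields. The name `build_profile_name` is hypothetical; the default intent of `performance` follows the ConfigMap Fields table.

```shell
#!/usr/bin/env bash
# Sketch only: build_profile_name is a hypothetical helper.

# Usage: build_profile_name <accelerator> [intent]
build_profile_name() {
    local accelerator="${1:-}" intent="${2:-performance}"
    if [ -z "$accelerator" ]; then
        echo "accelerator is required" >&2
        return 1
    fi
    if [ "$accelerator" = "generic" ]; then
        # generic selects the self-contained profile; intent is ignored.
        printf 'nvidia-generic\n'
    else
        printf 'nvidia-%s-%s\n' "$accelerator" "$intent"
    fi
}
```

For example, `build_profile_name h100 inference` prints `nvidia-h100-inference`, and `build_profile_name generic inference` prints `nvidia-generic`.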

Inheritance Chain

When you specify intent: inference, accelerator: h100, and service: eks:

eks (active profile)
  └── includes: nvidia-h100-inference
        └── includes: nvidia-h100-performance
              └── includes: nvidia-acs-disable
                    └── includes: nvidia-base
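
To inspect a chain like this on a node, one option is to follow the `include=` lines in each profile's tuned.conf. The sketch below is a hypothetical helper, not part of the package. It assumes each tuned.conf has at most one `include=` line naming a single profile, and that profiles live in the directories passed as arguments.

```shell
#!/usr/bin/env bash
# Sketch only: print_include_chain is a hypothetical helper.

# Usage: print_include_chain <profile> <search-dir>...
# Prints the profile and each profile it includes, one per line.
print_include_chain() {
    local profile="$1"; shift
    local dir conf next depth=0
    # The depth limit guards against accidental include loops.
    while [ -n "$profile" ] && [ "$depth" -lt 16 ]; do
        printf '%s\n' "$profile"
        conf=""
        for dir in "$@"; do
            if [ -f "$dir/$profile/tuned.conf" ]; then
                conf="$dir/$profile/tuned.conf"
                break
            fi
        done
        [ -n "$conf" ] || break
        # Take the first include= line and strip the key and whitespace.
        next="$(sed -n 's/^[[:space:]]*include[[:space:]]*=[[:space:]]*//p' "$conf" \
            | head -n1 | sed 's/[[:space:]]*$//')"
        profile="$next"
        depth=$((depth + 1))
    done
}
```

With the directories named in this README, a call might look like `print_include_chain eks /etc/tuned /usr/lib/tuned`.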

Usage

Generic tuning (any NVIDIA GPU, no accelerator-specific or intent-specific config):

apiVersion: skyhook.nvidia.com/v1alpha1
kind: Skyhook
metadata:
  name: nvidia-tuned-generic
spec:
  nodeSelectors:
    matchLabels:
      nvidia.com/gpu.present: "true"
  packages:
    nvidia-tuned:
      image: ghcr.io/nvidia/skyhook-packages/nvidia-tuned
      version: 0.3.0
      interrupt:
        type: reboot
      env:
        - name: INTERRUPT
          value: "true"
      configMap:
        accelerator: generic

Accelerator-specific tuning (with intent and service):

apiVersion: skyhook.nvidia.com/v1alpha1
kind: Skyhook
metadata:
  name: nvidia-tuned-eks
spec:
  nodeSelectors:
    matchLabels:
      nvidia.com/dgx: "true"
  packages:
    nvidia-tuned:
      image: ghcr.io/nvidia/skyhook-packages/nvidia-tuned
      version: 0.3.0
      interrupt:
        type: reboot
      configInterrupts:
        intent:
          type: reboot
      env:
        - name: INTERRUPT
          value: "true"
      configMap:
        intent: inference
        accelerator: h100
        service: eks

AKS tuning (H100 on Azure Kubernetes Service, Ubuntu 24.04):

apiVersion: skyhook.nvidia.com/v1alpha1
kind: Skyhook
metadata:
  name: nvidia-tuned-aks
spec:
  nodeSelectors:
    matchLabels:
      nvidia.com/gpu.present: "true"
  packages:
    nvidia-tuned:
      image: ghcr.io/nvidia/skyhook-packages/nvidia-tuned
      version: 0.3.0
      interrupt:
        type: reboot
      configInterrupts:
        intent:
          type: reboot
      env:
        - name: INTERRUPT
          value: "true"
      configMap:
        intent: inference
        accelerator: h100
        service: aks

ConfigMap Fields

Field	Required	Default	Description
`accelerator`	Yes	—	GPU/accelerator type (e.g., `h100`, `gb200`, `generic`). When set to `generic`, intent and service are ignored
`intent`	No	`performance`	Workload intent (e.g., `inference`, `performance`, `multiNodeTraining`). Ignored when `accelerator=generic`
`service`	No	—	Service name (e.g., `eks`, `aks`). If specified, the service profile wraps the workload profile. Ignored when `accelerator=generic`

Available Profiles

Intents (specify in `intent`)

Intent	Description
`performance`	General GPU performance optimization
`inference`	Optimized for inference workloads (CPU isolation, hugepages)
`multiNodeTraining`	Optimized for distributed training (network buffers, TCP tuning)

Accelerators (specify in `accelerator`)

Accelerator	Description
`generic`	Baseline tuning for any NVIDIA GPU (self-contained, no intent/service required)
`h100`	NVIDIA H100 GPU
`gb200`	NVIDIA GB200 GPU

Services (specify in `service`)

Service	Description
`eks`	EKS-specific settings (MAC address policy for CNI)
`aks`	AKS-specific settings (MAC address policy, grub.d bootloader workaround for Ubuntu)
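
Before applying a Skyhook resource, it can help to check configmap values against the intents, accelerators, and services listed above. The following is a sketch under that assumption; `validate_config` is a hypothetical name, and the package itself may validate differently or not at all.

```shell
#!/usr/bin/env bash
# Sketch only: validate_config is a hypothetical helper.

# Usage: validate_config <accelerator> [intent] [service]
# Returns non-zero on the first value not listed in this README.
validate_config() {
    local accelerator="$1" intent="${2:-performance}" service="${3:-}"
    case "$accelerator" in
        generic) return 0 ;;   # intent and service are ignored
        h100|gb200) ;;
        *) echo "unknown accelerator: $accelerator" >&2; return 1 ;;
    esac
    case "$intent" in
        performance|inference|multiNodeTraining) ;;
        *) echo "unknown intent: $intent" >&2; return 1 ;;
    esac
    case "$service" in
        ""|eks|aks) ;;
        *) echo "unknown service: $service" >&2; return 1 ;;
    esac
}
```

For example, `validate_config h100 inference eks` succeeds, while `validate_config h100 training` fails with an "unknown intent" message.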

Adding OS-Specific Overrides

By default, OS version directories contain symlinks to os/common/. To add OS-specific settings:

Remove the symlink: rm profiles/os/ubuntu/24.04/nvidia-h100-inference
Create directory: mkdir profiles/os/ubuntu/24.04/nvidia-h100-inference
Add custom tuned.conf with OS-specific settings
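
The three steps above can be combined into one helper. This is a sketch, not part of the package: `create_os_override` is a hypothetical name, and seeding the new directory with a copy of the common tuned.conf is an added convenience (an assumption) so that edits start from the current defaults.

```shell
#!/usr/bin/env bash
# Sketch only: create_os_override is a hypothetical helper.

# Usage: create_os_override <os-version-dir> <profile-name>
# Replaces the symlink to os/common with a real directory.
create_os_override() {
    local os_dir="$1" profile="$2"
    local link="$os_dir/$profile" target
    if [ ! -L "$link" ]; then
        echo "$link is not a symlink; nothing to do" >&2
        return 1
    fi
    # Resolve the symlink before removing it so the common file can be copied.
    target="$(cd "$link" && pwd -P)" || return 1
    rm "$link"
    mkdir "$link"
    # Assumption: start from the common tuned.conf, then edit as needed.
    cp "$target/tuned.conf" "$link/tuned.conf"
}
```

For example, `create_os_override profiles/os/ubuntu/24.04 nvidia-h100-inference` leaves a regular directory containing a tuned.conf that can then be edited with OS-specific settings.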

Verification

After deployment, verify the profile is active:

# List available profiles (should include nvidia-* profiles)
tuned-adm list

# Check active profile
tuned-adm active

# Verify tuning is applied
tuned-adm verify
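
For scripted checks, the active profile can be compared against the expected name. The sketch below uses hypothetical helper names and assumes `tuned-adm active` prints a line of the form `Current active profile: <name>`.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.

# Extract the profile name from `tuned-adm active` output read on stdin.
parse_active_profile() {
    sed -n 's/^Current active profile:[[:space:]]*//p' | head -n1
}

# Usage: tuned-adm active | check_active_profile <expected-profile>
check_active_profile() {
    local expected="$1" actual
    actual="$(parse_active_profile)"
    if [ "$actual" = "$expected" ]; then
        echo "OK: active profile is $actual"
    else
        echo "MISMATCH: expected $expected, got ${actual:-none}" >&2
        return 1
    fi
}
```

With `service: eks` configured, the expected name is the service profile, for example `tuned-adm active | check_active_profile eks`. Without a service it would be the constructed workload profile name.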

Inheritance

This package inherits all functionality from the base tuned package:

Multi-distribution support (Ubuntu/Debian, CentOS/RHEL/Amazon Linux)
Custom profile deployment via configmaps
Script deployment for complex tuning logic
Full lifecycle management (install, configure, uninstall)

See the tuned package README for complete documentation on all features.

Version

Package Version: 0.3.0
Base Package: tuned (latest via preprocess.sh)
Schema Version: v1

Additional documentation

NVIDIA Grace Performance Tuning Guide
