
Commit 140f5e6

Enable resource naming in config

1 parent cb6e45e

File tree: 3 files changed (+183 −2 lines)


README.md (+4)

```diff
@@ -2,6 +2,10 @@
 
 [![End-to-end Tests](https://github.com/NVIDIA/k8s-device-plugin/actions/workflows/e2e.yaml/badge.svg)](https://github.com/NVIDIA/k8s-device-plugin/actions/workflows/e2e.yaml) [![Go Report Card](https://goreportcard.com/badge/github.com/NVIDIA/k8s-device-plugin)](https://goreportcard.com/report/github.com/NVIDIA/k8s-device-plugin) [![Latest Release](https://img.shields.io/github/v/release/NVIDIA/k8s-device-plugin)](https://github.com/NVIDIA/k8s-device-plugin/releases/latest)
 
+> This fork of the NVIDIA device plugin for Kubernetes has a `release-1.1` branch that is based on the `v0.16.1` tag of the original NVIDIA repository. This version includes all the features and updates available in the `v0.16.1` tag, along with any additional modifications specific to this fork.
+>
+> For more details on the changes and updates in this release, please refer to the [Documentation](docs/resource-naming/README.md).
+
 ## Table of Contents
 
 - [About](#about)
```

cmd/nvidia-device-plugin/main.go (+8 −2)

```diff
@@ -34,7 +34,6 @@ import (
 
 	spec "github.com/NVIDIA/k8s-device-plugin/api/config/v1"
 	"github.com/NVIDIA/k8s-device-plugin/internal/info"
-	"github.com/NVIDIA/k8s-device-plugin/internal/logger"
 	"github.com/NVIDIA/k8s-device-plugin/internal/plugin"
 	"github.com/NVIDIA/k8s-device-plugin/internal/rm"
 	"github.com/NVIDIA/k8s-device-plugin/internal/watch"
@@ -280,7 +279,14 @@ func startPlugins(c *cli.Context, flags []cli.Flag) ([]plugin.Interface, bool, e
 	if err != nil {
 		return nil, false, fmt.Errorf("unable to load config: %v", err)
 	}
-	spec.DisableResourceNamingInConfig(logger.ToKlog, config)
+
+	// This block has been commented out due to issue #69.
+	// Date: 2024-08-07
+	// Reason: Commenting out this block allows for the configuration of resource naming.
+	// This enables the setting of different quotas for various GPU types.
+	// For more details, see the GitHub issue: https://github.com/volcano-sh/devices/issues/69
+	// spec.DisableResourceNamingInConfig(logger.ToKlog, config)
 
 	driverRoot := root(*config.Flags.Plugin.ContainerDriverRoot)
 	// We construct an NVML library specifying the path to libnvidia-ml.so.1
```

docs/resource-naming/README.md (+171, new file)
# How to configure the Device Plugin to report different GPUs

Volcano v1.9.0 introduces capacity scheduling. However, the default NVIDIA Device Plugin reports all GPUs as a single `nvidia.com/gpu` resource, so different GPU models cannot be reported as separate resources. Addressing this requires three steps:

1. Install a custom Device Plugin
2. Configure DCGM Exporter for Pod-level monitoring
3. Configure Volcano to use the capacity scheduling plugin
## 1. Install a Custom Device Plugin

### 1.1 Configure GPU Operator and GPU Feature Discovery

We originally used the NVIDIA GPU Operator to manage GPU resources uniformly, with GPU Feature Discovery (GFD) and related components already configured. Because the NVIDIA drivers are installed on the hosts and we need a customized Device Plugin, the GPU Operator must be configured to keep DCGM Exporter enabled while disabling management of the driver and the Device Plugin.

### 1.2 Install a Custom Device Plugin

Volcano provides queue-based resource capabilities, but to report different types of GPUs the Device Plugin needs to be adapted.

- Related issue: [Advertising specific GPU types as separate extended resource · Issue #424 · NVIDIA/k8s-device-plugin](https://github.com/NVIDIA/k8s-device-plugin/issues/424)
- Related code: [k8s-device-plugin/cmd/nvidia-device-plugin/main.go at eb8fd565c3df0caca59bf0ff2ae918e647f46af3 · NVIDIA/k8s-device-plugin](https://github.com/NVIDIA/k8s-device-plugin/blob/eb8fd565c3df0caca59bf0ff2ae918e647f46af3/cmd/nvidia-device-plugin/main.go#L239)
When installing the Device Plugin via Helm, specify the configuration files:

```sh
helm upgrade -i nvdp nvdp/nvidia-device-plugin \
  --version=0.15.0 \
  --namespace nvidia-device-plugin \
  --create-namespace \
  --set config.default=other-config \
  --set-file config.map.other-config=other-config.yaml \
  --set-file config.map.p100-config=p100-config.yaml \
  --set-file config.map.v100-config=v100-config.yaml
```
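With multiple named configs loaded this way, the upstream plugin can select a config per node via a node label; per the upstream documentation (verify against the chart version you install, as this detail comes from upstream rather than this fork), the label key is `nvidia.com/device-plugin.config`:

```sh
kubectl label nodes huawei-82 nvidia.com/device-plugin.config=v100-config
```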
Configuration file content:

```yaml
version: v1
flags:
  migStrategy: "none"
  failOnInitError: true
  nvidiaDriverRoot: "/"
  plugin:
    passDeviceSpecs: false
    deviceListStrategy: envvar
    deviceIDStrategy: uuid
resources:
  gpus:
    - pattern: "Tesla V100-SXM2-32GB"
      name: v100
    - pattern: "Tesla P100-PCIE-*"
      name: p100
    - pattern: "NVIDIA GeForce RTX 2080 Ti"
      name: 2080ti
    - pattern: "NVIDIA TITAN Xp"
      name: titan
    - pattern: "Tesla T4"
      name: t4
```
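Each `pattern` entry is matched against a device's model name, and the first match determines the advertised resource name. A minimal sketch of that idea in Go, using glob-style matching via `filepath.Match` (the types and function names here are illustrative, not the plugin's actual code):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resourceMapping pairs a glob-style pattern with the short resource
// name it maps to (illustrative, not the plugin's actual types).
type resourceMapping struct {
	pattern string
	name    string
}

var mappings = []resourceMapping{
	{"Tesla V100-SXM2-32GB", "v100"},
	{"Tesla P100-PCIE-*", "p100"},
	{"Tesla T4", "t4"},
}

// resourceFor returns the extended resource name for a GPU model,
// falling back to the default "gpu" when no pattern matches.
func resourceFor(model string) string {
	for _, m := range mappings {
		// filepath.Match supports the same '*' globbing used
		// in the patterns above; model names contain no '/'.
		if ok, _ := filepath.Match(m.pattern, model); ok {
			return "nvidia.com/" + m.name
		}
	}
	return "nvidia.com/gpu"
}

func main() {
	fmt.Println(resourceFor("Tesla P100-PCIE-16GB")) // matches the p100 wildcard
	fmt.Println(resourceFor("Tesla T4"))             // exact match
	fmt.Println(resourceFor("NVIDIA A100-SXM4-40GB")) // no pattern: default
}
```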
Modify the NVIDIA Device Plugin source code (see the `main.go` change in this commit, which comments out the `spec.DisableResourceNamingInConfig` call so the `resources` section of the config is honored).

Additionally, because of the Go version available on my build machine, I had to modify the Dockerfile and rebuild the image. After rebuilding, replace the DaemonSet image with the new version so that different GPU types are reported as different resources.
### 1.3 Clean Up Outdated Device Plugin Resources

Although the new resources are now reported, the old `nvidia.com/gpu` entries do not disappear from the node status on their own:

```sh
kubectl get nodes -ojson | jq '.items[] | {name: .metadata.name, allocatable: .status.allocatable}'
```
Sample output:

```json
{
  "name": "huawei-82",
  "allocatable": {
    "cpu": "80",
    "ephemeral-storage": "846624789946",
    "hugepages-1Gi": "0",
    "hugepages-2Mi": "0",
    "memory": "263491632Ki",
    "nvidia.com/gpu": "0",
    "nvidia.com/t4": "2",
    "pods": "110"
  }
}
```
Start `kubectl proxy`:

```sh
kubectl proxy
# Starting to serve on 127.0.0.1:8001
```

Deletion script (note that the `/` in the resource name must be escaped as `~1`, per JSON Pointer syntax, RFC 6901):
```bash
#!/bin/bash

# Check if a node name is provided
if [ -z "$1" ]; then
  echo "Usage: $0 <node-name>"
  exit 1
fi

NODE_NAME=$1

# Prepare the JSON patch data
PATCH_DATA=$(cat <<EOF
[
  {"op": "remove", "path": "/status/capacity/nvidia.com~1gpu"}
]
EOF
)

# Execute the PATCH request
curl --header "Content-Type: application/json-patch+json" \
  --request PATCH \
  --data "$PATCH_DATA" \
  http://127.0.0.1:8001/api/v1/nodes/$NODE_NAME/status

echo "Patch request sent for node $NODE_NAME"
```
Save the script, make it executable, then pass the node name to clean up:

```sh
vim patch_node_gpu.sh
chmod +x patch_node_gpu.sh
./patch_node_gpu.sh huawei-82
```

This completes the first stage: re-reporting GPU resources.
## 2. Configure DCGM Exporter for Pod-Level Monitoring

After changing the GPU resource names, we found that DCGM Exporter could no longer collect Pod-level GPU usage metrics. The reason is that DCGM Exporter only recognizes the exact resource name `nvidia.com/gpu` or names with the prefix `nvidia.com/mig-`.

To address this, modify the DCGM Exporter matching logic, rebuild the image, and replace it.
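Conceptually, the required change is small: the check that decides whether a container resource counts as a GPU must accept the new names. A hedged sketch of such a relaxed check (the function name and placement are illustrative, not dcgm-exporter's actual code):

```go
package main

import (
	"fmt"
	"strings"
)

// isGPUResource reports whether a container resource name should be
// treated as a GPU. Upstream dcgm-exporter accepts only the exact name
// "nvidia.com/gpu" or the "nvidia.com/mig-" prefix; this relaxed
// version (illustrative, not the upstream function) accepts anything
// in the nvidia.com/ namespace, covering names like nvidia.com/t4.
func isGPUResource(name string) bool {
	return strings.HasPrefix(name, "nvidia.com/")
}

func main() {
	for _, r := range []string{"nvidia.com/gpu", "nvidia.com/t4", "cpu"} {
		fmt.Printf("%s -> %v\n", r, isGPUResource(r))
	}
}
```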
## 3. Configure Volcano to Use the Capacity Scheduling Plugin

Volcano provides a guide titled "How to use capacity plugin", but it is not entirely accurate: when configuring the scheduler ConfigMap, you also need to add the `reclaim` action to enable elasticity.
```yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: volcano-scheduler-configmap
  namespace: volcano-system
data:
  volcano-scheduler.conf: |
    actions: "enqueue, allocate, backfill, reclaim" # add reclaim
    tiers:
    - plugins:
      - name: priority
      - name: gang
        enablePreemptable: false
      - name: conformance
    - plugins:
      - name: drf
        enablePreemptable: false
      - name: predicates
      - name: capacity # add this field and remove the proportion plugin
      - name: nodeorder
      - name: binpack
```
Additionally, when a Pod requests resources in multiple dimensions (such as CPU, memory, and GPU), make sure each dimension stays within the queue's deserved value; otherwise the Pod may be preempted.
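With the capacity plugin enabled, each queue declares its deserved share per resource dimension, and the custom GPU resources can be quota'd individually. A sketch of such a Queue (the queue name and quantities are illustrative):

```yaml
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: team-a
spec:
  reclaimable: true
  deserved:
    cpu: "16"
    memory: 64Gi
    nvidia.com/v100: "2"
```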
