A sample `web-config.yaml` file is available in the [exporter-toolkit repository](https://github.com/prometheus/exporter-toolkit/blob/master/docs/web-config.yml). The reference for the `web-config.yaml` format is in the [docs](https://github.com/prometheus/exporter-toolkit/blob/master/docs/web-configuration.md).

### IPv6 Support

DCGM-Exporter supports IPv6 addresses for both the remote hostengine connection (`-r`) and the metrics listen address (`-a`). IPv6 addresses must use bracket notation when combined with a port.

#### Remote Hostengine (CLI)

```shell
dcgm-exporter -r "[::1]:5555"
```

#### Remote Hostengine (Environment Variable)

```shell
export DCGM_REMOTE_HOSTENGINE_INFO="[::1]:5555"
dcgm-exporter
```

#### Metrics Listen Address

```shell
dcgm-exporter -a "[::]:9400"
```

**Note:** The brackets in `[::1]:5555` are required by the DCGM connection protocol. On the command line, wrap the address in single or double quotes so that the shell does not interpret the brackets as a glob pattern.
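
The bracket and quoting rules can be sketched as a small shell helper. `format_hostport` is a hypothetical function written for illustration only; it is not part of DCGM-Exporter. It treats any host containing a colon as an IPv6 literal, and it assumes the host has not already been bracketed.

```shell
# format_hostport HOST PORT
# Hypothetical helper (not shipped with dcgm-exporter): wraps IPv6 literals
# in brackets before appending the port, and leaves other hosts unchanged.
format_hostport() {
  case "$1" in
    *:*) printf '[%s]:%s\n' "$1" "$2" ;;  # contains ':' so treat as an IPv6 literal
    *)   printf '%s:%s\n' "$1" "$2" ;;    # IPv4 address or hostname
  esac
}

format_hostport "::1" 5555        # prints [::1]:5555
format_hostport "127.0.0.1" 5555  # prints 127.0.0.1:5555

# Quoting the command substitution keeps the brackets away from the shell's
# glob expansion:
#   dcgm-exporter -r "$(format_hostport ::1 5555)"
#   dcgm-exporter -a "$(format_hostport :: 9400)"
```

Building the address in one place makes it harder to forget the brackets when a script switches between IPv4 and IPv6 targets.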

#### Prerequisites

The remote `nv-hostengine` must be configured to listen on IPv6. Refer to the [DCGM documentation](https://docs.nvidia.com/datacenter/dcgm/latest/) for configuring `nv-hostengine` bind address options.

### How to include HPC jobs in metric labels

DCGM-Exporter can include High-Performance Computing (HPC) job information in its metric labels. To achieve this, HPC environment administrators must configure their HPC environment to generate files that map GPUs to HPC jobs.
@@ -164,6 +193,10 @@ Notes:
* Always make sure your entries have 2 commas (',')
* The complete list of counters that can be collected can be found on the DCGM API reference manual: <https://docs.nvidia.com/datacenter/dcgm/latest/dcgm-api/dcgm-api-field-ids.html>
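
The two-comma rule can be checked before the exporter starts. The following is a minimal sketch, not part of DCGM-Exporter: `check_counters_csv` is a hypothetical helper, and it assumes that comment lines begin with `#` and that blank lines are allowed. It prints every remaining line that does not contain exactly two commas and returns a non-zero status if it finds one.

```shell
# check_counters_csv FILE
# Hypothetical helper (not shipped with dcgm-exporter): reports entries that
# do not contain exactly two commas.
check_counters_csv() {
  awk '
    /^[ \t]*(#|$)/ { next }                 # assumption: "#" starts a comment; skip blank lines too
    gsub(/,/, ",") != 2 {                   # gsub returns the number of commas on the line
      printf "line %d: %s\n", NR, $0
      bad = 1
    }
    END { exit bad }                        # exit status 1 if any entry was malformed
  ' "$1"
}

# Example usage with a hypothetical file name:
#   check_counters_csv counters.csv || echo "fix the entries listed above"
```

Because the check counts commas literally, it also flags entries whose description text contains a comma, which matches the rule stated in the notes.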

### Profiling Metrics

On Ampere and earlier GPU generations, profiling metrics depend on the `datacenter-gpu-manager-4-proprietary` package. This package is included in the container.

### What about a Grafana Dashboard?

You can find the official NVIDIA DCGM-Exporter dashboard here: <https://grafana.com/grafana/dashboards/12239>