cuDNN library not fully used on Windows #80

Open
@QueensGambit

Description

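The library cudnn_cnn_infer64_8.dll is not used on Windows, whereas its counterpart libcudnn_cnn_infer.so.8 is used on Linux.
This seems to cause a visible difference in nodes per second (NPS): roughly 19,000 on Ubuntu versus roughly 16,500 on Windows with the same GPU, as the logs below show.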
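
As a first sanity check, one could test whether the library in question can be found by the dynamic loader at all. The following is a minimal sketch, not part of the engine: the helper names `library_name` and `can_load` are hypothetical, only the two file names come from this issue, and any non-Windows platform is treated as Linux. It only shows whether the file is loadable from the current environment, not whether the engine actually uses it at run time.

```python
import ctypes
import sys

# File names as quoted in this issue; everything else is illustrative.
CUDNN_CNN_INFER = {
    "windows": "cudnn_cnn_infer64_8.dll",
    "linux": "libcudnn_cnn_infer.so.8",
}


def library_name(platform: str = sys.platform) -> str:
    """Return the cuDNN CNN inference library name for the given platform."""
    key = "windows" if platform.startswith("win") else "linux"
    return CUDNN_CNN_INFER[key]


def can_load(name: str) -> bool:
    """Return True if the dynamic loader can find and load the library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the file is missing or one of its dependencies is.
        return False
    return True


if __name__ == "__main__":
    name = library_name()
    status = "loadable" if can_load(name) else "not found on the loader path"
    print(f"{name}: {status}")
```

If the library is reported as not loadable on Windows, a missing entry in `PATH` for the cuDNN `bin` directory would be one possible explanation worth ruling out.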

e.g. Ubuntu 18.04:

GPU: RTX 2070 OC

isready
info string onnx file: model/chess/model-1.23453-0.572-0537-bsize-1.onnx
info string deserialize engine: model/chess/model-bsize1-fp16-0.trt
info string inputDims: (1, 39, 8, 8)
info string valueOutputDims: (1, 1)
info string policyOutputDims: (1, 4864)
info string No auxiliary outputs detected.
info string onnx file: model/chess/model-1.23453-0.572-0537-bsize-16.onnx
info string deserialize engine: model/chess/model-bsize16-fp16-0.trt
info string onnx file: model/chess/model-1.23453-0.572-0537-bsize-16.onnx
info string deserialize engine: model/chess/model-bsize16-fp16-0.trt
info string inputDims: (16, 39, 8, 8)
info string valueOutputDims: (16, 1)
info string policyOutputDims: (16, 4864)
info string No auxiliary outputs detected.
readyok
go infinite
info string create new tree
info string run mcts search
info depth 17 seldepth 28 multipv 1 score cp 47 nodes 18522 nps 18485 tbhits 0 time 1002 pv d2d4 g8f6 c2c4 e7e6 g1f3 b7b6 g2g3 c8a6 b2b3 f8b4 c1d2 b4e7 f1g2 c7c6 d2c3 d7d5 b1d2
info depth 19 seldepth 31 multipv 1 score cp 47 nodes 38347 nps 19154 tbhits 0 time 2002 pv d2d4 g8f6 c2c4 e7e6 g1f3 b7b6 g2g3 c8a6 b2b3 f8b4 c1d2 b4e7 f1g2 c7c6 d2c3 d7d5 b1d2 b8d7 e1g1
info depth 19 seldepth 37 multipv 1 score cp 47 nodes 57007 nps 18990 tbhits 0 time 3002 pv d2d4 g8f6 c2c4 e7e6 g1f3 b7b6 g2g3 c8a6 b2b3 f8b4 c1d2 b4e7 f1g2 c7c6 d2c3 d7d5 b1d2 b8d7 e1g1

GPU utilization: 91%

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.27.04    Driver Version: 460.27.04    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 207...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   48C    P2   152W / 215W |    677MiB /  7982MiB |     91%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:0B:00.0  On |                  N/A |
|  0%   54C    P2    56W / 250W |    441MiB / 11177MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

e.g. Windows 10:

GPU: RTX 2070 OC

isready
info string onnx file: model/chess/model-1.23453-0.572-0537-bsize-1.onnx
info string deserialize engine: model/chess/model-bsize1-fp16-0.trt
info string inputDims: (1, 39, 8, 8)
info string valueOutputDims: (1, 1)
info string policyOutputDims: (1, 4864)
info string No auxiliary outputs detected.
info string onnx file: model/chess/model-1.23453-0.572-0537-bsize-16.onnx
info string deserialize engine: model/chess/model-bsize16-fp16-0.trt
info string onnx file: model/chess/model-1.23453-0.572-0537-bsize-16.onnx
info string deserialize engine: model/chess/model-bsize16-fp16-0.trt
info string inputDims: (16, 39, 8, 8)
info string valueOutputDims: (16, 1)
info string policyOutputDims: (16, 4864)
info string No auxiliary outputs detected.
readyok
go infinite
info string create new tree
info string run mcts search
info depth 17 seldepth 28 multipv 1 score cp 47 nodes 16500 nps 16369 tbhits 0 time 1008 pv d2d4 g8f6 c2c4 e7e6 g1f3 b7b6 g2g3 c8a6 b2b3 f8b4 c1d2 b4e7 f1g2 c7c6 d2c3 d7d5 b1d2
info depth 19 seldepth 31 multipv 1 score cp 47 nodes 33367 nps 16584 tbhits 0 time 2012 pv d2d4 g8f6 c2c4 e7e6 g1f3 b7b6 g2g3 c8a6 b2b3 f8b4 c1d2 b4e7 f1g2 c7c6 d2c3 d7d5 b1d2 b8d7 e1g1
info depth 19 seldepth 33 multipv 1 score cp 47 nodes 50400 nps 16617 tbhits 0 time 3033 pv d2d4 g8f6 c2c4 e7e6 g1f3 b7b6 g2g3 c8a6 b2b3 f8b4 c1d2 b4e7 f1g2 c7c6 d2c3 d7d5 b1d2 b8d7 e1g1

GPU utilization: 85%

C:\Windows\System32\DriverStore\FileRepository\nv_dispui.inf_amd64_c1f8f32cc9af9677>nvidia-smi
Fri Apr  9 17:37:37 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 461.33       Driver Version: 461.33       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 207... WDDM  | 00000000:01:00.0 Off |                  N/A |
| 29%   59C    P2   141W / 215W |    845MiB /  8192MiB |     85%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108... WDDM  | 00000000:0B:00.0  On |                  N/A |
|  0%   38C    P8    17W / 250W |    692MiB / 11264MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
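
To put a number on the gap, the `nps` fields of the UCI `info` lines above can be averaged per platform. This is a rough sketch: the function names are hypothetical, the log lines are copied from this issue with the `pv` moves shortened, and three samples per platform are too few for a reliable benchmark.

```python
import re
from statistics import mean

# Matches the "nps <number>" field of a UCI info line.
NPS_PATTERN = re.compile(r"\bnps (\d+)\b")

# Info lines from the two runs above (pv shortened for readability).
LINUX_LOG = """\
info depth 17 seldepth 28 multipv 1 score cp 47 nodes 18522 nps 18485 tbhits 0 time 1002 pv d2d4
info depth 19 seldepth 31 multipv 1 score cp 47 nodes 38347 nps 19154 tbhits 0 time 2002 pv d2d4
info depth 19 seldepth 37 multipv 1 score cp 47 nodes 57007 nps 18990 tbhits 0 time 3002 pv d2d4
"""
WINDOWS_LOG = """\
info depth 17 seldepth 28 multipv 1 score cp 47 nodes 16500 nps 16369 tbhits 0 time 1008 pv d2d4
info depth 19 seldepth 31 multipv 1 score cp 47 nodes 33367 nps 16584 tbhits 0 time 2012 pv d2d4
info depth 19 seldepth 33 multipv 1 score cp 47 nodes 50400 nps 16617 tbhits 0 time 3033 pv d2d4
"""


def nps_values(log: str) -> list[int]:
    """Collect the nps value of every UCI info line that reports one."""
    values = []
    for line in log.splitlines():
        match = NPS_PATTERN.search(line)
        if line.startswith("info ") and match:
            values.append(int(match.group(1)))
    return values


def relative_drop(baseline: float, other: float) -> float:
    """Fraction by which `other` falls short of `baseline`."""
    return (baseline - other) / baseline


linux_mean = mean(nps_values(LINUX_LOG))
windows_mean = mean(nps_values(WINDOWS_LOG))
drop = relative_drop(linux_mean, windows_mean)

if __name__ == "__main__":
    print(f"Linux mean NPS:   {linux_mean:.0f}")
    print(f"Windows mean NPS: {windows_mean:.0f}")
    print(f"Windows is {drop:.1%} slower")
```

On these samples the Windows run comes out roughly 12% slower than the Ubuntu run.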

Labels: Windows (issues specific to Windows users), cuDNN (issue about the cuDNN library)
