CUDA/CuDNN related errors occur in Titan-RTX environments

Hello,

I have tried many different environment configurations,
but I still cannot get your code to run.

My GPU is a Titan RTX,
and my attempts are as follows.

I also tried running the code with CUDA 8.0 earlier, but the errors were
almost the same as with CUDA 9.0.

____________________________________________________________________________________________
1)   ---environment---
       ubuntu 18.04
       CUDA 9.0
       CuDNN 7.1
       torch 0.3.1 / 0.4.0 
==> 
error message : 
Found GPU0 TITAN RTX which requires CUDA_VERSION >= 9000 for
     optimal performance and fast startup time, but your PyTorch was compiled
     with CUDA_VERSION 8000. Please install the correct PyTorch binary
     using instructions from http://pytorch.org

  warnings.warn(incorrect_binary_warn % (d, name, 9000, CUDA_VERSION))

The process is then "Killed" when the data are loaded onto the GPU, specifically at the conv2d() call in
self.mlp[i] on line 55 of pointnet2_modules.py (in _PointnetSAModuleBase).
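
The warning above reports that the PyTorch binary was compiled with CUDA_VERSION 8000 even though CUDA 9.0 is installed on the system, so it may help to print what the installed binary was actually built against. The following is only a minimal diagnostic sketch; the helper name `describe_torch_build` is hypothetical and not part of this repository, and it only reports values without fixing anything.

```python
def describe_torch_build():
    """Report the CUDA/cuDNN versions of the installed PyTorch binary.

    Hypothetical helper for diagnosis only; it is not part of the repository.
    """
    info = {
        "torch": None,
        "built_with_cuda": None,
        "cudnn": None,
        "gpu_available": False,
        "gpu_name": None,
        "compute_capability": None,
    }
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter; return the empty report.
        return info

    info["torch"] = str(torch.__version__)
    # CUDA version the binary was compiled with (None for CPU-only builds).
    info["built_with_cuda"] = torch.version.cuda
    if torch.cuda.is_available():
        info["gpu_available"] = True
        info["cudnn"] = torch.backends.cudnn.version()
        info["gpu_name"] = torch.cuda.get_device_name(0)
        info["compute_capability"] = torch.cuda.get_device_capability(0)
    return info


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print("{}: {}".format(key, value))
```

If `built_with_cuda` differs from the system CUDA version, the binary was probably installed from a wheel built for another CUDA release, which would be consistent with the warning text.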

2)    ---environment---
       ubuntu 18.04
       CUDA 9.0
       CuDNN 7.1
       torch 0.3.1 / 0.4.1 
==>
error message :
RuntimeError: cuda runtime error (11) : invalid argument at /pytorch/aten/src/THC/THCGeneral.cpp:663

3)    ---environment---
       ubuntu 18.04
       CUDA 9.0
       CuDNN 7.1
       torch 0.3.1 / 0.4.1 

In addition, I set the following in train_cls.py:

torch.backends.cudnn.benchmark = False


==>
Traceback (most recent call last):
  File "train_cls.py", line 217, in <module>
    main()
  File "train_cls.py", line 125, in main
    train(train_dataloader, test_dataloader, model, criterion, optimizer, lr_scheduler, bnm_scheduler, args, num_batch)
  File "train_cls.py", line 167, in train
    pred = model(points)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/SSD1/dogyoon/Relation-Shape-CNN-master/models/rscnn_ssn_cls.py", line 102, in forward
    return self.FC_layer(features.squeeze(-1))
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/batchnorm.py", line 66, in forward
    exponential_average_factor, self.eps)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/functional.py", line 1251, in batch_norm
    raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
ValueError: Expected more than 1 value per channel when training, got input size [1, 512]
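
The input size [1, 512] in this ValueError means that a batch containing a single sample reached a BatchNorm layer in training mode. One possible cause (an assumption, not confirmed for this run) is that the dataset size leaves a remainder of 1 when divided by the batch size, so the last batch of an epoch holds one sample. The sketch below only illustrates that arithmetic; the function names and the sample counts are hypothetical.

```python
def batch_sizes(num_samples, batch_size, drop_last=False):
    """Return the size of every batch in one epoch.

    Follows the usual batching rule: full batches first, then one smaller
    batch with the remainder unless drop_last is set.
    """
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder and not drop_last:
        sizes.append(remainder)
    return sizes


def has_singleton_batch(num_samples, batch_size, drop_last=False):
    """True if any batch would contain exactly one sample."""
    return 1 in batch_sizes(num_samples, batch_size, drop_last)


if __name__ == "__main__":
    # Hypothetical numbers: 33 samples with batch size 32 leave one sample over.
    print(batch_sizes(33, 32))                   # [32, 1]
    print(has_singleton_batch(33, 32))           # True
    print(has_singleton_batch(33, 32, True))     # False
```

If this turns out to be the cause, passing `drop_last=True` to `torch.utils.data.DataLoader` for the training set, or choosing a batch size that does not leave a remainder of 1, should avoid the single-sample batch.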


____________________________________________________________________________________________

I really hope to find a solution to this problem as soon as possible.
Thank you very much.

