🐛 Describe the bug
For some reason, the same code works with CUDA 11.1 but fails with CUDA 11.3, which makes me strongly suspect a bug on the cuDNN side.
cuda-memcheck reports an invalid memory access in the `cudnnConvolutionForward` call:
(C:\Users\circleci\project\env) C:\Users\circleci\project\test>"c:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\cuda-memcheck.exe" pytest test_transforms_tensor.py -k test_gaussian_blur[1-meth_kwargs1
========= CUDA-MEMCHECK
================================================================================== test session starts ===================================================================================
platform win32 -- Python 3.8.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: C:\Users\circleci\project, configfile: pytest.ini
plugins: cov-3.0.0, mock-3.6.1
collected 3066 items / 3064 deselected / 2 selected
test_transforms_tensor.py .FE [100%]
========================================================================================= ERRORS =========================================================================================
______________________________________________________________ ERROR at teardown of test_gaussian_blur[1-meth_kwargs1-cuda] ______________________________________________________________
Traceback (most recent call last):
File "C:\Users\circleci\project\test\conftest.py", line 104, in prevent_leaking_rng
torch.cuda.set_rng_state(cuda_rng_state)
File "C:\Users\circleci\project\env\lib\site-packages\torch\cuda\random.py", line 64, in set_rng_state
_lazy_call(cb)
File "C:\Users\circleci\project\env\lib\site-packages\torch\cuda\__init__.py", line 155, in _lazy_call
callable()
File "C:\Users\circleci\project\env\lib\site-packages\torch\cuda\random.py", line 62, in cb
default_generator.set_state(new_state_copy)
RuntimeError: CUDA error: unspecified launch failure
======================================================================================== FAILURES ========================================================================================
________________________________________________________________________ test_gaussian_blur[1-meth_kwargs1-cuda] _________________________________________________________________________
Traceback (most recent call last):
File "C:\Users\circleci\project\test\test_transforms_tensor.py", line 963, in test_gaussian_blur
_test_class_op(
File "C:\Users\circleci\project\test\test_transforms_tensor.py", line 85, in _test_class_op
_test_transform_vs_scripted_on_batch(f, scripted_fn, batch_tensors)
File "C:\Users\circleci\project\test\test_transforms_tensor.py", line 36, in _test_transform_vs_scripted_on_batch
transformed_batch = transform(batch_tensors)
File "C:\Users\circleci\project\env\lib\site-packages\torch\nn\modules\module.py", line 1111, in _call_impl
return forward_call(*input, **kwargs)
File "c:\users\circleci\project\torchvision\transforms\transforms.py", line 1817, in forward
return F.gaussian_blur(img, self.kernel_size, [sigma, sigma])
File "c:\users\circleci\project\torchvision\transforms\functional.py", line 1326, in gaussian_blur
output = F_t.gaussian_blur(t_img, kernel_size, sigma)
File "c:\users\circleci\project\torchvision\transforms\functional_tensor.py", line 774, in gaussian_blur
img = conv2d(img, kernel, groups=img.shape[-3])
RuntimeError: CUDA error: unspecified launch failure
================================================================================ short test summary info =================================================================================
ERROR test_transforms_tensor.py::test_gaussian_blur[1-meth_kwargs1-cuda] - RuntimeError: CUDA error: unspecified launch failure
FAILED test_transforms_tensor.py::test_gaussian_blur[1-meth_kwargs1-cuda] - RuntimeError: CUDA error: unspecified launch failure
================================================================= 1 failed, 1 passed, 3064 deselected, 1 error in 35.57s =================================================================
========= Invalid __shared__ read of size 4
========= at 0x00001d10 in volta_scudnn_128x32_3dconv_fprop_xregs_large_nn_v1
========= by thread (95,0,0) in block (24,0,0)
========= Address 0x0000250c is out of bounds
========= Device Frame:volta_scudnn_128x32_3dconv_fprop_xregs_large_nn_v1 (volta_scudnn_128x32_3dconv_fprop_xregs_large_nn_v1 : 0x1d10)
========= Saved host backtrace up to driver entry point at kernel launch time
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nv_dispswi.inf_amd64_8fb2f986cb3224d8\nvcuda64.dll [0x76888]
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nv_dispswi.inf_amd64_8fb2f986cb3224d8\nvcuda64.dll [0x76bb1]
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nv_dispswi.inf_amd64_8fb2f986cb3224d8\nvcuda64.dll [0x7b0da]
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nv_dispswi.inf_amd64_8fb2f986cb3224d8\nvcuda64.dll (cuProfilerStop + 0x11cc6a) [0x33d9ea]
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nv_dispswi.inf_amd64_8fb2f986cb3224d8\nvcuda64.dll [0x17069d]
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nv_dispswi.inf_amd64_8fb2f986cb3224d8\nvcuda64.dll (cuProfilerStop + 0xf0c72) [0x3119f2]
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nv_dispswi.inf_amd64_8fb2f986cb3224d8\nvcuda64.dll [0x38bdb]
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nv_dispswi.inf_amd64_8fb2f986cb3224d8\nvcuda64.dll [0x390af]
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nv_dispswi.inf_amd64_8fb2f986cb3224d8\nvcuda64.dll [0x39394]
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nv_dispswi.inf_amd64_8fb2f986cb3224d8\nvcuda64.dll (cuLaunchKernel + 0x234) [0x20fc44]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll [0x3896]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll [0x26fd]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cask_cudnn::getPlatform + 0xe9) [0x1d54529]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cask_cudnn::transformTensor + 0x1bc1) [0x1dc0651]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cask_cudnn::transformTensor + 0xbe6a) [0x1dca8fa]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cask_cudnn::ConvDgradShader::isSplitK + 0x49b) [0x1ddcd9b]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cudnn::backend::Descriptor::initialize_internal + 0x618e) [0x5c67ce]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cudnn::backend::Descriptor::initialize_internal + 0x6eb1) [0x5c74f1]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cudnn::cnn::EngineInterface::execute + 0x7e) [0x4e163e]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cudnn::cnn::EngineContainer<1012,113664>::execute_internal_impl + 0x2a) [0x54f27a]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cudnn::cnn::EngineInterface::execute + 0x7e) [0x4e163e]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cask_cudnn::TensorDesc::operator== + 0x2d2) [0x544612]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cudnn::cnn::EngineContainer<1,4096>::execute_internal_impl + 0xd241) [0x55c4d1]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cudnn::cnn::EngineInterface::execute + 0x7e) [0x4e163e]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cudnn::backend::execute + 0x103f) [0x54eebf]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cudnn::backend::Tensor::Tensor + 0x18b6) [0x5ab246]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cudnn::backend::Tensor::Tensor + 0xbe1) [0x5aa571]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cudnn::cnn::convolutionForward + 0x10b) [0x65609b]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cudnnConvolutionForward + 0x331) [0x657081]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cuda_cpp.dll (at::native::cudnn_convolution_transpose + 0x4263) [0x48863]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cuda_cpp.dll (at::native::cudnn_convolution_transpose + 0x83a7) [0x4c9a7]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cuda_cpp.dll (at::native::cudnn_convolution_transpose + 0x7736) [0x4bd36]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cuda_cpp.dll (at::native::cudnn_convolution_transpose + 0x1ae5) [0x460e5]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cuda_cpp.dll (at::native::cudnn_convolution_transpose + 0x752f) [0x4bb2f]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cuda_cpp.dll (at::native::cudnn_convolution_add_relu + 0x16ec) [0x43fec]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cuda_cpp.dll (at::native::cudnn_convolution + 0xc5) [0x428e5]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cuda_cu.dll (at::cuda::view_as_real + 0x14adc) [0x456680c]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cuda_cu.dll (at::cuda::bucketize_outf + 0x3df7a) [0x450361a]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::_ops::cudnn_convolution::call + 0x242) [0x70175b2]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::native::_convolution + 0xf5e) [0x692064e]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::compositeexplicitautograd::xlogy_ + 0x40e) [0x72d1bee]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::compositeexplicitautograd::bmm + 0x1a1ed) [0x72975bd]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::_ops::_convolution::call + 0x2d6) [0x6d5a226]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::native::convolution + 0x164) [0x6928914]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::compositeexplicitautograd::xlogy_ + 0xc6b) [0x72d244b]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::compositeexplicitautograd::bmm + 0x1a2ca) [0x729769a]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::TensorMaker::make_tensor + 0x88e49) [0x6d40779]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::_ops::convolution::redispatch + 0x123) [0x6dc39a3]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (torch::autograd::GraphRoot::apply + 0x157b1) [0x7bfd851]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (torch::autograd::GraphRoot::apply + 0xc6c8) [0x7bf4768]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::_ops::convolution::call + 0x26f) [0x6d71b6f]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::native::conv2d + 0x1be) [0x69277be]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::compositeimplicitautograd::where + 0x1db4) [0x73ac8e4]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::compositeimplicitautograd::broadcast_to + 0x2a7a3) [0x738d953]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::_ops::conv2d::call + 0x219) [0x70ba239]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::conv2d + 0x64) [0x67106d4]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_python.dll (torch::FunctionSignature::operator= + 0x1096fc) [0x14d77c]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_python.dll (torch::FunctionSignature::operator= + 0x12f7ab) [0x17382b]
========= Host Frame:C:\Users\circleci\project\env\python38.dll (PyMethodDef_RawFastCallKeywords + 0x410) [0x126fe0]
========= Host Frame:C:\Users\circleci\project\env\python38.dll (PyObject_MakeTpCall + 0x106) [0x125fa6]
========= Host Frame:C:\Users\circleci\project\env\python38.dll (PyEval_GetFuncDesc + 0x408) [0x2036b8]
=========
...
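The cuda-memcheck report above is long, and most host frames are unsymbolized driver addresses. As a reading aid, here is a minimal sketch that condenses such a report to the error, the faulting kernel, and the host frames that carry a symbol name. The helper name `summarize_memcheck` and the regular expressions are hypothetical and assume only the line layout seen in this report; they are not part of cuda-memcheck, PyTorch, or torchvision. Note that a name printed as `symbol + offset` may be just the nearest exported symbol, so frames with large offsets should be treated as approximate.

```python
import re

# Assumed layout, taken from the report in this issue:
#   ========= Host Frame:<module path> (<symbol> + 0x<offset>) [0x<address>]
# The "(<symbol> + 0x<offset>)" part is missing on unsymbolized frames.
HOST_FRAME = re.compile(
    r"^=+\s+Host Frame:(?P<module>.+?)"
    r"(?: \((?P<symbol>.+) \+ 0x[0-9a-fA-F]+\))?"
    r" \[0x[0-9a-fA-F]+\]\s*$"
)
# e.g. "========= at 0x00001d10 in <kernel name>"
KERNEL = re.compile(r"^=+\s+at 0x[0-9a-fA-F]+ in (?P<kernel>\S+)")
# e.g. "========= Invalid __shared__ read of size 4"
ERROR = re.compile(r"^=+\s+(?P<error>Invalid .+)$")


def summarize_memcheck(report):
    """Return the first error, the first kernel, and all symbolized host frames."""
    summary = {"error": None, "kernel": None, "frames": []}
    for raw_line in report.splitlines():
        line = raw_line.strip()
        frame = HOST_FRAME.match(line)
        if frame:
            if frame.group("symbol"):
                # Keep only the file name of the DLL, not its full path.
                module = re.split(r"[\\/]", frame.group("module"))[-1]
                summary["frames"].append((module, frame.group("symbol")))
            continue
        kernel = KERNEL.match(line)
        if kernel and summary["kernel"] is None:
            summary["kernel"] = kernel.group("kernel")
            continue
        error = ERROR.match(line)
        if error and summary["error"] is None:
            summary["error"] = error.group("error")
    return summary


# A few lines copied verbatim from the report above.
SAMPLE = r"""
========= Invalid __shared__ read of size 4
========= at 0x00001d10 in volta_scudnn_128x32_3dconv_fprop_xregs_large_nn_v1
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nv_dispswi.inf_amd64_8fb2f986cb3224d8\nvcuda64.dll (cuLaunchKernel + 0x234) [0x20fc44]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll [0x3896]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll (cudnnConvolutionForward + 0x331) [0x657081]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cuda_cpp.dll (at::native::cudnn_convolution + 0xc5) [0x428e5]
========= Host Frame:C:\Users\circleci\project\env\lib\site-packages\torch\lib\torch_cpu.dll (at::conv2d + 0x64) [0x67106d4]
"""

if __name__ == "__main__":
    result = summarize_memcheck(SAMPLE)
    print(result["error"])
    print(result["kernel"])
    for module, symbol in result["frames"]:
        print(f"{module}: {symbol}")
```

Run on the full report, this keeps the chain from `at::conv2d` through `at::native::cudnn_convolution` and `cudnnConvolutionForward` down to `cuLaunchKernel`, which is the path the description refers to.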
cc @peterjc123 @nbcsm @guyang3532 @maxluk @gunandrose4u @mszhanyi @vfdev-5 @datumbox