I was running the code on A6000 GPUs with distributed training, using a large-scale outdoor point cloud dataset. With `batch_size` set to 2 per GPU, the error below occurred at the 25th epoch; after increasing `batch_size` to 3, it occurred at the 63rd epoch.
"/mnt/sdb/anaconda3/envs/pointcept/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/mnt/sdb/anaconda3/envs/pointcept/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/sdb/anaconda3/envs/pointcept/lib/python3.8/site-packages/torchsparse-2.1.0-py3.8-linux-x86_64.egg/torchsparse/nn/modules/conv.py", line 98, in forward
return F.conv3d(
File "/mnt/sdb/anaconda3/envs/pointcept/lib/python3.8/site-packages/torchsparse-2.1.0-py3.8-linux-x86_64.egg/torchsparse/nn/functional/conv/conv.py", line 92, in conv3d
kmap = F.build_kernel_map(
File "/mnt/sdb/anaconda3/envs/pointcept/lib/python3.8/site-packages/torchsparse-2.1.0-py3.8-linux-x86_64.egg/torchsparse/nn/functional/conv/kmap/build_kmap.py", line 193, in build_kernel_map
out_in_map_bwd = F.convert_transposed_out_in_map(
File "/mnt/sdb/anaconda3/envs/pointcept/lib/python3.8/site-packages/torchsparse-2.1.0-py3.8-linux-x86_64.egg/torchsparse/nn/functional/conv/hash/query.py", line 48, in convert_transposed_out_in_map
out_in_map_t = torch.full(
RuntimeError: CUDA error: invalid configuration argument
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
Originally posted by @lihc-cz in #16 (comment)
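
As the message at the bottom of the trace suggests, a first step is to rerun with synchronous kernel launches so the Python traceback points at the kernel that actually failed. A minimal sketch, assuming the training entry point is an ordinary Python script (the variable must be set before `torch` is imported, e.g. at the very top of the script):

```python
# Debugging sketch: force synchronous CUDA kernel launches so errors are
# raised at the failing call site instead of at a later, unrelated API call.
import os

os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # must run before CUDA is initialized

import torch  # noqa: E402  (deliberately imported after setting the variable)
```

"invalid configuration argument" is a kernel-launch error, so one hypothesis (an assumption on my part, not something the trace confirms) is that the size reaching `torch.full` in `convert_transposed_out_in_map` is degenerate for this batch, e.g. negative or implausibly large. A hedged way to check without editing torchsparse is to wrap `torch.full` and log the sizes passed to it:

```python
import torch

_orig_full = torch.full

def _logged_full(size, fill_value, **kwargs):
    # Print every CUDA torch.full size; a negative or huge dimension here
    # would explain an invalid kernel launch configuration.
    device = kwargs.get("device")
    if device is not None and "cuda" in str(device):
        print("torch.full size:", size, flush=True)
    return _orig_full(size, fill_value, **kwargs)

torch.full = _logged_full  # apply before training starts
```

Note that blocking launches slow training down considerably, so both snippets are only for reproducing the crash, not for regular runs.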