Description
Hi, I'm having the same problem as #174.
I have two large adjacency matrices; their details are as follows:
adj_l
SparseTensor(row=tensor([ 0, 0, 0, ..., 736388, 736388, 736388], device='cuda:2'),
col=tensor([ 145, 2215, 3205, ..., 21458, 22283, 31934], device='cuda:2'),
val=tensor([0.0909, 0.0909, 0.0909, ..., 0.1000, 0.1000, 0.1000], device='cuda:2'),
size=(736389, 59965), nnz=7505078, density=0.02%)
adj_r
SparseTensor(row=tensor([ 0, 0, 0, ..., 59962, 59963, 59964], device='cuda:2'),
col=tensor([222683, 370067, 430465, ..., 38176, 514545, 334613], device='cuda:2'),
val=tensor([0.1429, 0.1429, 0.1429, ..., 0.5000, 1.0000, 1.0000], device='cuda:2'),
size=(59965, 736389), nnz=7505078, density=0.02%)
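In case it helps with reproduction, matrices of the same shapes and nnz can be generated synthetically along the lines below (random indices and values, so this may or may not trigger the same failure; adj_r is simply taken as the transpose here for illustration):

import torch
from torch_sparse import SparseTensor

device = torch.device('cuda:2')
m, k, nnz = 736389, 59965, 7505078  # shapes and nnz from the printout above

# Synthetic stand-ins for adj_l / adj_r (random indices, uniform values).
row = torch.randint(0, m, (nnz,), device=device)
col = torch.randint(0, k, (nnz,), device=device)
val = torch.rand(nnz, device=device)
adj_l = SparseTensor(row=row, col=col, value=val, sparse_sizes=(m, k))
adj_r = adj_l.t()  # same nnz, transposed shape, matching the report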
I extract their COO representations and multiply them with the following code:
import torch
from torch_sparse import spspmm

rowA, colA, valueA = adj_l.coo()  # COO representation of adj_l
rowB, colB, valueB = adj_r.coo()  # COO representation of adj_r
indexA = torch.stack((rowA, colA))
indexB = torch.stack((rowB, colB))
indexC, valueC = spspmm(indexA, valueA, indexB, valueB,
                        adj_l.size(0), adj_l.size(1), adj_r.size(1), coalesced=True)
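As an aside, a quick bounds check along these lines (just a hypothetical diagnostic, not part of torch_sparse) could rule out out-of-range indices, which are another common cause of illegal memory accesses:

# Sanity check: every index must lie inside the declared sparse sizes.
assert int(rowA.max()) < adj_l.size(0) and int(colA.max()) < adj_l.size(1)
assert int(rowB.max()) < adj_r.size(0) and int(colB.max()) < adj_r.size(1)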
The spspmm call then fails with the following error:
CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Even with CUDA_LAUNCH_BLOCKING=1 there is no additional information. I believe this is caused by the two sparse matrices requiring too much GPU memory. Is there any way to run this multiplication on the GPU?
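If it really is a memory problem, one idea I'm considering (only a sketch, assuming SparseTensor.narrow and torch_sparse.matmul behave as documented) is to multiply in row chunks of adj_l and concatenate the results, which should lower the peak memory of any single sparse-sparse product:

import torch
from torch_sparse import SparseTensor, matmul

def chunked_spspmm(a: SparseTensor, b: SparseTensor, chunk_rows: int = 65536):
    """Compute a @ b chunk-by-chunk over the rows of `a` to reduce peak GPU memory."""
    index_parts, value_parts = [], []
    for start in range(0, a.size(0), chunk_rows):
        length = min(chunk_rows, a.size(0) - start)
        a_chunk = a.narrow(0, start, length)        # rows [start, start + length)
        c_chunk = matmul(a_chunk, b)                # sparse-sparse product for this slice
        row, col, val = c_chunk.coo()
        index_parts.append(torch.stack((row + start, col)))  # shift rows back to global ids
        value_parts.append(val)
    return torch.cat(index_parts, dim=1), torch.cat(value_parts)

Would something along these lines be expected to help, or is there a better supported way?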