Description
Hi, I'm having the same problem as #174.
I have two large adjacency matrices; their details are as follows:
adj_l
SparseTensor(row=tensor([ 0, 0, 0, ..., 736388, 736388, 736388], device='cuda:2'),
col=tensor([ 145, 2215, 3205, ..., 21458, 22283, 31934], device='cuda:2'),
val=tensor([0.0909, 0.0909, 0.0909, ..., 0.1000, 0.1000, 0.1000], device='cuda:2'),
size=(736389, 59965), nnz=7505078, density=0.02%)
adj_r
SparseTensor(row=tensor([ 0, 0, 0, ..., 59962, 59963, 59964], device='cuda:2'),
col=tensor([222683, 370067, 430465, ..., 38176, 514545, 334613], device='cuda:2'),
val=tensor([0.1429, 0.1429, 0.1429, ..., 0.5000, 1.0000, 1.0000], device='cuda:2'),
size=(59965, 736389), nnz=7505078, density=0.02%)
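In case it helps with reproduction, matrices of the same shapes and nnz can be generated synthetically along the lines below (random indices and values, so this may or may not trigger the same failure; adj_r is simply taken as the transpose here for illustration):

import torch
from torch_sparse import SparseTensor

device = torch.device('cuda:2')
m, k, nnz = 736389, 59965, 7505078  # shapes and nnz from the printout above

# Synthetic stand-ins for adj_l / adj_r (random indices, uniform values).
row = torch.randint(0, m, (nnz,), device=device)
col = torch.randint(0, k, (nnz,), device=device)
val = torch.rand(nnz, device=device)
adj_l = SparseTensor(row=row, col=col, value=val, sparse_sizes=(m, k))
adj_r = adj_l.t()  # same nnz, transposed shape, matching the report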
I extract their COO representations and multiply them with the following code:
import torch
from torch_sparse import spspmm

rowA, colA, valueA = adj_l.coo()  # COO representation of adj_l
rowB, colB, valueB = adj_r.coo()  # COO representation of adj_r
indexA = torch.stack((rowA, colA))
indexB = torch.stack((rowB, colB))
indexC, valueC = spspmm(indexA, valueA, indexB, valueB,
                        adj_l.size(0), adj_l.size(1), adj_r.size(1), coalesced=True)
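As an aside, a quick bounds check along these lines (just a hypothetical diagnostic, not part of torch_sparse) could rule out out-of-range indices, which are another common cause of illegal memory accesses:

# Sanity check: every index must lie inside the declared sparse sizes.
assert int(rowA.max()) < adj_l.size(0) and int(colA.max()) < adj_l.size(1)
assert int(rowB.max()) < adj_r.size(0) and int(colB.max()) < adj_r.size(1)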
The spspmm call then fails with the following error:
CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Even with CUDA_LAUNCH_BLOCKING=1 there is no additional information. I believe this is caused by the two sparse matrices requiring too much GPU memory. Is there any way to run this multiplication on the GPU?
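If it really is a memory problem, one idea I'm considering (only a sketch, assuming SparseTensor.narrow and torch_sparse.matmul behave as documented) is to multiply in row chunks of adj_l and concatenate the results, which should lower the peak memory of any single sparse-sparse product:

import torch
from torch_sparse import SparseTensor, matmul

def chunked_spspmm(a: SparseTensor, b: SparseTensor, chunk_rows: int = 65536):
    """Compute a @ b chunk-by-chunk over the rows of `a` to reduce peak GPU memory."""
    index_parts, value_parts = [], []
    for start in range(0, a.size(0), chunk_rows):
        length = min(chunk_rows, a.size(0) - start)
        a_chunk = a.narrow(0, start, length)        # rows [start, start + length)
        c_chunk = matmul(a_chunk, b)                # sparse-sparse product for this slice
        row, col, val = c_chunk.coo()
        index_parts.append(torch.stack((row + start, col)))  # shift rows back to global ids
        value_parts.append(val)
    return torch.cat(index_parts, dim=1), torch.cat(value_parts)

Would something along these lines be expected to help, or is there a better supported way?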