Error during model prediction: CUDA error: unspecified launch failure

Hello,

Running a SageAttention operation raises the following exception:


```
terminate called after throwing an instance of 'c10::AcceleratorError'
what():  CUDA error: unspecified launch failure
Search for `cudaErrorLaunchFailure' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Exception raised from c10_cuda_check_implementation at /pytorch/c10/cuda/CUDAException.cpp:44 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x80 (0x7fb99a97cb80 in /usr/local/lib/python3.10/dist-packages/torch/lib/libc10.so)
```

Any idea why?

Setup:
CUDA toolkit (reported by both `nvcc -V` and `nvidia-smi`): 12.6
PyTorch version: 2.9.1+cu126
sageattention version: 2.2.0
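
In case it helps with reproducing this, below is a minimal sketch of how the setup details above could be collected programmatically, together with the GPU model and compute capability, which are not listed above and may matter if kernel support depends on the GPU architecture. It also sets `CUDA_LAUNCH_BLOCKING=1` before CUDA is initialized, since CUDA errors are reported asynchronously and the stack trace may otherwise point at a later call than the one that failed. The helper name `collect_env` is hypothetical and not part of PyTorch or sageattention; the script assumes nothing about the sageattention API and only reads the installed package version.

```python
import importlib.metadata
import importlib.util
import os

# Must be set before CUDA is initialized so that kernel launches run
# synchronously and the error surfaces at the call that actually failed.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")


def collect_env():
    """Hypothetical helper: gather version and GPU details for a bug report."""
    info = {"CUDA_LAUNCH_BLOCKING": os.environ.get("CUDA_LAUNCH_BLOCKING")}

    # Installed sageattention version, if the package is present.
    try:
        info["sageattention"] = importlib.metadata.version("sageattention")
    except importlib.metadata.PackageNotFoundError:
        info["sageattention"] = None

    # PyTorch details; skipped gracefully when torch is not installed.
    if importlib.util.find_spec("torch") is None:
        info["torch"] = None
        return info

    import torch

    info["torch"] = torch.__version__
    info["torch_cuda"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["gpu"] = torch.cuda.get_device_name(0)
        info["compute_capability"] = torch.cuda.get_device_capability(0)
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

Running the failing model prediction in the same process after this setup should make the reported frame correspond to the kernel launch that failed, at the cost of slower execution.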

