Error: Failed to initialize the TMA descriptor 1

My device is a DGX Spark (GB10). When I use SageAttention3 in HunyuanVideo, the following error occurs:

```
!!!!!!!!!!!!!!!!!!!!!!!!Use Sage Attention!!!!!!!!!!!!!!!!!!!!!!!!
q.shape:  torch.Size([1, 24, 119056, 128]) torch.bfloat16
k.shape:  torch.Size([1, 24, 119056, 128]) torch.bfloat16
v.shape:  torch.Size([1, 24, 119056, 128]) torch.bfloat16
TMA Desc Addr:   0xffffeda9e500
format         7
dim            5
gmem_address   0xe76400000000
globalDim      (128,931,931,24,1)
globalStrides  (4,476672,512,443781632,2305843002684583936)
boxDim         (128,1,1,1,1)
elementStrides (1,1,1,1,1)
interleave     0
swizzle        0
l2Promotion    2
oobFill        0
Error: Failed to initialize the TMA descriptor 1
!!!!!!!!!!!!!!!!!!!!!!!!Sage Attention Done!!!!!!!!!!!!!!!!!!!!!!!!

  0%|                                                                                                | 0/50 [00:05<?, ?it/s]
Traceback (most recent call last):
  File "/home/vincent/Desktop/l00906346/HunyuanVideo/sample_video.py", line 87, in <module>
    main()
  File "/home/vincent/Desktop/l00906346/HunyuanVideo/sample_video.py", line 61, in main
    outputs = hunyuan_video_sampler.predict(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vincent/miniconda3/envs/wan2.1_lx/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/vincent/Desktop/l00906346/HunyuanVideo/hyvideo/inference.py", line 648, in predict
    samples = self.pipeline(
              ^^^^^^^^^^^^^^
  File "/home/vincent/miniconda3/envs/wan2.1_lx/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/vincent/Desktop/l00906346/HunyuanVideo/hyvideo/diffusion/pipelines/pipeline_hunyuan_video.py", line 991, in __call__
    noise_pred = self.transformer(  # For an input image (129, 192, 336) (1, 256, 256)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vincent/miniconda3/envs/wan2.1_lx/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vincent/miniconda3/envs/wan2.1_lx/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vincent/Desktop/l00906346/HunyuanVideo/hyvideo/modules/models.py", line 667, in forward
    img, txt = block(*double_block_args)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vincent/miniconda3/envs/wan2.1_lx/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vincent/miniconda3/envs/wan2.1_lx/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vincent/Desktop/l00906346/HunyuanVideo/hyvideo/modules/models.py", line 233, in forward
    self.img_mlp(
  File "/home/vincent/miniconda3/envs/wan2.1_lx/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vincent/miniconda3/envs/wan2.1_lx/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vincent/Desktop/l00906346/HunyuanVideo/hyvideo/modules/mlp_layers.py", line 54, in forward
    x = self.act(x)
        ^^^^^^^^^^^
  File "/home/vincent/miniconda3/envs/wan2.1_lx/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vincent/miniconda3/envs/wan2.1_lx/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vincent/miniconda3/envs/wan2.1_lx/lib/python3.12/site-packages/torch/nn/modules/activation.py", line 816, in forward
    return F.gelu(input, approximate=self.approximate)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.AcceleratorError: CUDA error: an illegal instruction was encountered
Search for `cudaErrorIllegalInstruction' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
```
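The traceback ends in `F.gelu`, but as the message itself notes, CUDA errors can be reported asynchronously, so the faulting kernel is probably an earlier one (possibly the attention kernel whose descriptor failed to initialize). A minimal sketch of how the run could be repeated to localize it, assuming the variable is set before the first CUDA call; the helper name is hypothetical and not part of HunyuanVideo:

```python
import os


def enable_blocking_launches() -> None:
    """Make CUDA kernel launches synchronous so errors surface at the real call site."""
    # Hypothetical helper: must run before torch initializes CUDA.
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


enable_blocking_launches()
# Import torch and call sample_video.py's main() only after this point.
```

Setting `CUDA_LAUNCH_BLOCKING=1` in the shell before launching `sample_video.py` should have the same effect.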

This error only appears with certain shapes, such as 720 x 1280 at 129 frames.
Generation works fine at 544 x 960 with 129 frames.
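The numbers in the descriptor dump can be checked with a few lines of arithmetic. The sketch below is a hypothesis, not a confirmed diagnosis: it tests whether the huge outermost `globalStrides` value is what you get when the element count `128 * 931 * 931 * 24` wraps around in a signed 32-bit integer. Assumptions, all mine: the 4-byte element size is inferred from the innermost stride, the 2^40-byte stride limit is my reading of the CUDA driver documentation for tensor map encoding, and the token-count formula (16 x 16 pixels per token, 4x temporal compression, 256 text tokens) is chosen because it reproduces the 119056 in the log. The helper names are hypothetical.

```python
import math

# Values copied from the TMA descriptor dump in the log above.
GLOBAL_DIM = (128, 931, 931, 24, 1)
GLOBAL_STRIDES = (4, 476672, 512, 443781632, 2305843002684583936)  # bytes

ELEM_BYTES = 4             # assumption: inferred from the innermost stride
INT32_MAX = 2**31 - 1
TMA_STRIDE_LIMIT = 2**40   # assumption: my reading of the driver docs


def wrap_int32(x: int) -> int:
    """Reinterpret x as a signed 32-bit integer (two's complement)."""
    x &= 0xFFFFFFFF
    return x - 2**32 if x >= 2**31 else x


def outer_stride_elems(blocks: int, heads: int = 24, head_dim: int = 128) -> int:
    """Elements spanned by one step of the outermost (batch) dimension."""
    return head_dim * blocks * blocks * heads


def seq_len(height: int, width: int, frames: int) -> int:
    """Assumed token count; matches the 119056 in the log for 720 x 1280 x 129."""
    return (height // 16) * (width // 16) * ((frames - 1) // 4 + 1) + 256


elems = outer_stride_elems(GLOBAL_DIM[1])
# Hypothesis: the element count wraps in int32, is scaled to bits in
# unsigned 64-bit arithmetic, then divided by 8 to get a byte stride.
reconstructed = ((wrap_int32(elems) * ELEM_BYTES * 8) % 2**64) // 8

print("element count exceeds int32:", elems > INT32_MAX)
print("wrapped value matches logged stride:", reconstructed == GLOBAL_STRIDES[4])
print("logged stride within limit:", GLOBAL_STRIDES[4] < TMA_STRIDE_LIMIT)
print("unwrapped stride within limit:", elems * ELEM_BYTES < TMA_STRIDE_LIMIT)

for h, w in ((720, 1280), (544, 960)):
    n = seq_len(h, w, 129)
    blocks = math.ceil(n / 128)  # 931 blocks for the failing shape
    print(h, w, n, blocks, "overflows int32:", outer_stride_elems(blocks) > INT32_MAX)
```

Under these assumptions the wrapped value reproduces the logged stride exactly, the failing shape overflows int32 and the working shape does not. That would be consistent with a 32-bit overflow in the stride computation for long sequences, but I have not verified this against the kernel source.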
