Fixed integer overflow in make_cute_packed_stride batch stride computation by a123pal · Pull Request #3307 · NVIDIA/cutlass

a123pal · 2026-06-07T06:34:00Z

Problem: make_cute_packed_stride computes the batch stride as get<0>(shape_MKL) * get<1>(shape_MKL), where both operands are int32. For large matrix dimensions (e.g. 49152 × 65536 = 3,221,225,472, which exceeds the int32 maximum of 2,147,483,647), the multiplication overflows in int32 before the cast to IntT is applied, producing a negative batch stride.
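To make the arithmetic above concrete, here is a minimal standalone sketch; it does not use CUTLASS. The helper `wrap_to_int32` and the names `rows` and `cols` are hypothetical and exist only for illustration. Because signed overflow is undefined behavior in C++, the sketch computes the exact product in 64-bit and then derives the value a two's-complement int32 multiply would typically wrap to.

```cpp
#include <cstdint>

// Hypothetical helper (not part of CUTLASS): maps an exact 64-bit value to
// the value it would take after wrapping into a two's-complement int32.
// Done entirely in 64-bit arithmetic, so no signed overflow occurs here.
constexpr std::int64_t wrap_to_int32(std::int64_t value) {
  constexpr std::int64_t two_pow_32 = std::int64_t{1} << 32;
  // Reduce into [0, 2^32), handling negative inputs as well.
  std::int64_t low = ((value % two_pow_32) + two_pow_32) % two_pow_32;
  // Values at or above 2^31 correspond to negative int32 values.
  return low >= (two_pow_32 >> 1) ? low - two_pow_32 : low;
}

// The example dimensions from the problem description.
constexpr std::int32_t rows = 49152;
constexpr std::int32_t cols = 65536;

// Exact product, computed after widening both operands.
constexpr std::int64_t exact = std::int64_t{rows} * std::int64_t{cols};

// What an int32 multiply would wrap to on a two's-complement target.
constexpr std::int64_t wrapped = wrap_to_int32(exact);

static_assert(exact == 3221225472LL, "exact product is 3 * 2^30");
static_assert(wrapped == -1073741824LL, "wrapped int32 value is negative");
```

On a two's-complement target the wrapped value is 3,221,225,472 − 4,294,967,296 = −1,073,741,824, which matches the negative batch stride described above.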

Fix: Cast each operand to IntT before multiplying so that the multiplication is carried out in IntT (64-bit) rather than in int32. Applied to both affected overloads.
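The cast-before-multiply pattern can be sketched in isolation as follows. This is not the CUTLASS source: `packed_batch_stride`, `extent0`, and `extent1` are hypothetical stand-ins for the batch-stride expression and its two int32 operands, and IntT is assumed to be a 64-bit type such as `std::int64_t`.

```cpp
#include <cstdint>

// Hypothetical stand-in for the batch-stride expression (not CUTLASS code).
// Writing static_cast<IntT>(extent0 * extent1) would multiply in int32 first
// and only widen the already-overflowed result. Casting each operand first
// makes the multiplication itself happen in IntT.
template <class IntT>
constexpr IntT packed_batch_stride(std::int32_t extent0, std::int32_t extent1) {
  return static_cast<IntT>(extent0) * static_cast<IntT>(extent1);
}

// With a 64-bit IntT, the example dimensions produce the exact product.
static_assert(packed_batch_stride<std::int64_t>(49152, 65536) == 3221225472LL,
              "batch stride is exact when computed in 64-bit");
```

Casting only one operand would also widen the multiplication through the usual arithmetic conversions; casting both, as the fix describes, makes the intent explicit.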

Fixes #3269

Commit 5fee62f: Fixed integer overflow in make_cute_packed_stride batch stride computation

a123pal wants to merge 1 commit into NVIDIA:main from a123pal:main
