Fix autograd gradient normalization for permuted and broadcast tensors #182
Summary
Fixes #68 - resolves an index out-of-bounds panic in autograd's add_grad function that occurred when computing gradients for tensors with non-monotonic permute indices and fake (broadcasted) dimensions.
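For context, the failing configuration can be modeled with a small, self-contained sketch. ShapeView below is an illustrative stand-in for the real ShapeTracker state, not the actual type, and the concrete shapes are my own example of "non-monotonic permute indices plus a fake dimension":

```rust
/// Illustrative stand-in for the shape-tracking state involved here;
/// NOT the real ShapeTracker, just a model of the failing setup.
struct ShapeView {
    dims: Vec<usize>,    // physical (underlying) dimension sizes
    indexes: Vec<usize>, // logical -> physical axis permutation
    fake: Vec<bool>,     // per-physical-axis broadcast ("fake") flags
}

fn main() {
    // Example: a tensor of shape [1, 3] broadcast to [2, 3], then permuted
    // to [3, 2]. Physical axis 0 is fake (broadcast), and the permutation
    // [1, 0] is non-monotonic (decreasing).
    let view = ShapeView {
        dims: vec![2, 3],
        indexes: vec![1, 0],
        fake: vec![true, false],
    };

    // Logical axis i maps to physical axis view.indexes[i]. Iterating in
    // logical order while indexing per-physical-axis metadata (like `fake`)
    // conflates the two index spaces as soon as the permutation is
    // non-trivial, which is the mix-up the fix in commit 3 removes.
    for (logical, &physical) in view.indexes.iter().enumerate() {
        println!(
            "logical axis {logical} -> physical axis {physical} (size {}, fake = {})",
            view.dims[physical], view.fake[physical]
        );
    }
}
```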
Remarks
I admit these fixes feel quite hacky to me. I would appreciate review, and even if they work, they may be evidence of the brittleness of the ShapeTracker/autograd logic in its current state.
I also came across another issue with permute logic during my debugging, but I will file a separate issue for that.
Commits
The fix is structured across four commits for easier cherry-picking:
1. Add tests for identifying bug in issue #68 (8c1b5a7)
Reorganized the test module structure
Added comprehensive test cases that reproduce the issue, including:
test_add_grad_decreasing_idx: The original failing case from issue #68 (worked through in the sketch after this commit's notes)
test_add_grad_high_rank: Tests gradient computation with higher-rank tensors
test_add_grad_preserves_unfaked_dim: Tests preservation of non-broadcast dimensions
Status: All new tests fail at this commit (as expected)
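To make concrete what these tests check, here is a hand-worked version of the decreasing-index scenario in plain Rust. The shapes and gradient values are illustrative choices of mine, not necessarily the ones used in the actual tests; the point is the rule the tests assert: for an add through broadcast and permute, the input's gradient is the incoming gradient un-permuted and then summed over the broadcast axis.

```rust
fn main() {
    // Forward: a has shape [1, 3], broadcast to [2, 3], then permuted to [3, 2].
    // The incoming gradient therefore arrives with shape [3, 2].
    let grad_permuted: [[f32; 2]; 3] = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]];

    // Step 1: undo the permute [1, 0] to recover the [2, 3] layout.
    let mut grad = [[0.0f32; 3]; 2];
    for i in 0..3 {
        for j in 0..2 {
            grad[j][i] = grad_permuted[i][j];
        }
    }

    // Step 2: sum over the broadcast (fake) axis 0 to get a's gradient,
    // which has shape [1, 3].
    let grad_a: Vec<f32> = (0..3).map(|i| grad[0][i] + grad[1][i]).collect();
    assert_eq!(grad_a, vec![3.0, 7.0, 11.0]);
    println!("grad for a (shape [1, 3]): {grad_a:?}");
}
```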
2. Refactor gradient normalization logic (777d4b2)
Extracted gradient normalization into normalize_grad_for_input helper function
No behavioral changes, pure refactoring
Added unit tests specifically for the normalization logic
Makes the subsequent fixes clearer and more testable
Status: Tests still fail
3. Fix add_grad fake-dim removal (Partial fix) (6e898fd)
Core fix: Changed from using logical indices (fwd.shape.indexes) to physical indices when identifying and removing fake dimensions
Now correctly collects fake underlying indices: (0..fwd.shape.len()).filter(|&idx| fwd.shape.fake[idx])
Sorts indices in descending order before removal to prevent index-shifting issues (sketched below)
Why this works: Eliminates the mixing of logical/physical index spaces that caused incorrect axis access
Status: Fixes the OOB panic from the original issue, but test_add_grad_preserves_unfaked_dim still fails because rank alignment isn't handled yet
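A minimal, self-contained sketch of the descending-order removal described in this commit; the dimension sizes and fake flags are made up for illustration, and the real code operates on the ShapeTracker rather than a bare Vec:

```rust
fn main() {
    // Underlying (physical) dims of the forward tensor; axes 0 and 2 are fake.
    let mut dims = vec![2usize, 3, 4];
    let fake = vec![true, false, true];

    // Collect fake axes in PHYSICAL index space, mirroring the fix.
    let mut fake_axes: Vec<usize> = (0..dims.len()).filter(|&i| fake[i]).collect();

    // Remove in descending order so earlier removals don't shift later indices.
    // Removing axis 0 first would leave [3, 4], and the later remove(2) would
    // then be out of bounds -- exactly the index-shifting problem this avoids.
    fake_axes.sort_unstable_by(|a, b| b.cmp(a));
    for axis in fake_axes {
        dims.remove(axis);
    }
    assert_eq!(dims, vec![3]); // only the real (unfaked) dim survives
    println!("{dims:?}");
}
```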
4. Fix broadcasted axes in autograd gradient norm (Complete fix) (4057c7e)
Handles the case where incoming gradients have fewer axes than the forward tensor
Inserts missing axes at positions where the forward tensor has fake dimensions before performing undo-permute logic
Uses grad.expand() to ensure gradient shape aligns with forward tensor shape
Remarks: I think this is the hackiest part of the changes, but it does seem necessary to pass test_add_grad_preserves_unfaked_dim (which I verified separately is valid logic in PyTorch). Unclear whether there is a better way.
Status: All tests pass
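Finally, a shape-level sketch of the rank alignment in this commit: insert a size-1 axis at every fake position of the forward tensor, then expand those axes to the forward sizes. The shapes here are illustrative, and the real code performs the expansion on the tensor via grad.expand() rather than on shape metadata:

```rust
fn main() {
    // Forward tensor: physical shape [2, 3, 4], with axis 1 fake (broadcast from size 1).
    let fwd_dims = vec![2usize, 3, 4];
    let fake = vec![false, true, false];

    // Incoming gradient is missing the broadcast axis, e.g. shape [2, 4].
    let mut grad_dims = vec![2usize, 4];

    // Insert a size-1 axis at every fake position so the ranks line up
    // before the undo-permute step.
    for (axis, &is_fake) in fake.iter().enumerate() {
        if is_fake {
            grad_dims.insert(axis, 1);
        }
    }
    assert_eq!(grad_dims, vec![2, 1, 4]);

    // Then expand the size-1 axes to the forward sizes (shown here on the
    // shape metadata only).
    let expanded: Vec<usize> = grad_dims
        .iter()
        .zip(&fwd_dims)
        .map(|(&g, &f)| if g == 1 { f } else { g })
        .collect();
    assert_eq!(expanded, vec![2, 3, 4]);
    println!("{expanded:?}");
}
```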