Add comprehensive stride coverage tests for non-contiguous tensor inputs#761
Add comprehensive stride coverage tests for non-contiguous tensor inputs#761jhelferty-nv wants to merge 8 commits into
Conversation
|
@jhelferty-nv the tests never passed on this one. do you want to merge it and see if you can solve it? or close PR if it's no longer relevant. |
…hader-slang#761) Test that PyTorch tensor views created by slicing can be correctly marshalled to Slang fixed-size array parameters (float[N]), covering both basic acceptance and gradient parity with pure PyTorch. Made-with: Cursor
…arams Cover the high-risk gaps in non-trivial tensor view handling: - float3 vector params with prefix/offset/strided sliced tensors - RWTensor write-back correctness with sliced output tensors - Tensor<float,2> with transposed (non-contiguous) input tensors - Gradient parity for float3 params with sliced tensor views Made-with: Cursor
…oat[5] - Transpose tests for float[3] and float3 params (non-unit stride in the trailing dimension that maps to array/vector components) - float[5] with prefix/offset/strided slices (verifies array marshalling generalizes beyond the float[3] case) - Tensor<float,2> with column-prefix, column-offset, and column-strided views (extends the existing transpose-only coverage) - Gradient parity tests for both float[3] and float3 with transposed inputs Made-with: Cursor
Cover remaining non-contiguous memory layout gaps: - TensorView: sliced input reads, sliced output write-back, sliced add - DiffTensorView: sliced input/output, backward pass with sliced inputs - WTensor: transpose write-back - numpy: non-contiguous arrays for float and float3 params - Remove redundant test_array5_parameter_slice Made-with: Cursor
Cover remaining medium+ risk non-contiguous memory patterns: - Negative strides (flip): all 1D and 2D param test files - Diagonal views (stride = sum of dims): 1D param tests - 3D permute (both dims non-unit stride): Tensor<float,2> tests - Zero-stride broadcast (expand): float[3], float3, Tensor<float,2> - Numpy flip and broadcast cases Made-with: Cursor
torch.flip() always returns a contiguous copy (PyTorch does not support negative strides), so flip cases were just re-testing the contiguous path. The C++ numpy marshalling layer explicitly rejects non-contiguous arrays, so those tests were targeting a missing feature. Made-with: Cursor
2f6a05a to
9a7d2be
Compare
📝 WalkthroughWalkthroughAdded comprehensive test coverage for non-contiguous, sliced, and transposed tensor handling in SlangPy across four test modules. Tests verify parameter binding, gradient parity, TensorView behavior, and write-back semantics. A new Slang Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
I've taken another stab at it, starting from scratch but inspired by the previous attempt. Reusing the PR since the idea is the same, but I'm building on tests that were checked in since. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
slangpy/tests/slangpy_tests/test_difftensorview.py (1)
171-183: Exercise sliced gradient storage in this backward case.
xis a view, butx_gradis a fresh contiguous tensor. If the backward path mishandles non-contiguous gradient destinations, this test still passes. Make the gradient buffer a slice of a larger tensor and assert the untouched positions stay zero.Suggested adjustment
- x_grad = torch.zeros(3, device="cuda", dtype=torch.float32) + x_grad_full = torch.zeros(source_size, device="cuda", dtype=torch.float32) + x_grad = slicer(x_grad_full) output = torch.zeros(3, device="cuda", dtype=torch.float32) output_grad = torch.ones(3, device="cuda", dtype=torch.float32) module.diff_square.bwds(diff_pair(x, x_grad), diff_pair(output, output_grad)) torch.cuda.synchronize() - expected_grad = 2.0 * x - assert torch.allclose( - x_grad, expected_grad, atol=1e-5 - ), f"Expected grad {expected_grad}, got {x_grad}" + expected_full = torch.zeros_like(x_full) + slicer(expected_full).copy_(2.0 * x) + assert torch.allclose( + x_grad_full, expected_full, atol=1e-5 + ), f"Expected grad {expected_full}, got {x_grad_full}"Based on learnings: When adding autograd support for new access patterns, must update
find_torch_tensors()for is_input determination,TorchAutoGradHook.backward()for gradient flow direction, and add tests.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slangpy/tests/slangpy_tests/test_difftensorview.py` around lines 171 - 183, The test currently uses a fresh contiguous x_grad so it won't catch bugs where backward writes only into contiguous buffers; change the test to make x_grad a slice of a larger tensor (e.g., big_x_grad = torch.zeros(5, device="cuda"); x_grad = big_x_grad[1:4]) and after calling module.diff_square.bwds(diff_pair(x, x_grad), diff_pair(output, output_grad)) + torch.cuda.synchronize(), assert that big_x_grad[1:4] == expected_grad and that big_x_grad[0] and big_x_grad[4] remain zero; also ensure analogous slicing for output_grad if needed. When adding autograd support for new access patterns, remember to update find_torch_tensors() (is_input determination) and TorchAutoGradHook.backward() to handle gradient flow direction for sliced/non-contiguous destinations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@slangpy/tests/slangpy_tests/test_torchintegration.py`:
- Around line 549-566: The test parameter `name` in
test_tensor_view(device_type, name, view_factory) is unused; either remove it
from the parametrization or reference it in the test body to silence the lint
warning—e.g., include it in an assertion or diagnostic (use the `name` string
when asserting non-contiguity or in a failure message) so the symbol `name` is
read, or delete `name` from the test signature and related parametrization
entries (affecting test_tensor_view and the parametrized test data that supplies
`name`).
---
Nitpick comments:
In `@slangpy/tests/slangpy_tests/test_difftensorview.py`:
- Around line 171-183: The test currently uses a fresh contiguous x_grad so it
won't catch bugs where backward writes only into contiguous buffers; change the
test to make x_grad a slice of a larger tensor (e.g., big_x_grad =
torch.zeros(5, device="cuda"); x_grad = big_x_grad[1:4]) and after calling
module.diff_square.bwds(diff_pair(x, x_grad), diff_pair(output, output_grad)) +
torch.cuda.synchronize(), assert that big_x_grad[1:4] == expected_grad and that
big_x_grad[0] and big_x_grad[4] remain zero; also ensure analogous slicing for
output_grad if needed. When adding autograd support for new access patterns,
remember to update find_torch_tensors() (is_input determination) and
TorchAutoGradHook.backward() to handle gradient flow direction for
sliced/non-contiguous destinations.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e8411be5-544e-44fc-bcad-8711aa998353
📒 Files selected for processing (5)
slangpy/tests/slangpy_tests/test_difftensorview.pyslangpy/tests/slangpy_tests/test_pytorch_gradient_parity.pyslangpy/tests/slangpy_tests/test_tensorview.pyslangpy/tests/slangpy_tests/test_torchintegration.pyslangpy/tests/slangpy_tests/test_torchintegration.slang
| def test_tensor_view( | ||
| device_type: DeviceType, | ||
| name: str, | ||
| view_factory: Callable[[torch.device], torch.Tensor], | ||
| ): | ||
| """ | ||
| Test that non-contiguous 2D tensor views work correctly when bound to | ||
| Tensor<float,2> / WTensor<float,2> parameters. | ||
|
|
||
| Covers transposed, column-sliced (prefix and offset), and column-strided | ||
| views, all of which produce non-contiguous memory layouts. | ||
| """ | ||
| module = load_test_module(device_type) | ||
|
|
||
| dev = torch.device("cuda") | ||
| a = view_factory(dev) | ||
| b = view_factory(dev) | ||
| assert not a.is_contiguous() |
There was a problem hiding this comment.
Use name in the body or drop it from the parametrized args.
Ruff is already flagging Line 551 because name is never read. Folding it into the non-contiguity assertion keeps the readable test ids and clears the warning.
Minimal fix
- assert not a.is_contiguous()
+ assert not a.is_contiguous(), f"{name} should create a non-contiguous view"🧰 Tools
🪛 Ruff (0.15.4)
[warning] 551-551: Unused function argument: name
(ARG001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@slangpy/tests/slangpy_tests/test_torchintegration.py` around lines 549 - 566,
The test parameter `name` in test_tensor_view(device_type, name, view_factory)
is unused; either remove it from the parametrization or reference it in the test
body to silence the lint warning—e.g., include it in an assertion or diagnostic
(use the `name` string when asserting non-contiguity or in a failure message) so
the symbol `name` is read, or delete `name` from the test signature and related
parametrization entries (affecting test_tensor_view and the parametrized test
data that supplies `name`).
ccummingsNV
left a comment
There was a problem hiding this comment.
I've updated the branch. Assuming tests pass, this is good to go.
Add tests verifying that non-contiguous PyTorch tensor views are correctly
handled when passed to Slang functions through the functional API. Covers
the marshalling and vectorization paths for multiple parameter types and
view operations.
Parameter types tested:
View operations tested:
Test categories:
positions are written
Motivated by #387
Summary by CodeRabbit
Release Notes
New Features
scaled_sumfunction for scaled summation operations on value arrays.Tests