lchen2331 commented Jan 21, 2026

Fixes #2667.
The issue is caused by NaN inputs reaching oneMKL: they can lead to out-of-bounds memory access, and NaN values are converted to 0, which causes the singular check to fail. This PR adds a NaN check before the oneMKL computation and returns NaN outputs when NaN inputs are detected, matching the CUDA results.

Testing:

pytest -sv test/regressions/test_linalg_solve_nan.py

Output:

test/regressions/test_linalg_solve_nan.py::TestLinalgSolveNaN::test_solve_batch_mixed_nan PASSED
test/regressions/test_linalg_solve_nan.py::TestLinalgSolveNaN::test_solve_cayley_transform PASSED
test/regressions/test_linalg_solve_nan.py::TestLinalgSolveNaN::test_solve_nan_variants PASSED

Copilot AI review requested due to automatic review settings January 21, 2026 03:28

Copilot AI left a comment

Pull request overview

This PR adds NaN input validation to prevent false singular matrix errors in MKL linear algebra operations. When NaN values are detected in inputs, the functions now return NaN outputs with appropriate default values instead of passing invalid data to oneMKL, which could cause out-of-bounds memory access.

Changes:

  • Add NaN checks in lu_solve_mkl and lu_factor_mkl functions
  • Return NaN-filled outputs when NaN inputs are detected
  • Include necessary headers for isnan and arange operations
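
One possible reading of "NaN-filled outputs with appropriate default values" for a factorization is sketched below. The function name `lu_factor_nan_fallback` is hypothetical, and both the identity pivots (1-based, as in LAPACK) and `info = 0` are assumptions suggested by the mention of `arange`, not details confirmed by this PR.

```python
import math


def lu_factor_nan_fallback(a):
    """Return placeholder (lu, pivots, info) if `a` contains NaN, else None.

    Assumed defaults: NaN-filled LU, identity pivots 1..min(m, n), info 0.
    """
    n_rows = len(a)
    n_cols = len(a[0]) if a else 0
    if not any(math.isnan(v) for row in a for v in row):
        return None  # no NaN: the caller proceeds to the real factorization
    lu = [[math.nan] * n_cols for _ in range(n_rows)]
    pivots = list(range(1, min(n_rows, n_cols) + 1))  # arange-style pivots
    info = 0  # assumed: report success rather than a singularity error
    return lu, pivots, info
```

Returning `None` for clean input keeps the guard separate from the factorization itself, so the backend call is only skipped when a NaN is actually present.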

@github-actions

Performance outliers, please check!

  • 🟡 [80%, 90%), may be fluctuations

| Category | Model | Target vs. Baseline [Eager] | Target vs. Baseline [Inductor] |
|---|---|---|---|
| torchbench_bfloat16_training | resnet18 | 0.94745 | 0.899606 |


CuiYifeng left a comment

Please also add test cases for inputs containing NaN, thanks.
