Fix sparse mask handling in softmax kernel#33814
maxnick merged 4 commits into openvinotoolkit:master from
Conversation
rkazants
left a comment
please implement tests
We ran performance and accuracy tests; the results can be found in this JIRA ticket: https://jira.devtools.intel.com/browse/CVS-179625
liubo-intel
left a comment
Hi @mangguo321: from my understanding, your changes fix a NaN data issue.
For Xattention cases, these changes make sense to skip the sparse block, and LGTM.
But I think, when we have time, it would be better to find out why these v_a/a[i] values contain NaN data. Since the v_a/a[i] values serve as input data for this kernel, they are expected to be finite under normal conditions, unless there was a computational error during the previous calculation or a mistake during data loading.
Hi @liubo-intel, the input to the softmax kernel is the output of the QK GEMM. In the sparse attention path, the sparse mask causes some blocks to be skipped, so those blocks are not written by the GEMM kernel; as a result, the corresponding regions in the output buffer remain uninitialized and their contents may decode to NaN/Inf values.
No appropriate PR description, no JIRA ticket, no tests.
Hi @rkazants, we updated the description with PR details. The JIRA ticket is already referenced in the description. We tested this change with qwen2-7b-instruct and llama-3.2-3b-instruct; the accuracy issue reported in the ticket is resolved, and no performance regression was observed. The test results are in the JIRA ticket. Please let me know if any additional information is needed.
Hi @maxnick, could you please review? Thanks!
zhangYiIntel
left a comment
LGTM. NaN + anything still equals NaN, so the setting approach is much better!
@mangguo321, could you please cover your changes with a single-layer test, either by extending the existing test configurations or by developing a new one?
A softmax kernel unit test was added to cover the code changes in this PR.
@rkazants, the dedicated unit tests were added.
No more concerns regarding PR description and tests.
65b105a
### Details:
- *Fix sparse mask handling in the softmax kernel. In the sparse attention path, the sparse mask caused some blocks to be skipped, so those blocks were not written by the GEMM kernel; as a result, the corresponding regions in the output buffer remained uninitialized and their contents could decode to NaN/Inf values.*
- *In this PR, we overwrite the skipped regions with -FLT_MAX to prevent NaN propagation and avoid incorrect computations in downstream kernels.*

### Tickets:
- *[CVS-179625](https://jira.devtools.intel.com/browse/CVS-179625)*