
[GSan][FPSan] Fix sanitizer correctness and performance #10561

Draft
jeffniu-openai wants to merge 2 commits into main from codex/gsan-pr10548

Conversation

@jeffniu-openai

@jeffniu-openai jeffniu-openai commented Jun 10, 2026

Collaborator

jeffniu-openai added a commit that referenced this pull request Jun 11, 2026
PR description written by Codex

## Summary
- Load shared-memory TCGen MMA operands directly instead of
round-tripping through global scratch.

## Stack
Merge bottom-up:
- [x] #10472
- [ ] #10473 (this PR)
- [ ] #10527
- [ ] #10532
- [ ] #10533
- [ ] #10542
- [ ] #10548
- [ ] #10561
- [ ] #10559
@jeffniu-openai jeffniu-openai force-pushed the jeffniu/fpsan-payload-sign-clear branch from df64447 to 54b6147 Compare June 11, 2026 21:47
jeffniu-openai added a commit that referenced this pull request Jun 11, 2026
Rebase draft PR #10561 onto the updated stack.
@jeffniu-openai jeffniu-openai force-pushed the jeffniu/fpsan-payload-sign-clear branch from 54b6147 to e22d426 Compare June 11, 2026 21:58
@jeffniu-openai jeffniu-openai force-pushed the jeffniu/fpsan-payload-sign-clear branch from e22d426 to 40e6e4e Compare June 11, 2026 22:05
@jeffniu-openai jeffniu-openai force-pushed the jeffniu/fpsan-payload-sign-clear branch from 40e6e4e to 8c951c6 Compare June 11, 2026 23:24
jeffniu-openai added a commit that referenced this pull request Jun 12, 2026
PR description written by Codex

Load shared MMA operands directly into their result layouts and reuse
existing scale shadows, avoiding redundant layout conversions and scale
snapshots. Also load the accumulator directly into its MMA layout.

## Stack
Merge bottom-up:
- [x] #10472
- [x] #10473
- [ ] #10527 (this PR)
- [ ] #10532
- [ ] #10533
- [ ] #10542
- [ ] #10548
- [ ] #10561
- [ ] #10559
@jeffniu-openai jeffniu-openai force-pushed the jeffniu/fpsan-payload-sign-clear branch from 8c951c6 to 2af7783 Compare June 12, 2026 01:02
jeffniu-openai added a commit that referenced this pull request Jun 12, 2026
PR description written by Codex

Reland the increased FPSan test coverage now that FPSan is faster.

## Stack
Merge bottom-up:
- [x] #10472
- [x] #10473
- [x] #10527
- [ ] #10532 (this PR)
- [ ] #10533
- [ ] #10542
- [ ] #10548
- [ ] #10561
- [ ] #10559
@jeffniu-openai jeffniu-openai force-pushed the jeffniu/fpsan-payload-sign-clear branch from 2af7783 to 74c3d89 Compare June 12, 2026 01:56
jeffniu-openai added a commit that referenced this pull request Jun 12, 2026
PR description written by Codex

Optimize the i8 decomposition by reordering the dots and eagerly combining
partial results into the accumulator on the fly to minimize register
pressure. Also include a basic subtiling heuristic that was determined
experimentally.
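
The referenced commit does not show the decomposition itself, so the following is only a minimal sketch of the general idea of folding each partial dot into the accumulator as soon as it is produced. It assumes unsigned 16-bit operands split into two 8-bit limbs. The limb width, the dot ordering, and the names `split_limbs` and `dot_u16_via_u8` are all hypothetical, and plain Python integers do not model register pressure.

```python
def split_limbs(x: int) -> tuple[int, int]:
    """Split an unsigned 16-bit value into (low, high) 8-bit limbs."""
    return x & 0xFF, (x >> 8) & 0xFF


def dot_u16_via_u8(a: list[int], b: list[int]) -> int:
    """Dot product of u16 vectors built only from 8-bit limb products.

    Hypothetical illustration: each limb-level partial dot is shifted and
    added into `acc` right away, so no list of partials is kept alive.
    """
    a_lo = [split_limbs(x)[0] for x in a]
    a_hi = [split_limbs(x)[1] for x in a]
    b_lo = [split_limbs(x)[0] for x in b]
    b_hi = [split_limbs(x)[1] for x in b]

    acc = 0
    # (lhs limbs, rhs limbs, weight shift) for the four limb pairings.
    for lhs, rhs, shift in (
        (a_lo, b_lo, 0),
        (a_lo, b_hi, 8),
        (a_hi, b_lo, 8),
        (a_hi, b_hi, 16),
    ):
        partial = sum(x * y for x, y in zip(lhs, rhs))
        acc += partial << shift  # combine eagerly into the accumulator
    return acc
```

Called on two equal-length vectors of 16-bit values, `dot_u16_via_u8` should agree with the direct sum of products. The point of the sketch is the loop shape, where only one partial is live at a time.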

## Stack
Merge bottom-up:
- [x] #10472
- [x] #10473
- [x] #10527
- [x] #10532
- [ ] #10533 (this PR)
- [ ] #10542
- [ ] #10548
- [ ] #10561
- [ ] #10559
@jeffniu-openai jeffniu-openai force-pushed the jeffniu/fpsan-payload-sign-clear branch from 74c3d89 to 93f572b Compare June 12, 2026 02:43
jeffniu-openai added a commit that referenced this pull request Jun 12, 2026
PR description written by Codex

Fix FPSan TMEM emulation for initialized scratch synchronization,
predicated stores, reduction loads, and scale-copy reinterpret views.

## Stack
Merge bottom-up:
- [x] #10472
- [x] #10473
- [x] #10527
- [x] #10532
- [x] #10533
- [ ] #10542 (this PR)
- [ ] #10548
- [ ] #10561
- [ ] #10559
@jeffniu-openai jeffniu-openai force-pushed the jeffniu/fpsan-payload-sign-clear branch from 93f572b to 071d5f5 Compare June 12, 2026 03:55
Base automatically changed from jeffniu/fpsan-payload-sign-clear to main June 12, 2026 04:41
jeffniu-openai added a commit that referenced this pull request Jun 12, 2026
PR description written by Codex

Remove redundant payload sign clears before masked multiplies so NVPTX
cannot fold them to `abs.f32`, which quiets signaling NaNs. Add a
reduced one-warp regression that fails bitwise on PR #10542’s base and
passes with the fix.
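
The hazard described above can be seen at the bit level. The sketch below treats float32 values as raw 32-bit integers and compares a pure bitwise sign clear with an absolute-value operation that quiets signaling NaNs. `abs_quieting_model` is a hypothetical stand-in for the behavior this description attributes to `abs.f32`, not a model of the actual instruction. The helper names are assumptions, and the quiet bit is taken to be mantissa bit 22, the usual IEEE 754 binary32 convention.

```python
SIGN_MASK = 0x8000_0000
EXP_MASK = 0x7F80_0000
MANT_MASK = 0x007F_FFFF
QUIET_BIT = 0x0040_0000  # most significant mantissa bit


def is_nan(bits: int) -> bool:
    """True if the 32-bit pattern encodes a NaN."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0


def is_signaling_nan(bits: int) -> bool:
    """True if the pattern is a NaN with the quiet bit clear."""
    return is_nan(bits) and (bits & QUIET_BIT) == 0


def clear_sign_bitwise(bits: int) -> int:
    """Integer-style sign clear: every payload bit is preserved."""
    return bits & ~SIGN_MASK & 0xFFFF_FFFF


def abs_quieting_model(bits: int) -> int:
    """Hypothetical abs that clears the sign and also quiets any NaN."""
    out = clear_sign_bitwise(bits)
    return out | QUIET_BIT if is_nan(out) else out


snan = 0xFF80_0001  # negative signaling NaN with payload 1
kept = clear_sign_bitwise(snan)
quieted = abs_quieting_model(snan)
print(f"{kept:#010x} signaling={is_signaling_nan(kept)}")
print(f"{quieted:#010x} signaling={is_signaling_nan(quieted)}")
```

For ordinary values the two functions agree. They differ only on signaling NaNs, which is why a bitwise regression test can tell them apart.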

## Stack
Merge bottom-up:
- [x] #10472
- [x] #10473
- [x] #10527
- [x] #10532
- [x] #10533
- [x] #10542
- [ ] #10548 (this PR)
- [ ] #10561
- [ ] #10559