Disable NVIDIA_TF32_OVERRIDE by default for better precision. #75476

Merged
wanghuancoder merged 1 commit into PaddlePaddle:develop from A-nnonymous:fix_default_gemm_prec
Oct 13, 2025
@A-nnonymous (Contributor) commented Sep 23, 2025

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

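Do not enable the TF32 policy in cuBLAS by default.
Before this change, the flag NVIDIA_TF32_OVERRIDE defaulted to 1 in cuBLAS and similar libraries, which brings in TF32 to speed up FP32 at the cost of precision.
Behavior is now aligned with torch; reference link:

Monitoring results: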
8c80be03a73f52bbf7c6f4fc9

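This avoids the large precision loss caused by dropping 13 mantissa bits in FP32 GEMM, or in linear layers without bias. Finer-grained type control will be needed in follow-up work.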
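To make the 13-bit figure concrete: FP32 stores 23 explicit mantissa bits and TF32 keeps 10, so 13 low bits of each input are discarded. The sketch below emulates that in plain Python by masking the FP32 bit pattern. It is an illustration only, not code from this PR; the helper names are hypothetical, and it assumes pure truncation (as the description words it), whereas the rounding actually applied by hardware or cuBLAS may differ.

```python
import struct


def f32_bits(x: float) -> int:
    """Return the IEEE-754 binary32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b: int) -> float:
    """Inverse of f32_bits: interpret a 32-bit pattern as a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def truncate_to_tf32(x: float) -> float:
    """Zero the 13 lowest mantissa bits, leaving 10 (hypothetical helper).

    Assumption: truncation toward zero; real TF32 conversion may round.
    """
    return bits_f32(f32_bits(x) & 0xFFFFE000)


# 1 + 2**-11 is exact in FP32, but the 2**-11 bit is among the 13 dropped.
lost = truncate_to_tf32(1.0 + 2**-11)
# 1 + 2**-10 uses the lowest mantissa bit that TF32 still keeps.
kept = truncate_to_tf32(1.0 + 2**-10)

# Relative error of a truncated input stays below 2**-10.
x = bits_f32(f32_bits(3.14159))
rel_err = abs(truncate_to_tf32(x) - x) / x
```

Under this model each input carries a relative error of up to about 2**-10 (roughly 1e-3), compared with about 2**-24 for round-to-nearest FP32, which is why results of FP32 GEMM can shift visibly when TF32 is in use.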

pcard-93348
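For readers who want the same effect outside of this change, a minimal sketch follows. It assumes the documented meaning of the NVIDIA environment variable (`NVIDIA_TF32_OVERRIDE=0` tells NVIDIA libraries not to use TF32 for FP32 work). The function name is hypothetical and this is not the PR's actual diff. It also assumes the variable is set before any CUDA library is loaded; exporting it in the shell before launching the process is the safer route.

```python
import os


def default_tf32_override_off(env=None):
    """Set NVIDIA_TF32_OVERRIDE=0 unless a value is already present.

    Hypothetical helper: an explicit user setting is left untouched.
    Returns the value that is in effect afterwards.
    """
    if env is None:
        env = os.environ
    return env.setdefault("NVIDIA_TF32_OVERRIDE", "0")


# Demonstration on plain dicts so the real environment is not modified.
unset_env = {}
user_env = {"NVIDIA_TF32_OVERRIDE": "1"}
result_unset = default_tf32_override_off(unset_env)
result_user = default_tf32_override_off(user_env)
```

Using `setdefault` keeps an explicit user choice intact, so someone who wants TF32 speed can still opt in by exporting the variable themselves.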

@A-nnonymous changed the title from "Disable CUBLAS TF32 for default for better precision." to "Disable NVIDIA_TF32_OVERRIDE by default for better precision." Sep 23, 2025
@A-nnonymous (Contributor Author)

/re-run all-failed

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@2dfc418). Learn more about missing BASE report.

Additional details and impacted files
@@             Coverage Diff             @@
##             develop    #75476   +/-   ##
===========================================
  Coverage           ?   100.00%           
===========================================
  Files              ?         1           
  Lines              ?         2           
  Branches           ?         0           
===========================================
  Hits               ?         2           
  Misses             ?         0           
  Partials           ?         0           

@wanghuancoder (Contributor) left a comment
LGTM

@wanghuancoder wanghuancoder merged commit fcf3c3f into PaddlePaddle:develop Oct 13, 2025
119 of 131 checks passed
SigureMo pushed a commit to cattidea/Paddle that referenced this pull request Oct 14, 2025
A-nnonymous added a commit to A-nnonymous/Paddle that referenced this pull request Oct 17, 2025
phlrain pushed a commit that referenced this pull request Oct 19, 2025
#75907)

* Revert "Disable CUBLAS TF32 for default for better precision. (#75476)"

This reverts commit fcf3c3f.

* Update __init__.py

test=document_fix