[Precision Depth Alignment] paddle.nn.functional.gelu #75605

Merged
wanghuancoder merged 1 commit into PaddlePaddle:develop from cszdrg:gelu on Oct 13, 2025

Conversation

@cszdrg (Contributor) commented Sep 29, 2025

PR Category

Operator Mechanism

PR Types

New features

Description

Location of the PyTorch formula: pytorch/aten/src/ATen/native/cuda/ActivationGeluKernel.cu

Formula as described in the official documentation:
[screenshot of the documented formula, 2025-10-13]

The Paddle backward (gradient) formulas, before and after the change:

  • Backward, tanh approximation

$\displaystyle kAlpha = 2/\sqrt{\pi} \times 1/\sqrt{2}$
$\displaystyle kBeta = kAlpha \times GELUCONSTANT \times 3$
$\displaystyle cubeX = x^3$
$\displaystyle tanhOut = \tanh(kAlpha \times (GELUCONSTANT \times cubeX + x))$
$$ansA = \tfrac{1}{2}\Big[(1 + tanhOut) + (1 - tanhOut^2)\times (x\times kAlpha + cubeX\times kBeta)\Big]$$

--- changed to --->

$\displaystyle kAlpha = \sqrt{2} \times 2/\sqrt{\pi} \times 0.5$
$\displaystyle kBeta = GELUCONSTANT$
$\displaystyle xSq = x^2$
$\displaystyle cubeX = x^3$
$\displaystyle tanhOut = \tanh(kAlpha \times (kBeta \times cubeX + x))$
$$ansB = \tfrac{1}{2}(1 + tanhOut) + \tfrac{1}{2}\,x\,(1 - tanhOut^2)\times kAlpha \times (1 + 3\,kBeta\,xSq)$$
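The two gradient forms above are algebraically identical (in the old form, `kBeta = kAlpha × GELUCONSTANT × 3` folds the regrouped factor in). A small Python sketch to sanity-check this, assuming `GELU_CONSTANT = 0.044715` (the usual tanh-GELU constant; names here are illustrative, not the kernel's):

```python
import math

GELU_CONSTANT = 0.044715             # usual tanh-GELU constant (assumed)
kAlpha = math.sqrt(2.0 / math.pi)    # = (2/sqrt(pi)) * (1/sqrt(2))

def grad_old(x):
    # Old Paddle grouping: kBeta folds in GELU_CONSTANT * 3
    kBeta = kAlpha * GELU_CONSTANT * 3.0
    cube_x = x ** 3
    tanh_out = math.tanh(kAlpha * (GELU_CONSTANT * cube_x + x))
    return 0.5 * ((1.0 + tanh_out)
                  + (1.0 - tanh_out ** 2) * (x * kAlpha + cube_x * kBeta))

def grad_new(x):
    # PyTorch-style grouping: kAlpha * (1 + 3 * C * x^2)
    x_sq = x * x
    cube_x = x ** 3
    tanh_out = math.tanh(kAlpha * (GELU_CONSTANT * cube_x + x))
    return (0.5 * (1.0 + tanh_out)
            + 0.5 * x * (1.0 - tanh_out ** 2)
              * kAlpha * (1.0 + 3.0 * GELU_CONSTANT * x_sq))

for x in (-3.0, -0.5, 0.0, 0.7, 2.5):
    assert abs(grad_old(x) - grad_new(x)) < 1e-12
```

The regrouping does not change the math; it changes the order of floating-point operations, which is what matters for bit-wise alignment with PyTorch.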

  • Backward, exact (no approximation)

$kBeta = \left(\frac{2}{\sqrt{\pi}}\right)\left(\frac{1}{\sqrt{2}}\right)\times 0.5$
$cdf = \mathrm{normcdf}(x)$
$pdf = \exp\big(-0.5\times x\times x\big)\times kBeta$

--- changed to --->

$kBeta = \left(\frac{2}{\sqrt{\pi}}\right)\left(\frac{1}{\sqrt{2}}\right)\times 0.5$
$kAlpha = \frac{1}{\sqrt{2}}$
$cdf = \tfrac{1}{2}\big(1 + \mathrm{erf}(x \times kAlpha)\big)$
$pdf = \exp\big(-0.5\times x\times x\big)\times kBeta$
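Replacing `normcdf` with the erf form is exact, since $\Phi(x) = \tfrac{1}{2}(1 + \mathrm{erf}(x/\sqrt{2}))$, and the constant $kBeta$ equals $1/\sqrt{2\pi}$, the normalizing constant of the standard normal PDF. A quick check (using Python's `math.erf`/`math.erfc`; `normcdf` is reconstructed here via `erfc` for reference):

```python
import math

kAlpha = 1.0 / math.sqrt(2.0)
kBeta = (2.0 / math.sqrt(math.pi)) * (1.0 / math.sqrt(2.0)) * 0.5

def normcdf(x):
    # Standard normal CDF, via the complementary error function
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def cdf_erf(x):
    # erf-based form used after the change
    return 0.5 * (1.0 + math.erf(x * kAlpha))

def pdf(x):
    # Standard normal PDF, as in the kernel formula
    return math.exp(-0.5 * x * x) * kBeta

for x in (-2.0, -0.3, 0.0, 1.0, 3.5):
    assert abs(normcdf(x) - cdf_erf(x)) < 1e-14
# kBeta is exactly the PDF's normalizing constant 1/sqrt(2*pi)
assert abs(kBeta - 1.0 / math.sqrt(2.0 * math.pi)) < 1e-15
```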

The Paddle forward formulas, before and after the change:

  • Forward, tanh approximation

$\displaystyle kAlpha = \sqrt{1/2} \times (2/\sqrt{\pi})$
$\displaystyle tanhOut = \tanh\big(kAlpha \times x \times (1 + GELUCONSTANT \times x^2)\big)$
$\displaystyle out = x \times \tfrac{1}{2}(1 + tanhOut)$

--- changed to --->

$\displaystyle kAlpha = \sqrt{2} \times (2/\sqrt{\pi}) \times 0.5$
$\displaystyle tanhOut = \tanh\big(kAlpha \times (x + GELUCONSTANT \times x^3)\big)$
$\displaystyle out = \tfrac{1}{2}\,x\,(1 + tanhOut)$
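As with the gradient, the two forward groupings are mathematically the same, since $kAlpha \cdot x \cdot (1 + C x^2) = kAlpha \cdot (x + C x^3)$; only the floating-point evaluation order differs. A sketch, again assuming `GELU_CONSTANT = 0.044715`:

```python
import math

GELU_CONSTANT = 0.044715           # usual tanh-GELU constant (assumed)
kAlpha = math.sqrt(2.0 / math.pi)  # sqrt(1/2)*(2/sqrt(pi)) == sqrt(2)*(2/sqrt(pi))*0.5

def gelu_tanh_old(x):
    # Old grouping: kAlpha * x * (1 + C * x^2)
    t = math.tanh(kAlpha * x * (1.0 + GELU_CONSTANT * x * x))
    return x * 0.5 * (1.0 + t)

def gelu_tanh_new(x):
    # PyTorch grouping: kAlpha * (x + C * x^3)
    t = math.tanh(kAlpha * (x + GELU_CONSTANT * x * x * x))
    return 0.5 * x * (1.0 + t)

for x in (-4.0, -1.0, 0.0, 0.5, 3.0):
    assert abs(gelu_tanh_old(x) - gelu_tanh_new(x)) < 1e-12
```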

  • Forward, exact (no approximation)

$\displaystyle out = x \times \mathrm{normcdf}(x)$

--- changed to --->

$\displaystyle kAlpha = 1/\sqrt{2}$
$\displaystyle out = x \times \tfrac{1}{2}\big(1 + \mathrm{erf}(x \times kAlpha)\big)$
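Again the change is exact: $x \cdot \Phi(x) = x \cdot \tfrac{1}{2}(1 + \mathrm{erf}(x/\sqrt{2}))$. A quick Python check of the exact-forward forms (`normcdf` reconstructed via `erfc`):

```python
import math

kAlpha = 1.0 / math.sqrt(2.0)

def gelu_exact_old(x):
    # x * Phi(x), with Phi via the complementary error function
    return x * 0.5 * math.erfc(-x / math.sqrt(2.0))

def gelu_exact_new(x):
    # erf-based form after the change
    return x * 0.5 * (1.0 + math.erf(x * kAlpha))

for x in (-3.0, -0.7, 0.0, 1.2, 4.0):
    assert abs(gelu_exact_old(x) - gelu_exact_new(x)) < 1e-14
```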

The forward and backward GELU formulas were changed so that the computation matches PyTorch step for step; with these changes in place, all tests pass.
[screenshot of the passing test results, 2025-09-29]

@paddle-bot commented Sep 29, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@cszdrg (Contributor, Author) commented Sep 30, 2025

/re-run all-failed

@A-nnonymous (Contributor) left a comment

LGTM

@zrr1999 (Member) commented Oct 13, 2025

Could you list the specific formula changes in the PR description?

@wanghuancoder (Contributor) left a comment

LGTM

@wanghuancoder wanghuancoder merged commit 31f801d into PaddlePaddle:develop Oct 13, 2025
101 of 106 checks passed
SigureMo pushed a commit to cattidea/Paddle that referenced this pull request Oct 14, 2025
zhengshengning pushed a commit to zhengshengning/Paddle that referenced this pull request Oct 24, 2025
zhengshengning pushed a commit to zhengshengning/Paddle that referenced this pull request Oct 24, 2025
zhengshengning added a commit that referenced this pull request Oct 27, 2025
* CallScalarFunction uses the dtype of 'self' as the type of 'other' when optype is 'div' (#75237)

* LinspaceKernel uses the dtype of 'self' as the type of 'step' when tensor is floating (#75238)

* align LinspaceKernel

* update meta

* update gpu kernel

* fix LinspaceKernelInner

* improve kernel

* fix CudaSigmoidGradFunctor and CudaSiluGradFunctor (#75341)

* Softplus accuracy and torch alignment 1 (#75363)

* [Precision Depth Alignment] paddle.tan reverse calculation: dx = dout *(1 + tan(x)^2) (#75335)

* Tan reverse calculation: dx = dout *(1 + tan(x)^2)

* [Precision Depth Alignment] Add support for CUDNN to paddle.nn.functional.grid_sample to align with torch accuracy.  (#75355)

* accuracy_stable_grid_sample

* fix

* correlation supports big tensor (#75383)

* fix

* fix test

* fix

* paddle.tanh Grad and torch alignment (float16) (#75454)

* [Precision Depth Alignment] paddle.sin and paddle.cos aligns with torch precision. (#75503)

* accuracy_stable_sin

* accuracy_stable_cos

* [Precision Depth Alignment] Divide (#75379)

* fix

* fix

* fix

* fix

* fix

* [Precision Depth Alignment] fix precision for float16 of paddle.tan backward (#75525)

* fix precision for float16 of paddle.tan backward

* fix else branch of CudaTanGradFunctor

* [Precision Depth Alignment] fix precision for  paddle.expm1 (#75549)

* accuracy_stable_expm1

* fix

* Big tensor investigation and fixes [Paddle/paddle/phi/kernels/funcs] (#75523)

* fix

* fix

* [Precision Depth Alignment]  fix beta and threshold of paddle.nn.functional.softplus  to double (#75426)

* fix beta and threshold of Softplus to double

* fix test_softplus_activation_fuse_pass v1

* fix test_activation_zero

* fix flaot of SoftplusDoubleGradKernel to double

* add op_patches for softplus

* add yaml for ops/yaml/legacy

* fix infershape/operator for FLOAT64

* fix

* add SoftPlusOpTranscriber

* fix

* fix

* fix1

* fix2

* fix coverage

* fix coverage2

* fix (#75605)

* [Precision Depth Alignment] dot (#75717)

* fix

* fix

* fix dcu

* [Precision Depth Alignment]  paddle.log aligns with torch precision (#75799)

* accuracy_stable_log

* accuracy_stable_log

* fix

* fix

* fix

* fix

* fix5

* [Precision Depth Alignment] fix eps of paddle.logit from float to double (#75816)

* accuracy_stable_logit

* add LogitOpTranscriber

* fix coverage

* fix 0yaml

* [Precision Depth Alignment] paddle.log_sigmoid (#75898)

* accuracy_stable_log_sigmoid

* fix test_activation_stride_op.py

* [Precision Depth Alignment] Modify the negative_slope parameter of the paddle.nn.functional.leaky_relu API to double (#75547)

* [big tensor] Paddle/paddle/phi/kernels/funcs gpuBigtensor (#75856)

* fix funcs

* gpu

* fix

* fix

* Update PADDLE_ENFORCE messages

* fix cpu error

* fix dcu

* fix dcu

* fix

* [Fix] log sigmoid complex (#75953)

* feature: Add specialized LogSigmoidFunctor and CudaLogSigmoidFunctor for complex numbers

This commit introduces specialized implementations of LogSigmoidFunctor and CudaLogSigmoidFunctor to handle complex number inputs. The new implementations utilize direct formulas for improved accuracy and stability in calculations involving complex types.

* refactor: Optimize LogSigmoidFunctor and CudaLogSigmoidFunctor for complex types by caching exp(-x) to reduce redundant computations. This change enhances performance while maintaining accuracy in calculations.

* refactor: modified the formula in LogSigmoidFunctor to make it numerical stable

---------

Co-authored-by: Zhan Rongrui <46243324+zrr1999@users.noreply.github.com>
Co-authored-by: 正在学习 <62892980+cszdrg@users.noreply.github.com>
Co-authored-by: Bvicii <98971614+scyyh11@users.noreply.github.com>
