[Fix] fix IndexElementwiseGet kernel CUDA error(700) on 0-size input by DanielSun11 · Pull Request #78251 · PaddlePaddle/Paddle

DanielSun11 · 2026-03-10T11:11:17Z

PR Types

Operator Mechanism

PR Category

Bug fixes

Description

Background

Using advanced integer indexing (list-of-list) in __getitem__ on a Tensor whose first dimension is 0 triggers CUDA error(700) (illegal memory access) on GPU or a segfault on CPU:

import paddle
x = paddle.zeros([0, 5, 4, 3], dtype='complex128')
out = x[[[2, 3, 4], [1, 2, 5]]]  # GPU: CUDA error(700); CPU: SIGSEGV
out.backward(paddle.zeros_like(out))  # CPU backward: SIGSEGV

Root cause analysis

Call chain: __getitem__ → tensor__getitem_dygraph → ApplyGetitem → AdvancedIndex → index_elementwise_get_ad_func → IndexElementwiseGetKernel

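The AdvancedIndex constructor builds src_sizes by replacing the indexed dimension with the shape of the index. For example, indexing dimension 0 of x.shape=[0,5,4,3] with [[2,3,4],[1,2,5]] (shape=[2,3]) gives src_sizes = [2, 3, 5, 4, 3] (numel=360).

Forward kernel (GPU): out->numel() = 360 != 0, so the existing if (out->numel() == 0) return; check does not fire. Meanwhile x.numel() = 0, so x.data<T>() returns nullptr, and the kernel reads from nullptr + offset → CUDA error(700).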

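Forward kernel (CPU): same as the GPU case; the CPU implementation also lacks a check for empty input.

Backward kernel (GPU): x_grad->numel() = 0; after GpuMemsetAsync zero-fills the buffer, execution continues and dereferences a null pointer.

Backward kernel (CPU): x_grad->numel() = 0, so dev_ctx.Alloc<T>(x_grad) returns a null pointer, and EigenVector<T>::Flatten(*x_grad) operates on that null pointer → SIGSEGV.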
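
The shape substitution described in this analysis can be sketched in plain Python. This is only an illustration of the arithmetic, not Paddle's AdvancedIndex code; the helper name replace_indexed_dim is hypothetical. It shows why the output element count is non-zero even though the input tensor is empty, which is what lets execution slip past an output-only emptiness check.

```python
from math import prod


def replace_indexed_dim(x_shape, index_shape, dim=0):
    """Hypothetical helper: swap dimension `dim` of x_shape for index_shape."""
    return list(x_shape[:dim]) + list(index_shape) + list(x_shape[dim + 1:])


x_shape = [0, 5, 4, 3]   # first dimension is 0, so x holds no elements
index_shape = [2, 3]     # shape of [[2, 3, 4], [1, 2, 5]]

src_sizes = replace_indexed_dim(x_shape, index_shape, dim=0)
x_numel = prod(x_shape)      # element count of the input
out_numel = prod(src_sizes)  # element count of the output

# The input is empty while the output is not, so a guard that only tests
# the output's element count never triggers.
print(src_sizes, x_numel, out_numel)
```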

Fix

Add an early-return check for empty input in the four kernel files:

GPU forward (index_elementwise_get_kernel.cu): when x.numel() == 0, zero-fill the output with GpuMemsetAsync and return
CPU forward (index_elementwise_get_kernel.cc): when x.numel() == 0, zero-fill the output with memset and return
GPU backward (index_elementwise_get_grad_kernel.cu): when x_grad->numel() == 0, return immediately
CPU backward (index_elementwise_get_grad_kernel.cc): add if (x_grad->numel() == 0) return; after Alloc<T> and before the Eigen operations
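
The actual kernels are C++/CUDA. As a rough illustration of the forward guard's control flow only, here is a pure-Python model that gathers slices along dimension 0 of a flat buffer. The function gather_dim0 and its guard flag are hypothetical and do not correspond to any Paddle API; Python raises IndexError where the real kernel would read through a null pointer.

```python
from math import prod


def gather_dim0(x_flat, x_shape, indices, guard=True):
    """Hypothetical model of the forward kernel: gather slices along dim 0."""
    row = prod(x_shape[1:])         # elements in one dim-0 slice
    out_numel = len(indices) * row  # derived from the index, not from x

    # Early return modelled on the fix: an empty input has nothing to read,
    # so hand back a zero-filled output of the expected size.
    if guard and prod(x_shape) == 0:
        return [0] * out_numel

    out = []
    for i in indices:
        base = i * row
        for k in range(row):
            # On an empty x_flat this read is out of bounds.
            out.append(x_flat[base + k])
    return out


# Empty input, same index values as the reproduction case (flattened).
zeros = gather_dim0([], [0, 5, 4, 3], [2, 3, 4, 1, 2, 5])

# Non-empty input behaves as an ordinary gather.
rows = gather_dim0([0, 1, 2, 3, 4, 5], [3, 2], [2, 0])
```

Calling gather_dim0 with guard=False on the empty input raises IndexError, the Python analogue of the illegal access described in the root cause analysis.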

New unit tests

test/legacy_test/test_index_elementwise.py: adds TestIndexElementwiseGet0SizeInput, covering dtypes such as complex128, bool, float32, float64, int64 and float16, with positive and negative indices and 1-D indices (9 test methods)
test/legacy_test/test_index_elementwise_grad.py: adds TestIndexElementwiseGet0SizeInputGrad, covering the backward pass for float32, float64 and negative indices (3 test methods)

All 12 new test methods (9 forward, 3 backward) pass on CPU and GPU.

Does this change numerical precision?

No

paddle-bot · 2026-03-10T11:11:24Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

codecov-commenter · 2026-03-10T15:09:33Z

Codecov Report

❌ Patch coverage is 25.00000% with 3 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@ae907b8). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
...le/phi/kernels/cpu/index_elementwise_get_kernel.cc	33.33%	2 Missing ⚠️
...i/kernels/cpu/index_elementwise_get_grad_kernel.cc	0.00%	1 Missing ⚠️

❌ Your patch status has failed because the patch coverage (25.00%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             develop   #78251   +/-   ##
==========================================
  Coverage           ?   25.00%           
==========================================
  Files              ?        2           
  Lines              ?        4           
  Branches           ?        0           
==========================================
  Hits               ?        1           
  Misses             ?        3           
  Partials           ?        0



DanielSun11 · 2026-04-01T12:14:47Z

A similar fix already exists in #78453.

[Fix] fix IndexElementwiseGet kernel crash on 0-size input tensor

04260e4

DanielSun11 added 3 commits March 20, 2026 15:22

[Fix] fix IndexElementwiseGetGradKernel CPU segfault on 0-size input

81505bb

Add x_grad->numel() == 0 early-return in CPU backward kernel, analogous to the GPU backward kernel fix. When x.numel()==0, Alloc<T> returns a null pointer; using EigenVector::Flatten on it causes SIGSEGV.

fix ut

043132c

Merge branch 'fix/index-elementwise-get-0size' of https://github.com/…

8fffc02

…DanielSun11/Paddle into fix/index-elementwise-get-0size

DanielSun11 closed this Apr 1, 2026

DanielSun11 wants to merge 4 commits into PaddlePaddle:develop from DanielSun11:fix/index-elementwise-get-0size
