[Fix] fix IndexElementwiseGet kernel CUDA error(700) on 0-size input #78251

Closed
DanielSun11 wants to merge 4 commits into PaddlePaddle:develop from DanielSun11:fix/index-elementwise-get-0size

@DanielSun11 DanielSun11 commented Mar 10, 2026

PR Types

Operator Mechanism

PR Category

Bug fixes

Description

Background

When advanced integer indexing (list-of-list) is used in __getitem__ on a Tensor whose first dimension is 0, it triggers CUDA error(700) (illegal memory access) on GPU or a segfault on CPU:

import paddle
x = paddle.zeros([0, 5, 4, 3], dtype='complex128')
out = x[[[2, 3, 4], [1, 2, 5]]]  # GPU: CUDA error(700); CPU: SIGSEGV
out.backward(paddle.zeros_like(out))  # CPU backward: SIGSEGV

Root cause analysis

Call chain: __getitem__ → tensor__getitem_dygraph → ApplyGetitem → AdvancedIndex → index_elementwise_get_ad_func → IndexElementwiseGetKernel

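The AdvancedIndex constructor replaces the indexed dimension with the index shape to obtain src_sizes. For example, indexing dimension 0 of x.shape=[0,5,4,3] with [[2,3,4],[1,2,5]] (shape=[2,3]) gives src_sizes = [2, 3, 5, 4, 3] (numel=360).

Forward kernel (GPU): out->numel() = 360 != 0, so the existing if (out->numel() == 0) return; check does not trigger; x.numel() = 0, so x.data<T>() returns nullptr; the kernel then accesses nullptr + offset → CUDA error(700)

Forward kernel (CPU): same as above; the CPU implementation also lacks a check for empty input

Backward kernel (GPU): x_grad->numel() = 0; after GpuMemsetAsync zero-fills the buffer, execution continues and accesses a null pointer

Backward kernel (CPU): x_grad->numel() = 0, so dev_ctx.Alloc<T>(x_grad) returns a null pointer, and EigenVector<T>::Flatten(*x_grad) operates on that null pointer → SIGSEGV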
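The shape mismatch behind all four failures can be reproduced with plain shape arithmetic. The sketch below is not Paddle code: replaced_sizes is a hypothetical helper that mimics how the description above says AdvancedIndex derives src_sizes, assuming a single indexed dimension.

```python
from math import prod


def replaced_sizes(x_shape, index_shape, dim):
    """Hypothetical helper: swap the indexed dimension for the index shape."""
    return list(x_shape[:dim]) + list(index_shape) + list(x_shape[dim + 1:])


x_shape = [0, 5, 4, 3]
index_shape = [2, 3]  # shape of [[2, 3, 4], [1, 2, 5]]

src_sizes = replaced_sizes(x_shape, index_shape, dim=0)
x_numel = prod(x_shape)      # number of elements actually stored in x
out_numel = prod(src_sizes)  # number of elements the output expects

print(src_sizes)       # [2, 3, 5, 4, 3]
print(x_numel == 0)    # True: there is no input data to read
print(out_numel > 0)   # True: a guard on the output size alone never fires
```

The point of the sketch is that the output size says nothing about whether the input is empty, so a guard has to look at the input (forward) or at x_grad (backward) directly.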

Fix

Add an early-return check for empty input in four kernel files:

  1. GPU forward (index_elementwise_get_kernel.cu): when x.numel() == 0, zero-fill the output with GpuMemsetAsync and return
  2. CPU forward (index_elementwise_get_kernel.cc): when x.numel() == 0, zero-fill the output with memset and return
  3. GPU backward (index_elementwise_get_grad_kernel.cu): when x_grad->numel() == 0, return immediately
  4. CPU backward (index_elementwise_get_grad_kernel.cc): add if (x_grad->numel() == 0) return; after Alloc<T> and before the Eigen operations
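The guard pattern in the four items above can be illustrated with a toy model in pure Python. This is only a sketch under stated assumptions: index_get_forward and index_get_backward are hypothetical stand-ins for the kernels, flat lists stand in for device buffers, and None stands in for the null pointer returned for an empty tensor.

```python
from math import prod


def index_get_forward(x_data, x_shape, out_shape, offsets):
    """Toy forward kernel: gather x_data[offset] into each output slot.

    x_data is None when x has no elements (models x.data<T>() == nullptr).
    """
    out = [0] * prod(out_shape)   # allocated, zero-filled output buffer
    if prod(x_shape) == 0:        # guard: there is nothing to read from
        return out                # output stays zero-filled
    for i, off in enumerate(offsets):
        out[i] = x_data[off]
    return out


def index_get_backward(x_shape, out_grad, offsets):
    """Toy backward kernel: scatter-add out_grad into x_grad."""
    x_grad = [0] * prod(x_shape)
    if len(x_grad) == 0:          # guard: no gradient slots to write
        return x_grad
    for g, off in zip(out_grad, offsets):
        x_grad[off] += g
    return x_grad


# Empty input: the offsets point past the (non-existent) data, but the
# guards return before any element is touched.
empty_out = index_get_forward(None, [0, 2], [2, 2], offsets=[4, 5, 2, 3])
empty_grad = index_get_backward([0, 2], [1, 1, 1, 1], offsets=[4, 5, 2, 3])

# Non-empty input: the normal gather and scatter paths are unchanged.
full_out = index_get_forward([10, 20, 30, 40], [2, 2], [2, 2], offsets=[2, 3, 0, 1])
full_grad = index_get_backward([2, 2], [1, 1, 1, 1], offsets=[2, 3, 0, 1])
```

In this model, removing either guard makes the empty-input calls fail (indexing None, or writing past an empty list), which is the Python analogue of the null-pointer access described in the root cause analysis.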

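New unit tests

  • test/legacy_test/test_index_elementwise.py: adds TestIndexElementwiseGet0SizeInput, covering dtypes including complex128, bool, float32, float64, int64, and float16, with positive and negative indices as well as 1-D indices (9 test methods)
  • test/legacy_test/test_index_elementwise_grad.py: adds TestIndexElementwiseGet0SizeInputGrad, covering the backward pass for float32, float64, and negative indices (3 test methods)

All 32 newly added test methods pass (CPU + GPU).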

Does this change affect numerical precision


paddle-bot bot commented Mar 10, 2026

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.


codecov-commenter commented Mar 10, 2026

Codecov Report

❌ Patch coverage is 25.00000% with 3 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@ae907b8). Learn more about missing BASE report.

Files with missing lines                                 Patch %   Lines
...le/phi/kernels/cpu/index_elementwise_get_kernel.cc    33.33%    2 Missing ⚠️
...i/kernels/cpu/index_elementwise_get_grad_kernel.cc    0.00%     1 Missing ⚠️

❌ Your patch status has failed because the patch coverage (25.00%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop   #78251   +/-   ##
==========================================
  Coverage           ?   25.00%           
==========================================
  Files              ?        2           
  Lines              ?        4           
  Branches           ?        0           
==========================================
  Hits               ?        1           
  Misses             ?        3           
  Partials           ?        0           


Add x_grad->numel() == 0 early-return in CPU backward kernel, analogous
to the GPU backward kernel fix. When x.numel()==0, Alloc<T> returns a
null pointer; using EigenVector::Flatten on it causes SIGSEGV.
@DanielSun11

#78453 already provides a similar fix.

@DanielSun11 DanielSun11 closed this Apr 1, 2026