Fix/im2col cpu by scyyh11 · Pull Request #75731 · PaddlePaddle/Paddle

scyyh11 · 2025-10-10T22:33:08Z

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

This PR fixes a critical SIGBUS / EXC_BAD_ACCESS crash in phi::funcs::im2col_common on macOS (Apple Silicon) when running convolutions on large tensors (e.g., PaddleOCR). The crash manifests at _platform_memset called from im2col_common, indicating an out-of-bounds write due to incorrect linear index computation.

Root cause: intermediate multiplications for col_idx/im_idx used 32-bit integers. With large dimensions, these products overflow, yielding invalid target pointers and causing OOB writes that reliably crash on macOS/ARM64.

Why the previous fix wasn’t enough: PR #75261 optimized im2col_common by reordering the logic to check bounds before computing/reading im_idx, which addressed one class of UB but did not widen the index arithmetic to 64-bit. Thus, on very large shapes, 32-bit overflow still occurs in the general path and can lead to OOB despite the reordered checks.

Relation to #75716: PR #75716 fixes unsafe memcpy over-reads in the specialized fast path (im2col_sh1sw1dh1dw1ph1pw1) by adding clamping/zero-fill and negative-size protections. That PR is complementary but does not cover the general im2col_common path where this crash occurs.

Fixes included in this PR:

Promote all relevant dimension variables and linear index arithmetic to 64-bit (int64_t) to eliminate overflow.
Preserve the “check bounds first, then compute/use index” sequencing introduced previously.

本 PR 修复了在 macOS（Apple Silicon）上，phi::funcs::im2col_common 在大尺寸卷积（如 PaddleOCR 场景）下出现的 SIGBUS / EXC_BAD_ACCESS 崩溃。崩溃点位于 im2col_common 调用的 _platform_memset，表明线性索引计算错误导致了越界写入。

根因： col_idx / im_idx 的中间乘法使用了 32 位整数，在大尺寸下发生溢出，生成错误的目标指针，进而写出界；在 macOS/ARM64 上会稳定复现崩溃。

为何之前的修复还不够： PR #75261 对 im2col_common 做了“先做越界检查，再计算/读取 im_idx”的顺序优化，解决了其中一类未定义行为，但并未将索引运算提升为 64 位。因此在超大形状下，通用路径仍可能发生 32 位溢出，即使有边界检查也可能因为溢出导致目标地址错误，从而越界写。

与 #75716 的关系： PR #75716 修复了 快速路径（im2col_sh1sw1dh1dw1ph1pw1）中的不安全 memcpy 越界读取问题（加入了大小裁剪、零填充与负数保护），该修复与本 PR 互补，但不覆盖本次崩溃所在的 通用 im2col_common 路径。

本 PR 的修复点：

将相关维度与线性索引计算统一提升为 64 位 (int64_t)，彻底避免溢出。
延续“先检查边界，再计算/使用索引”的安全顺序。

Issue

鲲鹏920 Arm64 使用 OCR V5解析pdf或者图片报错：Segmentation fault is detected by the operating system PaddleOCR#16609

- Add bounds clamping for all memcpy operations in the specialized fast path - Add zero-fill for shortfall cases to ensure complete output tensor coverage - Maintain performance by using memcpy when safe, falling back to element-wise operations only when necessary

…1dh1dw1ph1pw1 - Fix unsafe memcpy in NCHW path when filter_width == 1 - Prevent negative size_t conversion when output_width < plw + prw - Clamp copy size to available source span (im_width) to avoid over-read - Add zero-fill for shortfall cases to ensure complete output coverage

- Convert dimensions to 64-bit integers to avoid overflow during calculations - Update index calculations for col and im arrays to use 64-bit arithmetic - Ensure safe access to tensor data by checking bounds before indexing

paddle-bot · 2025-10-10T22:33:13Z

你的PR提交成功，感谢你对开源项目的贡献!
请关注后续CI自动化测试结果，详情请参考Paddle-CI手册。
Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

scyyh11 · 2025-10-11T07:29:42Z

/re-run all-failed

* fix: prevent memcpy over-read in im2col_sh1sw1dh1dw1ph1pw1 NCHW branches - Add bounds clamping for all memcpy operations in the specialized fast path - Add zero-fill for shortfall cases to ensure complete output tensor coverage - Maintain performance by using memcpy when safe, falling back to element-wise operations only when necessary * fix: prevent memcpy over-read in filter_width==1 case of im2col_sh1sw1dh1dw1ph1pw1 - Fix unsafe memcpy in NCHW path when filter_width == 1 - Prevent negative size_t conversion when output_width < plw + prw - Clamp copy size to available source span (im_width) to avoid over-read - Add zero-fill for shortfall cases to ensure complete output coverage * fix: enhance im2col_common to prevent overflow in arithmetic operations - Convert dimensions to 64-bit integers to avoid overflow during calculations - Update index calculations for col and im arrays to use 64-bit arithmetic - Ensure safe access to tensor data by checking bounds before indexing

yuanlehome · 2025-10-15T09:02:06Z

Good job!

* fix: prevent memcpy over-read in im2col_sh1sw1dh1dw1ph1pw1 NCHW branches - Add bounds clamping for all memcpy operations in the specialized fast path - Add zero-fill for shortfall cases to ensure complete output tensor coverage - Maintain performance by using memcpy when safe, falling back to element-wise operations only when necessary * fix: prevent memcpy over-read in filter_width==1 case of im2col_sh1sw1dh1dw1ph1pw1 - Fix unsafe memcpy in NCHW path when filter_width == 1 - Prevent negative size_t conversion when output_width < plw + prw - Clamp copy size to available source span (im_width) to avoid over-read - Add zero-fill for shortfall cases to ensure complete output coverage * fix: enhance im2col_common to prevent overflow in arithmetic operations - Convert dimensions to 64-bit integers to avoid overflow during calculations - Update index calculations for col and im arrays to use 64-bit arithmetic - Ensure safe access to tensor data by checking bounds before indexing Co-authored-by: Bvicii <98971614+scyyh11@users.noreply.github.com>

scyyh11 added 3 commits October 9, 2025 15:33

paddle-bot Bot added the contributor External developers label Oct 10, 2025

scyyh11 marked this pull request as ready for review October 10, 2025 23:14

scyyh11 mentioned this pull request Oct 10, 2025

鲲鹏920 Arm64 使用 OCR V5解析pdf或者图片报错：Segmentation fault is detected by the operating system PaddlePaddle/PaddleOCR#16609

Closed

3 tasks

luotao1 added the HappyOpenSource 快乐开源活动issue与PR label Oct 11, 2025

luotao1 assigned luotao1 and lshpku Oct 11, 2025

lshpku approved these changes Oct 11, 2025

View reviewed changes

lshpku merged commit 91a4c15 into PaddlePaddle:develop Oct 11, 2025
84 of 86 checks passed

scyyh11 deleted the fix/im2col_cpu branch October 13, 2025 10:14

This was referenced Oct 16, 2025

Bus error when using korean language with paddleocr on macOS CLI PaddlePaddle/PaddleOCR#16055

Closed

[WeeklyReports] 2025.10.13~2025.10.24 第二次周报收集 PFCCLab/Starter#710

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix/im2col cpu#75731

Fix/im2col cpu#75731
lshpku merged 3 commits intoPaddlePaddle:developfrom
scyyh11:fix/im2col_cpu

scyyh11 commented Oct 10, 2025 •

edited

Loading

Uh oh!

paddle-bot Bot commented Oct 10, 2025

Uh oh!

scyyh11 commented Oct 11, 2025

Uh oh!

Uh oh!

yuanlehome commented Oct 15, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

scyyh11 commented Oct 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Category

PR Types

Description

Issue

Uh oh!

paddle-bot Bot commented Oct 10, 2025

Uh oh!

scyyh11 commented Oct 11, 2025

Uh oh!

Uh oh!

yuanlehome commented Oct 15, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

scyyh11 commented Oct 10, 2025 •

edited

Loading