Skip to content

fix(docparser): prevent exception escape in doConvertFile from aborti…#62

Merged
deepin-bot[bot] merged 1 commit into
linuxdeepin:masterfrom
pppanghu77:master
Jun 23, 2026
Merged

fix(docparser): prevent exception escape in doConvertFile from aborti…#62
deepin-bot[bot] merged 1 commit into
linuxdeepin:masterfrom
pppanghu77:master

Conversation

@pppanghu77

@pppanghu77 pppanghu77 commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

…ng host process

  • Move createParser() call and nullptr check inside the try block in doConvertFile() so parser constructor exceptions are also caught
  • Replace throwing std::logic_error with a log + return {} for unsupported extensions, aligning with tryConvertWithTruncation()
  • Prevents std::logic_error from escaping DocParser::convertFile, which previously triggered std::terminate -> abort in callers without try-catch (e.g. dde-file-manager TextExtractor::extract)
  • Add binary test file test_crash_unsupported_ext.dat to reproduce/regress the unsupported-extension crash path

修复(docparser): 防止 doConvertFile 中异常逃逸导致宿主进程崩溃

  • 将 createParser() 调用和 nullptr 检查移入 doConvertFile() 的 try 块内,确保解析器构造期异常也被捕获
  • 不支持的扩展名由抛出 std::logic_error 改为打印日志并返回空串,与 tryConvertWithTruncation() 写法保持一致
  • 防止 std::logic_error 从 DocParser::convertFile 逃逸,原先会在无 try-catch 的调用方(如 dde-file-manager 的 TextExtractor::extract)触发 std::terminate -> abort
  • 新增二进制测试文件 test_crash_unsupported_ext.dat,用于复现/回归不支持扩展名的崩溃路径

Log: 修复 doConvertFile 中异常逃逸导致宿主进程 std::terminate -> abort
的问题,将不支持扩展名的处理由抛异常改为返回空串,并把 createParser 调用纳入 try 保护
Task: https://pms.uniontech.com/task-view-391293.html

Summary by Sourcery

Handle unsupported document file extensions and parser construction errors in DocParser::doConvertFile without crashing the host process.

Bug Fixes:

  • Avoid std::logic_error escaping from DocParser::doConvertFile by returning an empty string for unsupported file extensions instead of throwing.
  • Ensure exceptions thrown during parser creation in doConvertFile are caught by moving createParser and its null check inside the try block.

…ng host process

  - Move createParser() call and nullptr check inside the try block in doConvertFile() so parser constructor exceptions are
  also caught
  - Replace throwing std::logic_error with a log + return {} for unsupported extensions, aligning with
  tryConvertWithTruncation()
  - Prevents std::logic_error from escaping DocParser::convertFile, which previously triggered std::terminate -> abort in
  callers without try-catch (e.g. dde-file-manager TextExtractor::extract)
  - Add binary test file test_crash_unsupported_ext.dat to reproduce/regress the unsupported-extension crash path

  修复(docparser): 防止 doConvertFile 中异常逃逸导致宿主进程崩溃

  - 将 createParser() 调用和 nullptr 检查移入 doConvertFile() 的 try 块内,确保解析器构造期异常也被捕获
  - 不支持的扩展名由抛出 std::logic_error 改为打印日志并返回空串,与 tryConvertWithTruncation() 写法保持一致
  - 防止 std::logic_error 从 DocParser::convertFile 逃逸,原先会在无 try-catch 的调用方(如 dde-file-manager 的
  TextExtractor::extract)触发 std::terminate -> abort
  - 新增二进制测试文件 test_crash_unsupported_ext.dat,用于复现/回归不支持扩展名的崩溃路径

  Log: 修复 doConvertFile 中异常逃逸导致宿主进程 std::terminate -> abort
  的问题,将不支持扩展名的处理由抛异常改为返回空串,并把 createParser 调用纳入 try 保护
Task: https://pms.uniontech.com/task-view-391293.html
@sourcery-ai

sourcery-ai Bot commented Jun 22, 2026

Copy link
Copy Markdown
Reviewer's guide (collapsed on small PRs)

Reviewer's Guide

Adjusts DocParser::doConvertFile to handle unsupported file extensions and parser construction failures without throwing, ensuring exceptions don’t escape and adding logging instead of logic_error, plus adds a binary test input to cover the unsupported-extension crash path.

Sequence diagram for updated DocParser::doConvertFile exception handling

sequenceDiagram
    actor HostProcess
    participant TextExtractor as TextExtractor::extract
    participant DocParser as DocParser::convertFile
    participant doConvertFile as doConvertFile
    participant Factory as createParser
    participant Parser as FileExtension

    HostProcess ->> TextExtractor: extract(filename)
    TextExtractor ->> DocParser: convertFile(filename)
    DocParser ->> doConvertFile: doConvertFile(filename, suffix)
    activate doConvertFile

    doConvertFile ->> Factory: createParser(filename, suffix)
    alt Unsupported_extension
        Factory -->> doConvertFile: nullptr
        doConvertFile ->> doConvertFile: std::cout << "Unsupported file extension"
        doConvertFile -->> DocParser: return {}
        DocParser -->> TextExtractor: return {}
    else Supported_extension
        Factory -->> doConvertFile: Parser instance
        doConvertFile ->> Parser: convert()
        alt [exception in createParser or convert]
            doConvertFile -->> DocParser: return {}
            DocParser -->> TextExtractor: return {}
        else [no exception]
            Parser -->> doConvertFile: m_text
            doConvertFile -->> DocParser: return m_text
            DocParser -->> TextExtractor: return m_text
        end
    end
    deactivate doConvertFile
Loading

File-Level Changes

Change Details Files
Handle unsupported file extensions and parser construction inside a try block so exceptions don’t escape and callers don’t abort.
  • Moved createParser() invocation and resulting nullptr handling into the existing try block within doConvertFile().
  • Changed the unsupported-extension path from throwing std::logic_error to logging a message to std::cout and returning an empty string.
  • Kept document->convert() and return of document->m_text inside the try block to ensure any thrown exceptions are caught at this level.
src/docparser.cpp

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • Using std::cout for the unsupported extension path may not integrate with existing logging/diagnostic mechanisms; consider routing this through the project’s logging facility instead.
  • Changing unsupported extensions from throwing to returning {} now makes it indistinguishable from a successfully parsed but empty document; if callers need to differentiate these cases, consider returning an explicit error indicator or status alongside the string.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Using `std::cout` for the unsupported extension path may not integrate with existing logging/diagnostic mechanisms; consider routing this through the project’s logging facility instead.
- Changing unsupported extensions from throwing to returning `{}` now makes it indistinguishable from a successfully parsed but empty document; if callers need to differentiate these cases, consider returning an explicit error indicator or status alongside the string.

## Individual Comments

### Comment 1
<location path="src/docparser.cpp" line_range="294" />
<code_context>
     try {
+        std::unique_ptr<fileext::FileExtension> document = createParser(filename, suffix);
+        if (!document) {
+            std::cout << "Unsupported file extension: " << filename << std::endl;
+            return {};
+        }
</code_context>
<issue_to_address>
**suggestion:** Avoid writing directly to std::cout in a lower-level conversion helper; prefer a logging facility or std::cerr, or make logging configurable.

This ties a low-level parsing routine to user-facing stdout, which can be ignored or reserved for structured output in GUIs, services, or libraries. Prefer a logging facility, std::cerr, or delegating logging to the caller, potentially behind a debug/verbose flag, so normal use doesn’t emit unexpected console output.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread src/docparser.cpp
try {
std::unique_ptr<fileext::FileExtension> document = createParser(filename, suffix);
if (!document) {
std::cout << "Unsupported file extension: " << filename << std::endl;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Avoid writing directly to std::cout in a lower-level conversion helper; prefer a logging facility or std::cerr, or make logging configurable.

This ties a low-level parsing routine to user-facing stdout, which can be ignored or reserved for structured output in GUIs, services, or libraries. Prefer a logging facility, std::cerr, or delegating logging to the caller, potentially behind a debug/verbose flag, so normal use doesn’t emit unexpected console output.

@deepin-ci-robot

Copy link
Copy Markdown

deepin pr auto review


★ 总体评分:85分

■ 【总体评价】

代码通过调整异常处理逻辑有效修复了不支持文件格式时的崩溃问题,但使用了不规范的日志输出方式
逻辑正确且提升了系统稳定性,但因使用std::cout代替标准日志框架扣15分

■ 【详细分析】

  • 1.语法逻辑(完全正确)✓
    将createParser的调用及空指针检查移入try块内部,语法无误。返回空字符串{}替代抛出异常,符合C++标准语法,且有效避免了异常逃逸导致的std::terminate调用
    潜在问题:无
    建议:保持当前控制流逻辑
  • 2.代码质量(一般)✕
    注释详尽解释了修改动机,非常有助于后续维护。但在底层解析库中使用std::cout进行错误输出不符合UOS系统级代码的日志规范,会导致日志无法被统一的日志管理系统正确捕获、过滤和分级
    潜在问题:错误信息直接输出到标准输出,可能干扰调用方(如dde-file-manager)依赖stdout获取解析文本的正常数据流
    建议:引入统一的日志宏(如qWarning()或spdlog::warn())替换std::cout,将错误信息输出到标准错误流或系统日志
  • 3.代码性能(高效)✓
    修改仅涉及控制流和作用域的调整,未引入额外的内存分配、锁竞争或复杂的系统调用,性能开销为零
    建议:无需优化
  • 4.代码安全(存在0个安全漏洞)✓
    漏洞对比统计:新增漏洞 0 个,减少漏洞 1 个,持平 0 个
    原代码抛出包含文件名的异常可能被上层未捕获导致进程崩溃(拒绝服务),修改后安全返回空串,消除了此风险。虽然使用std::cout输出了文件名,但在原异常抛出时同样会暴露文件名,故未引入新的信息泄露风险
    建议:在后续优化中将std::cout替换为安全的日志接口,防止文件路径被非授权的终端读取者获取

■ 【改进建议代码示例】

diff --git a/src/docparser.cpp b/src/docparser.cpp
index 203f6a1..8b1b2c3 100644
--- a/src/docparser.cpp
+++ b/src/docparser.cpp
@@ -283,12 +283,18 @@ static std::string doConvertFile(const std::string &filename, std::string suffix
     std::transform(suffix.begin(), suffix.end(), suffix.begin(),
                    [](unsigned char c) { return std::tolower(c); });
 
+    // createParser() 的调用和 nullptr 检查都必须在 try 块内:
+    // 1) 构造期可能抛异常(如某些解析器在打开损坏文件时);
+    // 2) 不支持的扩展名不应抛异常——上层调用方(如 dde-file-manager 的
+    //    TextExtractor::extract)可能没有 try-catch,异常一旦逃逸会触发
+    //    std::terminate -> abort。此处返回空串,与 tryConvertWithTruncation 一致。
     try {
+        std::unique_ptr<fileext::FileExtension> document = createParser(filename, suffix);
+        if (!document) {
+            qWarning() << "Unsupported file extension:" << QString::fromStdString(filename);
+            return {};
+        }
+
         document->convert();
         // Use move semantics to avoid copying
         return std::move(document->m_text);

@pppanghu77

Copy link
Copy Markdown
Contributor Author

deepin pr auto review

★ 总体评分:85分
■ 【总体评价】

代码通过调整异常处理逻辑有效修复了不支持文件格式时的崩溃问题,但使用了不规范的日志输出方式
逻辑正确且提升了系统稳定性,但因使用std::cout代替标准日志框架扣15分

■ 【详细分析】

  • 1.语法逻辑(完全正确)✓
    将createParser的调用及空指针检查移入try块内部,语法无误。返回空字符串{}替代抛出异常,符合C++标准语法,且有效避免了异常逃逸导致的std::terminate调用
    潜在问题:无
    建议:保持当前控制流逻辑
  • 2.代码质量(一般)✕
    注释详尽解释了修改动机,非常有助于后续维护。但在底层解析库中使用std::cout进行错误输出不符合UOS系统级代码的日志规范,会导致日志无法被统一的日志管理系统正确捕获、过滤和分级
    潜在问题:错误信息直接输出到标准输出,可能干扰调用方(如dde-file-manager)依赖stdout获取解析文本的正常数据流
    建议:引入统一的日志宏(如qWarning()或spdlog::warn())替换std::cout,将错误信息输出到标准错误流或系统日志
  • 3.代码性能(高效)✓
    修改仅涉及控制流和作用域的调整,未引入额外的内存分配、锁竞争或复杂的系统调用,性能开销为零
    建议:无需优化
  • 4.代码安全(存在0个安全漏洞)✓
    漏洞对比统计:新增漏洞 0 个,减少漏洞 1 个,持平 0 个
    原代码抛出包含文件名的异常可能被上层未捕获导致进程崩溃(拒绝服务),修改后安全返回空串,消除了此风险。虽然使用std::cout输出了文件名,但在原异常抛出时同样会暴露文件名,故未引入新的信息泄露风险
    建议:在后续优化中将std::cout替换为安全的日志接口,防止文件路径被非授权的终端读取者获取

■ 【改进建议代码示例】

diff --git a/src/docparser.cpp b/src/docparser.cpp
index 203f6a1..8b1b2c3 100644
--- a/src/docparser.cpp
+++ b/src/docparser.cpp
@@ -283,12 +283,18 @@ static std::string doConvertFile(const std::string &filename, std::string suffix
     std::transform(suffix.begin(), suffix.end(), suffix.begin(),
                    [](unsigned char c) { return std::tolower(c); });
 
+    // createParser() 的调用和 nullptr 检查都必须在 try 块内:
+    // 1) 构造期可能抛异常(如某些解析器在打开损坏文件时);
+    // 2) 不支持的扩展名不应抛异常——上层调用方(如 dde-file-manager 的
+    //    TextExtractor::extract)可能没有 try-catch,异常一旦逃逸会触发
+    //    std::terminate -> abort。此处返回空串,与 tryConvertWithTruncation 一致。
     try {
+        std::unique_ptr<fileext::FileExtension> document = createParser(filename, suffix);
+        if (!document) {
+            qWarning() << "Unsupported file extension:" << QString::fromStdString(filename);
+            return {};
+        }
+
         document->convert();
         // Use move semantics to avoid copying
         return std::move(document->m_text);

不是qt库,默认采用std输出

@deepin-ci-robot

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: max-lvs, pppanghu77

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@pppanghu77

Copy link
Copy Markdown
Contributor Author

/forcemerge

@deepin-bot

deepin-bot Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

This pr force merged! (status: unstable)

@deepin-bot deepin-bot Bot merged commit 3838e80 into linuxdeepin:master Jun 23, 2026
21 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants