
[Bug] Potential DoS via Unbounded zlib.decompress (Zip Bomb Vulnerability) in Document Parser #22101

Description

@QiuYucheng2003

Bug Description

A security vulnerability exists in the document parsing logic, specifically in the get_text_from_section method. Compressed data streams are decompressed with zlib.decompress(data, -15), which places no limit on the size of the decompressed output.

This exposes the application to Decompression Bomb (Zip Bomb) attacks. An attacker can provide a maliciously crafted document (e.g., a manipulated HWP file) whose small compressed payload expands to several gigabytes. Because zlib.decompress builds the entire decompressed output in memory, parsing such a file can exhaust available RAM, leading to an Out of Memory (OOM) crash and Denial of Service (DoS) for the host system or agent workflow.

Code Location:
base.py -> get_text_from_section method:
    unpacked_data = (
        zlib.decompress(data, -15) if self.is_compressed(load_file) else data
    )
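
One possible mitigation, sketched here under stated assumptions rather than proposed as the project's fix, is to replace the one-shot call with a decompression object and pass `max_length` so that the output is capped. The helper name `bounded_raw_inflate` and the `MAX_UNPACKED_BYTES` value are hypothetical and do not come from the codebase. The right limit depends on the largest legitimate section the parser should accept.

```python
import zlib

# Hypothetical cap (not from the codebase); tune to the largest legitimate section.
MAX_UNPACKED_BYTES = 50 * 1024 * 1024


def bounded_raw_inflate(data: bytes, limit: int = MAX_UNPACKED_BYTES) -> bytes:
    """Inflate a raw deflate stream (wbits=-15), rejecting output above `limit` bytes."""
    decompressor = zlib.decompressobj(-15)
    # Request at most limit + 1 bytes: enough to detect an oversized stream
    # without ever materializing the full expanded payload.
    unpacked = decompressor.decompress(data, limit + 1)
    if len(unpacked) > limit:
        raise ValueError(f"decompressed section exceeds {limit} bytes")
    # zlib.decompress raises on truncated input; mirror that behaviour here.
    if not decompressor.eof:
        raise ValueError("incomplete or truncated deflate stream")
    return unpacked
```

With such a helper, the call site could become `bounded_raw_inflate(data) if self.is_compressed(load_file) else data`. A caller would then need to decide whether a `ValueError` should skip the section or abort parsing the document.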

Version

main branch

Steps to Reproduce

(Identified via static code analysis; the reproduction steps below are theoretical.)

  1. Supply a malicious composite document containing a highly compressed, repetitive byte-stream section (Zip bomb payload).

  2. Invoke the reader/parser to process the document.

  3. Observe zlib.decompress attempting to hold the entire uncompressed payload in memory at once, exhausting available RAM and triggering the OS OOM killer.
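
The expansion behaviour in the steps above can be illustrated safely at a small scale. The following is a minimal sketch, not a working exploit: it compresses 10 MiB of zero bytes as a raw deflate stream (the same `wbits=-15` framing the parser uses) and reports the ratio. The helper name `make_raw_deflate` and the payload size are assumptions chosen for illustration. Embedding such a stream in an actual composite document is not shown.

```python
import zlib


def make_raw_deflate(payload: bytes) -> bytes:
    """Compress `payload` as a raw deflate stream (no zlib header or checksum)."""
    compressor = zlib.compressobj(level=9, wbits=-15)
    return compressor.compress(payload) + compressor.flush()


# 10 MiB of zeros: large enough to show the ratio, small enough to be harmless.
payload = bytes(10 * 1024 * 1024)
packed = make_raw_deflate(payload)

# This is the unbounded call pattern from the issue, applied to a benign input.
unpacked = zlib.decompress(packed, -15)

ratio = len(unpacked) / len(packed)
print(f"compressed: {len(packed)} bytes, expanded: {len(unpacked)} bytes, ratio: {ratio:.0f}x")
```

Scaling the payload up shows why an unbounded call is risky: the compressed input grows slowly while the memory needed for the output grows linearly with the expanded size.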

Relevant Logs/Tracebacks
