Potential Arbitrary File Write Vulnerability due to Insufficient LLM Output Sanitization #1473

@glmgbj233

Description:

A review of the file-writing code in superagi/tools/code/write_code.py and superagi/resource_manager/file_manager.py identified a potential security vulnerability. The system allows Large Language Models (LLMs) to generate both filenames and file content, which are then written to the file system. Some basic sanitization is performed on the filename, but it appears to be insufficient to prevent arbitrary file writes.

Vulnerability Details:

  1. superagi/tools/code/write_code.py (#L81-107):
    The WriteCodeTool extracts filenames and code content from the LLM's output. The filename is sanitized only lightly: re.sub(r'[<>"|?*]', "", match.group(1)) removes certain special characters, and a further check handles leading/trailing non-alphanumeric characters. The character class in that regex does not include ., /, or \, so this step leaves path traversal sequences (e.g., ../) intact. The sanitization therefore appears insufficient to prevent path traversal or other malicious filename constructions that could lead to writing files outside the intended directory.

    The code content (match.group(2)) is used directly without any apparent sanitization or validation before being written to the file.

  2. superagi/resource_manager/file_manager.py (#L48-61):
    The write_file method in FileManager constructs final_path using ResourceHelper.get_agent_write_resource_path or ResourceHelper.get_resource_path. While get_agent_write_resource_path() handles agent-specific paths and directory creation, it ultimately concatenates root_dir with the file_name provided by the WriteCodeTool.

    If the file_name generated by the LLM (even after its limited sanitization) contains path traversal sequences (e.g., ../../../../etc/passwd), an attacker who can influence the model's output (for example, through a crafted prompt) could cause the system to write arbitrary content to arbitrary locations on the server, leading to:

    • Overwriting critical system files.
    • Creating executable files in sensitive directories.
    • Defacing the application.
    • Achieving remote code execution (if combined with other vulnerabilities).
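
To make the concern concrete, here is a minimal sketch of the two steps described above. It is not the SuperAGI code: it reuses only the regex quoted in this report (the leading/trailing character check is not reproduced), and it assumes the final path is built by a plain join of root_dir and file_name. The helper names and the root directory are hypothetical. posixpath is used so the result is the same on any platform.

```python
import posixpath
import re


def sanitize_like_report(raw_name: str) -> str:
    """Apply only the character-stripping regex quoted in the report."""
    return re.sub(r'[<>"|?*]', "", raw_name)


def naive_final_path(root_dir: str, file_name: str) -> str:
    """Assumed behaviour: join root_dir and file_name with no containment check."""
    return posixpath.normpath(posixpath.join(root_dir, file_name))


# Hypothetical resource directory, for illustration only.
root_dir = "/app/workspace/output"

name = sanitize_like_report("../../../../etc/passwd")
final_path = naive_final_path(root_dir, name)

print(name)        # the traversal sequence is unchanged by the regex
print(final_path)  # the normalized path no longer starts with root_dir
```

Under these assumptions the filename passes through the regex untouched, and the joined path normalizes to a location outside root_dir.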

Impact:
Arbitrary file write can lead to severe consequences, including remote code execution, denial of service, and data corruption.

Proposed Solution:

  1. Strict Path Validation: Implement more robust path validation in ResourceHelper.get_agent_write_resource_path and ResourceHelper.get_resource_path to explicitly disallow path traversal sequences (.., absolute paths, etc.) and ensure that the generated final_path always resides within the intended, sandboxed resource directory.
  2. Whitelist Filename Characters: Instead of blacklisting characters, consider whitelisting allowed characters for filenames to further restrict malicious inputs.
  3. Content Validation (if applicable): Depending on the expected content, consider implementing content validation or sanitization, especially if the written files are later executed or served to users.
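
Points 1 and 2 could be combined along the following lines. This is a sketch under stated assumptions, not a patch for ResourceHelper: resolve_inside, UnsafePathError, and the allowed character set are hypothetical names and choices made for illustration, and a real fix would need to match how SuperAGI actually builds its paths.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical allowlist: each path component must start with a letter, digit,
# or underscore, and may otherwise contain only letters, digits, "_", "." and "-".
# This rejects "..", hidden files, backslashes, and empty components.
ALLOWED_COMPONENT = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]*")


class UnsafePathError(ValueError):
    """Raised when a requested file name would leave the resource directory."""


def resolve_inside(root_dir: str, file_name: str) -> Path:
    """Return the resolved path for file_name, or raise if it escapes root_dir."""
    root = Path(root_dir).resolve()
    requested = Path(file_name)

    if not requested.parts or requested.is_absolute():
        raise UnsafePathError(f"empty or absolute path: {file_name!r}")

    for part in requested.parts:
        if not ALLOWED_COMPONENT.fullmatch(part):
            raise UnsafePathError(f"disallowed path component: {part!r}")

    # Resolve symlinks and normalize, then confirm the result is under root.
    candidate = (root / requested).resolve()
    try:
        candidate.relative_to(root)
    except ValueError:
        raise UnsafePathError(f"path escapes resource directory: {file_name!r}")
    return candidate


# Usage example in a throwaway directory.
demo_root = tempfile.mkdtemp()
print(resolve_inside(demo_root, "src/main.py"))
for bad in ["../../etc/passwd", "/etc/passwd", "a/../../b", "..\\x", ""]:
    try:
        resolve_inside(demo_root, bad)
    except UnsafePathError as exc:
        print("rejected:", exc)
```

The allowlist and the containment check overlap on purpose: the allowlist rejects suspicious names early, while the resolved-path check still catches escapes through symlinks that already exist inside the resource directory.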

Steps to Reproduce (Conceptual):

  1. Craft an LLM prompt that encourages the model to generate a filename containing path traversal sequences (e.g., ../../../../tmp/malicious_script.sh).
  2. Provide a malicious script as the file content.
  3. Observe if the file is written outside the intended resource directory.
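
Step 3 can be checked without touching real system paths. The sketch below is a stand-in for the write path, not SuperAGI itself: naive_write assumes an unchecked join of root_dir and file_name, and all names are hypothetical. Everything happens inside a temporary directory, with a nested folder playing the role of the resource directory and a harmless placeholder as the file content.

```python
import os
import tempfile
from pathlib import Path


def naive_write(root_dir: str, file_name: str, content: str) -> Path:
    """Assumed behaviour: join and write, with no containment check."""
    final_path = os.path.join(root_dir, file_name)
    with open(final_path, "w") as handle:
        handle.write(content)
    return Path(final_path).resolve()


def is_outside(root_dir: str, written: Path) -> bool:
    """True if the written file resolved to a location outside root_dir."""
    try:
        written.relative_to(Path(root_dir).resolve())
        return False
    except ValueError:
        return True


def reproduce() -> bool:
    """Write a traversal-style file name and report whether it escaped."""
    with tempfile.TemporaryDirectory() as tmp:
        sandbox = Path(tmp) / "workspace" / "output"
        sandbox.mkdir(parents=True)
        # Stays inside tmp, but leaves the simulated resource directory.
        written = naive_write(str(sandbox), "../../escaped.sh", "echo harmless\n")
        return is_outside(str(sandbox), written)


print("file escaped the resource directory:", reproduce())
```

Running the same check against the real tool would mean replacing naive_write with a call into the actual write path and pointing is_outside at the agent's resource directory.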
