lxml-html-clean has <base> tag injection through default Cleaner configuration

Summary

The <base> tag passes through the default Cleaner configuration. Although page_structure=True removes the html, head, and title tags, nothing handles <base> specifically, so an attacker can inject it and hijack the resolution of relative URLs on the page.

Details

The <base> tag is not currently among the tags that page_structure removes. Although the HTML specification requires <base> to appear inside <head>, browsers also honor <base> tags placed outside it.

If an attacker injects a <base> tag, it changes the base URL for all relative URLs on the page (links, images, scripts) to a domain controlled by the attacker.
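
The effect on URL resolution can be illustrated with urljoin from Python's standard library, which approximates how a browser resolves a relative reference against the document base URL. This is only a sketch: the page URL https://app.example/dashboard/ and the CDN URL are hypothetical values chosen for illustration, while http://evil.com/ is the base used in the PoC.

```python
from urllib.parse import urljoin

# Hypothetical victim page URL (an assumption for illustration only).
PAGE_URL = "https://app.example/dashboard/"
# Base URL taken from the injected <base href="..."> in the PoC.
INJECTED_BASE = "http://evil.com/"


def resolve_all(base_url, references):
    """Resolve each reference against base_url, roughly as a browser would."""
    return {ref: urljoin(base_url, ref) for ref in references}


# Two relative references and one absolute URL (hypothetical CDN).
references = ["/account", "assets/app.js", "https://cdn.example/lib.js"]

without_base = resolve_all(PAGE_URL, references)
with_injected_base = resolve_all(INJECTED_BASE, references)

for ref in references:
    print(f"{ref!r}: {without_base[ref]} -> {with_injected_base[ref]}")
```

In this sketch the two relative references move to the attacker's host, while the absolute URL resolves to itself under either base.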

PoC

from lxml_html_clean import clean_html

# The base tag is preserved in the output
result = clean_html('<base href="http://evil.com/"><a href="/account">Account</a>')
print(result)
# Output: <div><base href="http://evil.com/">...<a href="/account">Account</a></div>

Impact

The injection of a <base> tag allows an attacker to hijack the resolution of all relative URLs on the page. This results in three critical attack vectors:

Phishing & Redirection: Attackers can redirect user navigation (e.g., <a href="/login">) and form submissions (e.g., <form action="/auth">) to an attacker-controlled domain, effectively stealing credentials or sensitive data without the user realizing they have left the legitimate site.
Cross-Site Scripting (XSS): If the victim application loads JavaScript files using relative paths (e.g., <script src="assets/app.js">), the browser will attempt to fetch the script from the attacker's domain. This upgrades the vulnerability from HTML injection to full Stored XSS.
Defacement: Relative references to images (<img>) and stylesheets (<link>) will be loaded from the attacker's server, allowing for UI redressing or defacement.
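
As a defense-in-depth measure against the vectors above, an application could verify that sanitized output contains no <base> element before storing or rendering it. The following is a minimal sketch using only the standard library's html.parser. BaseTagFinder and find_base_hrefs are hypothetical names, and the check is meant to complement the sanitizer, not replace it.

```python
from html.parser import HTMLParser


class BaseTagFinder(HTMLParser):
    """Collects the href value of every <base> start tag it encounters."""

    def __init__(self):
        super().__init__()
        self.base_hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "base":
            self.base_hrefs.append(dict(attrs).get("href"))


def find_base_hrefs(html):
    """Return the href of each <base> tag in html (None if href is absent)."""
    finder = BaseTagFinder()
    finder.feed(html)
    finder.close()
    return finder.base_hrefs


sanitized = '<div><base href="http://evil.com/"><a href="/account">Account</a></div>'
hits = find_base_hrefs(sanitized)
if hits:
    print("Rejecting output, <base> found with href:", hits)
```

A non-empty result means the sanitized markup still carries a <base> tag and should be rejected or cleaned again.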

References

frenzymadness published to fedora-python/lxml_html_clean Mar 2, 2026

Published to the GitHub Advisory Database Mar 2, 2026

Reviewed Mar 2, 2026

Published by the National Vulnerability Database Mar 5, 2026

Last updated Mar 5, 2026

Weaknesses

Improper Encoding or Escaping of Output
