
Add UnescapedHtml linter #656

Open: wilfison wants to merge 1 commit into sds:main from wilfison:feature/unescaped-html-linter-550

@wilfison (Contributor)

Closes #550

Summary

Adds a new UnescapedHtml linter that flags HAML's unescaped-output markers, which bypass HTML escaping. As noted in #550, like raw and html_safe in Rails, these markers make it easy to accidentally introduce XSS holes when the output includes user-controlled data. For example:

!= "Username: <strong>#{user.name}</strong>"

RuboCop's Rails/OutputSafety cop catches raw/html_safe/safe_concat, but it never sees these because they are a HAML construct, not Ruby. This linter fills that gap.
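
To make the risk in the example above concrete, here is a minimal sketch in plain Ruby that approximates what the two markers emit. It does not run HAML itself: `ERB::Util.html_escape` stands in for the escaping that `=` applies, and the `name` value is a made-up attacker-controlled string standing in for `user.name`.

```ruby
require "erb"

# Hypothetical attacker-controlled value standing in for user.name.
name = "<script>alert(1)</script>"
markup = "Username: <strong>#{name}</strong>"

# Roughly what `=` emits: the whole string is escaped, markup included.
escaped = ERB::Util.html_escape(markup)

# Roughly what `!=` emits: the string is passed through untouched.
unescaped = markup

puts escaped
puts unescaped
```

In the unescaped variant the script tag reaches the browser intact, which is the case the linter is meant to surface.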

What is flagged

The linter targets every HAML marker that emits unescaped output of dynamic content:

HAML               Flagged?  Reason
= value            no        escaped output
~ value            no        escaped, whitespace-preserved output
!= value           yes       unescaped output
!~ value           yes       unescaped, whitespace-preserved output
! text #{value}    yes       unescaped dynamic plain text
%tag!= value       yes       unescaped tag content
%tag!~ value       yes       unescaped, preserved tag content
! static text      no        static plain text; no injection vector
%p= a != b         no        the Ruby != operator, not an unescape marker

The distinction is intentional: only unescaped output of dynamic content is reported, since static markup carries no injection risk.
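
The script-level and plain-text rows of the table can be approximated with a small line-based check. This is only a sketch: `flag_line?` and both constants are hypothetical names, the real linter inspects parsed nodes rather than raw lines, and tag-level forms are deliberately left out.

```ruby
# Hypothetical line-level approximation of the non-tag rows in the table.
# The actual linter works on parsed nodes, not on raw source lines.
UNESCAPED_SCRIPT = /\A\s*![=~]/ # leading != or !~
UNESCAPED_PLAIN  = /\A\s*!\s/   # leading ! followed by plain text

def flag_line?(line)
  return true if line.match?(UNESCAPED_SCRIPT)

  # Plain text is only a risk when it interpolates a dynamic value.
  line.match?(UNESCAPED_PLAIN) && line.include?('#{')
end

[
  '= value',
  '~ value',
  '!= value',
  '!~ value',
  '! text #{value}',
  '! static text',
  '= a != b'
].each do |line|
  puts format("%-18s %s", line, flag_line?(line) ? "yes" : "no")
end
```

Because both patterns are anchored at the start of the line, a `!=` that appears later inside an escaped expression never matches.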

Implementation notes

  • Detection is based on the node's source marker, not @value[:escape_html]:
    haml-lint parses with HTML escaping disabled, so both = and != report
    escape_html: false and cannot be told apart that way.
  • For :script nodes the check is anchored at the start of the source
    (/\A\s*!/), so it covers !=, !~, and the dynamic plain-text ! form
    while never matching a Ruby != operator inside an escaped expression.
  • For :tag nodes a new TagNode#unescape_html? helper inspects the content
    marker that follows the tag name and attributes. The existing attribute-source
    scan was refactored into a single memoized parsed_attributes_source that now
    also exposes the trailing inline_marker_source, keeping tag-source parsing in
    one place.
  • No autocorrect: rewriting != to = would change behavior and could escape
    HTML the author intentionally left raw, so this is detection only.
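
The tag-level check described above can be sketched as follows. This is not the code from the PR: `TagSourceSketch` is a hypothetical stand-in for `TagNode`, the attribute scan is a naive bracket matcher that ignores brackets inside string literals, implicit-div shorthand (`.foo`, `#bar`) is not handled, and treating only `!=` and `!~` as unescape markers is an assumption taken from the table's tag rows.

```ruby
# Hypothetical stand-in for TagNode; it works on a single line of tag source.
class TagSourceSketch
  PAIRS = { "{" => "}", "(" => ")", "[" => "]" }.freeze
  TAG_PREFIX = /\A\s*%[\w:-]+(?:[.#][\w-]+)*/ # %tag plus .class / #id

  def initialize(source)
    @source = source
  end

  # True when the content marker after the tag and attributes is != or !~.
  def unescape_html?
    inline_marker_source.match?(/\A![=~]/)
  end

  # Whatever follows the tag name and its attribute groups.
  def inline_marker_source
    parsed_attributes_source[:rest]
  end

  private

  # Memoized so the tag source is scanned at most once per node.
  def parsed_attributes_source
    @parsed_attributes_source ||= begin
      after_tag = @source.sub(TAG_PREFIX, "")
      { rest: skip_attributes(after_tag) }
    end
  end

  # Drops leading {...}, (...) and [...] groups, tracking nesting depth.
  def skip_attributes(text)
    while (close = PAIRS[text[0]])
      open = text[0]
      depth = 0
      finish = nil
      text.each_char.with_index do |char, i|
        depth += 1 if char == open
        depth -= 1 if char == close
        if depth.zero?
          finish = i
          break
        end
      end
      return "" if finish.nil? # unbalanced brackets: give up

      text = text[(finish + 1)..]
    end
    text
  end
end

puts TagSourceSketch.new('%tag!= value').unescape_html?
puts TagSourceSketch.new('%p= a != b').unescape_html?
puts TagSourceSketch.new('%p{ title: "a != b" }= name').unescape_html?
```

Skipping the attribute groups before looking at the marker is what keeps a `!=` inside an attribute hash or inside the Ruby expression from being mistaken for an unescape marker.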

Enabled by default

The linter ships enabled in config/default.yml.

Testing

bundle exec rspec spec/haml_lint/linter/unescaped_html_spec.rb spec/haml_lint/tree
bundle exec rubocop

Flag HAML's unescaped-output markers (`!=`, `!~`, and dynamic plain-text
`!`) at script and tag level. These bypass HTML escaping and make it easy
to introduce XSS holes when output includes user-controlled data.

Detection uses the source marker (anchored so the Ruby `!=` operator is
not flagged); only unescaped output of dynamic content is reported.
Enabled by default, detection only.

Development

Successfully merging this pull request may close these issues.

Linter suggestion - Avoid Unescape HTML
