Issue with HTML Sanitization: Improper Handling of <div> Tag Inside <table> #342

@sumitkumar1110

Description

Hi,
We are using this library in Zimbra to sanitize customer-generated HTML content in emails. During this process, we encountered an issue where a <div> tag inside a <table> tag causes improper sanitization. Specifically:

  1. The sanitizer closes the <table> tag before the <div> tag and reopens it after the <div>.
  2. The <div> tag does not close where it originally should; instead, it closes just before the end of the HTML document.
     It seems that the sanitizer uses a stack to manage tags, and the <div> tag remains on the stack until all other tags are processed, causing it to close at the end of the document.
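
To make the suspected behaviour concrete, here is a toy Python model of a stack-based tag balancer. It is not the library's code: `ToyBalancer`, `balance`, the two tag sets, and the rule that a table acts as a scope barrier for close tags are all assumptions, chosen only to reproduce the two symptoms listed above. Attributes are dropped for brevity.

```python
from html.parser import HTMLParser

# Assumption: elements that may sit directly inside <table>.
TABLE_CHILDREN = {"caption", "colgroup", "thead", "tbody", "tfoot", "tr"}
# Assumption: a close tag cannot "see" past these elements on the stack.
SCOPE_BARRIERS = {"table", "td", "th"}


class ToyBalancer(HTMLParser):
    """Hypothetical balancer that tracks open elements on a stack."""

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open elements, innermost last
        self.out = []    # emitted markup fragments

    def handle_starttag(self, tag, attrs):
        if self.stack and self.stack[-1] == "table" and tag not in TABLE_CHILDREN:
            # Symptom 1: close the table, emit the stray tag, reopen the table.
            self.stack.pop()
            self.out.append("</table>")
            self.stack.append(tag)
            self.out.append(f"<{tag}>")
            self.stack.append("table")
            self.out.append("<table>")
            return
        self.stack.append(tag)
        self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        # Search from the innermost open element outwards.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i] == tag:
                # Close the match and everything opened after it.
                while len(self.stack) > i:
                    self.out.append(f"</{self.stack.pop()}>")
                return
            if self.stack[i] in SCOPE_BARRIERS:
                # Symptom 2: the matching open tag is out of scope,
                # so the close tag is dropped and the element stays open.
                return

    def handle_data(self, data):
        self.out.append(data)

    def finish(self):
        self.close()
        # Whatever is still open gets closed at the end of the document.
        while self.stack:
            self.out.append(f"</{self.stack.pop()}>")
        return "".join(self.out)


def balance(html):
    balancer = ToyBalancer()
    balancer.feed(html)
    return balancer.finish()


print(balance("<table><div><tr><td>cell</td></tr></div></table><p>rest</p>"))
# prints: <table></table><div><table><tr><td>cell</td></tr></table><p>rest</p></div>
```

In this model the </div> inside the table is dropped because the reopened <table> hides the <div> from it, so the <div> is only closed once the input ends and everything after the table ends up inside it.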
    Before sanitization

    <table align="center" border="0" cellpadding="0" cellspacing="0" class="full-width-mobile" role="presentation" style="width:500px;background:#fff;" width="500"> <div class="mobilecontent mobilecontent-gmail" style="mso-hide:all;display:none;max-height:0px;overflow:hidden;padding:0;height:0;"> <!--[if !mso]><!--> <tr style="padding-top:0; padding-bottom:0;"> <td class="show-mobile-table-cell gmail-td-hide" style="padding-top:0;padding-bottom:0;display:none;mso-hide:all;"><table class="show-mobile-table gmail-table-hide" align="center" border="0" cellpadding="0" cellspacing="0" role="presentation" style="width:100%;display:none;mso-hide:all;"> <tr> <td class="show-mobile-table-cell pad-horiz-mobile gmail-td" colspan="2" style="display:none;mso-hide:all;background:#eee;color:#444;padding-top:20px;padding-bottom:20px;font-family:Roboto, Arial, sans-serif; font-size:20px; -mso-line-height-rule:exactly;line-height:24px;font-weight:500;width:100%">Ważna uwaga dotycząca usługi: wsparcie dla sprzętu wygasło </td> </tr> <tr> <td class="show-mobile-table-cell" valign="bottom" style="vertical-align:bottom;border-bottom: 1px solid #01447C;padding-bottom:10px;padding-top:45px;padding-left:15px;display:none;mso-hide:all;"> <a href="https://urllink/path/sfasdfas" target="_blank"><img src="https://images.com/LOGO_A.jpg" alt="new logo" width="171" style="width:171px; display: block;" border="0"></a> </td> <td class="show-mobile-table-cell" valign="bottom" align="right" style="border-bottom: 1px solid #01447C;vertical-align:bottom;color:#01447C;font-family:Roboto, Arial, sans-serif; font-size:12px; -mso-line-height-rule:exactly;line-height:24px;font-weight:500;text-align:right;padding-top:0;padding-bottom:13px;padding-right:15px;display:none;mso-hide:all;"><span class="link">Numer konta: 1702669724 </span></td> </tr> </table></td> </tr> <!--<![endif]--> </div> </table>

    After sanitization

      <table><tbody><tr style="padding-top:0;padding-bottom:0"><td class="show-mobile-table-cell gmail-td-hide" style="padding-top:0;padding-bottom:0;display:none"><table class="show-mobile-table gmail-table-hide" align="center" border="0" cellpadding="0" cellspacing="0" style="width:100%;display:none"><tbody><tr><td class="show-mobile-table-cell pad-horiz-mobile gmail-td" colspan="2" style="display:none;background:#eee;color:#444;padding-top:20px;padding-bottom:20px;font-family:&#39;roboto&#39; , &#39;arial&#39; , sans-serif;font-size:20px;line-height:24px;font-weight:500;width:100%">Ważna uwaga dotycząca usługi: wsparcie dla sprzętu wygasło </td></tr><tr><td class="show-mobile-table-cell" valign="bottom" style="vertical-align:bottom;border-bottom:1px solid #01447c;padding-bottom:10px;padding-top:45px;padding-left:15px;display:none">
            <a href="https://urllink/path/sfasdfas" target="_blank" rel="nofollow noopener noreferrer"><img alt="new logo width="171" style="width:171px;display:block" border="0" dfsrc="https://images.com/LOGO_A.jpg" /></a>
      </td><td class="show-mobile-table-cell" valign="bottom" align="right" style="border-bottom:1px solid #01447c;vertical-align:bottom;color:#01447c;font-family:&#39;roboto&#39; , &#39;arial&#39; , sans-serif;font-size:12px;line-height:24px;font-weight:500;text-align:right;padding-top:0;padding-bottom:13px;padding-right:15px;display:none"><span class="link">Numer konta: 1702669724
    



    The missing </div> is instead emitted at the end of the document. The HTML document contains two <div> tags, and both are closed at the very end, like this:

    </div></div></td></tr></tbody></table>

    The <div> tag carries the attribute style="display:none", so if it is not closed where it should be, the rest of the body content ends up inside the hidden <div> and is not displayed.

    It would be great if someone could advise on how to handle this situation, or if this could be considered as an enhancement or bug fix.
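
In the meantime, one possible stopgap is to flag the problematic nesting before the HTML reaches the sanitizer. The sketch below uses Python's standard `html.parser` purely for illustration; `MisnestingFinder`, `find_misnested`, and the tag sets are hypothetical names and assumptions, not part of the sanitizer or of Zimbra.

```python
from html.parser import HTMLParser

# Assumption: parents in which only table-structure elements are expected.
TABLE_CONTEXT = {"table", "thead", "tbody", "tfoot", "tr"}
# Assumption: elements tolerated directly inside those parents.
ALLOWED_IN_TABLE_CONTEXT = {
    "caption", "colgroup", "col", "thead", "tbody", "tfoot",
    "tr", "td", "th", "script", "style", "template",
}
# Void elements never have a close tag, so they are not tracked as open.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class MisnestingFinder(HTMLParser):
    """Records elements such as <div> placed directly inside table structure."""

    def __init__(self):
        super().__init__()
        self.stack = []     # open elements as written in the source
        self.problems = []  # (tag, parent, (line, column)) tuples

    def handle_starttag(self, tag, attrs):
        parent = self.stack[-1] if self.stack else None
        if parent in TABLE_CONTEXT and tag not in ALLOWED_IN_TABLE_CONTEXT:
            self.problems.append((tag, parent, self.getpos()))
        if tag not in VOID_ELEMENTS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag; ignore stray closes.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def find_misnested(html):
    finder = MisnestingFinder()
    finder.feed(html)
    finder.close()
    return finder.problems


sample = (
    '<table><div style="display:none">'
    "<tr><td>hidden row</td></tr>"
    "</div></table>"
)
for tag, parent, position in find_misnested(sample):
    print(f"<{tag}> directly inside <{parent}> at (line, column) {position}")
```

This only reports the nesting. What to do with a flagged element, for example moving the wrapper outside the table before sanitizing, is a separate decision that the sketch does not make.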
