Extracting patterns of <br> within <p> yields html5ever warning #21

@shouya

Hi, I noticed that some HTML snippets trigger html5ever warnings. After tracking down the cause, I found a minimal pattern that reproduces the issue:

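  // Placeholder base URL; the original report does not show how `url` was defined.
  let url = url::Url::parse("https://example.com/").unwrap();
  let text = String::from("<p><br/>a<br/>a</p>");
  let mut text = std::io::Cursor::new(text);
  let product = readability::extractor::extract(&mut text, &url).unwrap();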

Here is the warning output:

2024-06-19T14:08:40.634683Z  WARN html5ever::serialize: node with weird namespace Atom('' type=static)
2024-06-19T14:08:40.634723Z  WARN html5ever::serialize: node with weird namespace Atom('' type=static)

Note that if I remove the last a from the string (i.e. <p><br/>a<br/></p>), the warnings disappear completely.
