Closing document tags added, when code blocks contain closing body or html tag #254

@PRGfx

Version(s) affected

5.1.1

Description

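Code blocks containing a closing </body> or </html> tag cause extra closing tags to be appended to the resulting document. For example, the input

<pre><code>&lt;/body&gt;</code></pre>

is converted to

```
</body>
```

</body>

or, for a code block containing </html>, to

```
</html>
```

</body></html>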

The extra tags are appended only once: a document with several closing html tags in code blocks is still converted in full, followed by a single trailing </body></html>.

I guess it has something to do with how the HTML is saved. I can work around this by replacing these tags with placeholders before conversion and restoring them afterwards, but as long as I don't know why this happens, the same problem might occur with some other element.
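A minimal sketch of the placeholder workaround described above, in case it helps others who hit this. The helper name `convertWithPlaceholders`, the placeholder strings, and the choice to pass the converter in as a callable are all assumptions made for illustration; none of them are part of league/html-to-markdown. The sketch only handles the entity-encoded form used in the reproduction below.

```php
<?php
// Hypothetical helper, not part of league/html-to-markdown.
// The placeholders are purely alphanumeric so that Markdown escaping
// (for example of "_" or "*") is unlikely to alter them.
const CLOSING_TAG_PLACEHOLDERS = [
    '</body>' => 'XCLOSINGBODYTAGPLACEHOLDERX',
    '</html>' => 'XCLOSINGHTMLTAGPLACEHOLDERX',
];

function convertWithPlaceholders(string $html, callable $convert): string
{
    // 1. Hide the entity-encoded closing tags (&lt;/body&gt;, &lt;/html&gt;)
    //    from the converter by swapping them for placeholders.
    $toPlaceholder = [];
    foreach (CLOSING_TAG_PLACEHOLDERS as $tag => $placeholder) {
        $toPlaceholder[htmlspecialchars($tag)] = $placeholder;
    }
    $markdown = $convert(strtr($html, $toPlaceholder));

    // 2. Restore the literal tags in the converted output.
    return strtr($markdown, array_flip(CLOSING_TAG_PLACEHOLDERS));
}

// Assumed usage with the converter from this issue:
// $converter = new \League\HTMLToMarkdown\HtmlConverter();
// echo convertWithPlaceholders($html, [$converter, 'convert']);
```

This only sidesteps the symptom for these two tags; it does not explain why the trailing tags appear in the first place.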

How to reproduce

```shell
echo -e '<pre><code>&lt;/body&gt;</code></pre>' | vendor/bin/html-to-markdown
echo -e '<pre><code>&lt;/html&gt;</code></pre>' | vendor/bin/html-to-markdown
```

```php
<?php
include 'vendor/autoload.php';
$converter = new \League\HTMLToMarkdown\HtmlConverter();
echo $converter->convert('<pre><code>&lt;/html&gt;</code></pre>');
```
