Incorrect constraint conversion from html to markdown #22

@dangne

Hi,

I noticed that fetch_dataset.py does not enable html2text's include_sup_sub setting, so superscripts and subscripts in the problem constraints are dropped when the HTML is converted to Markdown.

For example, given the following code:

import html2text

h = html2text.HTML2Text()
h.ignore_links = True
h.ignore_images = True
h.ignore_emphasis = True
# h.include_sup_sub is False by default

h.handle("<code>0 &lt;= grid[i][j] &lt;= 10<sup>5</sup></code>")

The output will be the following, where the exponent has been flattened so that 10<sup>5</sup> reads as 105:

`0 <= grid[i][j] <= 105`\n\n
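
The most direct fix is probably to set h.include_sup_sub = True in fetch_dataset.py. As an alternative that does not depend on that option, here is a minimal sketch of a pre-processing step that rewrites the tags into plain-text markers before the HTML reaches html2text. The helper name mark_sup_sub and the ^ / _ notation are my own assumptions, not anything taken from the repository, and the regexes only handle simple, un-nested tags without attributes.

```python
import re

# Hypothetical helper (not part of fetch_dataset.py): turn <sup>/<sub> tags
# into plain-text markers so the exponent survives any later conversion.
_SUP = re.compile(r"<sup>(.*?)</sup>", re.IGNORECASE | re.DOTALL)
_SUB = re.compile(r"<sub>(.*?)</sub>", re.IGNORECASE | re.DOTALL)


def mark_sup_sub(html: str) -> str:
    """Replace <sup>x</sup> with ^x and <sub>x</sub> with _x."""
    html = _SUP.sub(r"^\1", html)
    return _SUB.sub(r"_\1", html)


converted = mark_sup_sub("<code>0 &lt;= grid[i][j] &lt;= 10<sup>5</sup></code>")
print(converted)  # <code>0 &lt;= grid[i][j] &lt;= 10^5</code>
```

The result can then be passed to h.handle(...) as before. Since the tags are already gone at that point, the constraint should come out with 10^5 rather than 105.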
