Every line is transformed as header element in markdown #74

@saravmajestic

Describe the bug
Thanks for this library; it is very helpful. When parsing this PDF, every line is converted to a header element instead of a sentence or paragraph. The same issue occurs with the original repo. Tested here: https://pdf2md.morethan.io/

To Reproduce
Steps to reproduce the behavior:

  1. Run @opendocsg/pdf2md from the command line with the above file as input
  2. Check the generated Markdown output
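
One way to make step 2 concrete is to measure how much of the generated Markdown consists of headings. The following is only a minimal sketch: `heading_ratio` is a hypothetical helper, not part of pdf2md, and it assumes the converter emits ATX-style headings (lines starting with one to six `#` characters). It does not skip fenced code blocks.

```python
import re

# ATX heading: up to 3 leading spaces, 1-6 '#', then whitespace or end of line.
# Assumption: the converter emits ATX headings rather than Setext underlines.
HEADING_RE = re.compile(r"^ {0,3}#{1,6}(?:\s|$)")


def heading_ratio(markdown: str) -> float:
    """Return the fraction of non-blank lines that are ATX headings."""
    lines = [line for line in markdown.splitlines() if line.strip()]
    if not lines:
        return 0.0
    headings = sum(1 for line in lines if HEADING_RE.match(line))
    return headings / len(lines)


if __name__ == "__main__":
    # Illustrative input only; replace with the actual converter output.
    sample = "# Title\n\n## First sentence of a paragraph\n\n## Second sentence\n"
    print(heading_ratio(sample))
```

A ratio close to 1.0 for a document that is mostly body text would match the behavior reported here.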

Expected behavior
Some of the text in the PDF should not be treated as a header, for example: "Selecting the “right” amount of information to include in a summary is a difficult task. A good..."

Desktop (please complete the following information):

  • OS: macOS
  • Browser: N/A
  • Version: N/A

Additional context
Since this issue also exists in the original repo, it would be great if you could point me to how to resolve it. Much appreciated!
