
UnicodeDecodeError while parsing the PDF files. #17

Open
@adityardesai

Description


Hi

I am using the NLTKRest server to parse a few of the PDF files from Polar Trec Data and extract the required NER quantities. For most of the PDF files, however, the REST server returns the following error:

"UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 8: ordinal not in range(128) // Werkzeug Debugger "

The command used is:
curl -X POST -d "PDF TEXT in STRING" http://localhost:8888/nltk

The error output is attached as well.
nltkrest.txt
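
For reference, the same kind of error can be reproduced outside the server. Byte 0xc3 is the lead byte of a two-byte UTF-8 sequence (for example "é" is 0xc3 0xa9), so the message suggests that UTF-8 text is being decoded with the ASCII codec somewhere along the way. The following is only a minimal Python 3 sketch of that mechanism, not the NLTKRest code; the sample string and the helper `try_decode` are hypothetical.

```python
# Hypothetical sample text: "é" is encoded in UTF-8 as the two bytes 0xc3 0xa9.
raw = "PDF café".encode("utf-8")


def try_decode(data: bytes, codec: str) -> str:
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError as exc:
        return f"error: {exc}"


# ASCII only covers bytes 0-127, so the 0xc3 byte cannot be decoded.
ascii_result = try_decode(raw, "ascii")

# Decoding with the codec the bytes were actually encoded in succeeds.
utf8_result = try_decode(raw, "utf-8")

print(ascii_result)
print(utf8_result)
```

If this is what happens in the server, explicitly decoding the request body as UTF-8 before handing it to NLTK might avoid the error, but I have not confirmed where the decoding takes place.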
