Problem converting .docx file with to_html method and example from documentation #251

Hello, 

I followed the example from your documentation to convert my docx file to html:
```
from pydocx import PyDocX
# Pass in a path
html = PyDocX.to_html('file.docx')
# Pass in a file object
html = PyDocX.to_html(open('file.docx', 'rb'))
# Pass in a file-like object
from cStringIO import StringIO
buf = StringIO()
with open('file.docx') as f:
    buf.write(f.read())
html = PyDocX.to_html(buf)
```
Since I am using Python 3.6, I changed `cStringIO` to `io`. However, I always get the same error with my .docx file, at the line `buf.write(f.read())`:

```
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-24-598c617210d8> in <module>()
     10 buf = StringIO()
     11 with open('file.docx') as f:
---> 12     buf.write(f.read())
     13 html = PyDocX.to_html(buf)

~/anaconda3/lib/python3.6/codecs.py in decode(self, input, final)
    319         # decode input (taking the buffer into account)
    320         data = self.buffer + input
--> 321         (result, consumed) = self._buffer_decode(data, self.errors, final)
    322         # keep undecoded input until the next call
    323         self.buffer = data[consumed:]

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa3 in position 14: invalid start byte
```
This happens with every .docx file I have tried. Can anybody suggest what is wrong?
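
A likely cause, offered as a sketch rather than a confirmed diagnosis: a .docx file is a ZIP archive, and in Python 3 `open('file.docx')` defaults to text mode, which tries to decode the raw bytes as UTF-8. Reading in binary mode (`'rb'`) into an `io.BytesIO` buffer avoids the decoding step. The helper names `read_into_buffer` and `make_sample_archive` below are hypothetical, and the sample file is only a tiny ZIP standing in for a real document so that the snippet runs without pydocx installed.

```python
import io
import os
import tempfile
import zipfile


def read_into_buffer(path):
    """Read a file as raw bytes into an in-memory, file-like buffer."""
    # 'rb' disables text decoding, so no UnicodeDecodeError can be raised here.
    with open(path, 'rb') as f:
        # A BytesIO built from bytes starts at position 0, ready to be read.
        return io.BytesIO(f.read())


def make_sample_archive(directory):
    """Hypothetical helper: write a tiny ZIP file standing in for a real .docx."""
    path = os.path.join(directory, 'sample.docx')
    with zipfile.ZipFile(path, 'w') as zf:
        zf.writestr('word/document.xml', '<w:document/>')
    return path


with tempfile.TemporaryDirectory() as tmp:
    buf = read_into_buffer(make_sample_archive(tmp))

# With pydocx installed and a real document, the buffer would then be passed
# on as in the documentation example (assumption: not executed here):
# html = PyDocX.to_html(buf)
```

Building the buffer with `io.BytesIO(data)` instead of calling `buf.write(...)` also leaves the read position at the start, whereas writing into an empty buffer leaves it at the end.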
