Description
Describe the bug
When parsing .msg files in Python 3.9, I get UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 0: invalid continuation byte.
To Reproduce
Steps to reproduce the behavior:
- In Python 3.9, with msg_parser installed
- msg_parser -i path_to_my_msgfile.msg -e .
- Observe error
Expected behavior
No error!
Desktop (please complete the following information):
- OS: Mac OS X
- Python: 3.9.13
- Version: msg_parser @ d16260d
Additional context
I was able to fix the problem by removing https://github.com/vikramarsid/msg_parser/blob/master/msg_parser/cli.py#L40; after that, everything worked fine. Evidently, the argparse FileType argument opens the file in text mode (its default mode is "r"), so reading it attempts to decode the binary .msg data as UTF-8, which it is not. The problem can also be fixed by changing the line to open the file in binary mode: type=FileType(mode="rb").
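For illustration, here is a minimal sketch of the behaviour described above. It is not taken from msg_parser: the helper names, the dest="input" choice, and the stand-in file (just the OLE2 signature bytes that .msg files start with, plus padding) are all assumptions made for the example. The text-mode case passes encoding="utf-8" explicitly so the result does not depend on the platform default.

```python
import argparse
import os
import tempfile

# First 8 bytes of an OLE2 compound file, the container format used by .msg.
OLE_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def read_via_argparse(path, file_type):
    """Parse `-i path` with the given FileType and read the opened file.

    Hypothetical helper: it only mimics a CLI that declares its input
    argument with type=FileType(...).
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("-i", dest="input", type=file_type)
    args = parser.parse_args(["-i", path])
    with args.input as handle:
        return handle.read()


def demo():
    """Return (error raised in text mode, bytes read in binary mode)."""
    # Stand-in for a real .msg file: signature bytes plus padding only.
    with tempfile.NamedTemporaryFile(suffix=".msg", delete=False) as tmp:
        tmp.write(OLE_MAGIC + b"\x00" * 8)
        path = tmp.name
    try:
        try:
            # Text mode: the bytes are decoded as UTF-8 when read.
            read_via_argparse(path, argparse.FileType("r", encoding="utf-8"))
            text_mode_error = None
        except UnicodeDecodeError as exc:
            text_mode_error = exc
        # Binary mode: the bytes are returned untouched.
        data = read_via_argparse(path, argparse.FileType("rb"))
        return text_mode_error, data
    finally:
        os.remove(path)


if __name__ == "__main__":
    error, data = demo()
    print("text mode:", error)
    print("binary mode:", data[:8].hex())
```

In this sketch the text-mode read fails on the very first byte (0xd0), while the binary-mode read returns the data unchanged, which matches the fix suggested above.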
Happy to submit a PR, but I cannot test whether the type=FileType() line is expected to do something in particular. As with #303, I cannot submit any test files because all of my .msg files are confidential.