Thank you for a very well-thought-out tool! I'm currently evaluating it for keeping my 400+GB, 50k-file archive safe(r).
While doing so, I came across this exception:
Traceback (most recent call last):
  File "/home/user/.local/bin/pff", line 8, in <module>
    sys.exit(main())
  File "/home/user/.local/lib/python3.10/site-packages/pyFileFixity/pff.py", line 108, in main
    return saecc_main(argv=subargs, command=fullcommand)
  File "/home/user/.local/lib/python3.10/site-packages/pyFileFixity/structural_adaptive_ecc.py", line 574, in main
    relfilepath_ecc = compute_ecc_hash_from_string(relfilepath, ecc_manager_intra, hasher_intra, max_block_size, resilience_rate_intra)
  File "/home/user/.local/lib/python3.10/site-packages/pyFileFixity/structural_adaptive_ecc.py", line 203, in compute_ecc_hash_from_string
    fpfile = BytesIO(b(string))
  File "/home/user/.local/lib/python3.10/site-packages/pyFileFixity/lib/_compat.py", line 36, in b
    return codecs.latin_1_encode(x)[0]
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 16-26: ordinal not in range(256)
Looking at the code, it seems that latin-1 is used as an internal encoding, which indeed cannot handle characters outside the latin-1 range:
if sys.version_info < (3,):
    def b(x):
        return x
else:
    import codecs
    def b(x):
        if isinstance(x, _str):
            return codecs.latin_1_encode(x)[0]  # <-- here
        else:
            return x
The problematic filename contained Ukrainian (Cyrillic) characters, which are not part of the latin-1 encoding.
Example string: зображення.
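For reference, here is a minimal sketch that reproduces the error outside of pyFileFixity and shows one possible direction for a fix. The `b_utf8` helper is hypothetical and not part of pyFileFixity, and I have not checked whether switching to UTF-8 would affect ECC files already generated with the current latin-1 encoding:

```python
import codecs

name = "зображення"  # example string from the failing filename

# Reproduce the failure: latin-1 only covers code points 0-255,
# so Cyrillic characters cannot be encoded.
try:
    codecs.latin_1_encode(name)
except UnicodeEncodeError as exc:
    print("latin-1 failed:", exc.reason)


def b_utf8(x):
    """Hypothetical variant of b(): encode str as UTF-8, pass bytes through."""
    if isinstance(x, str):
        return x.encode("utf-8")
    return x


encoded = b_utf8(name)
print(encoded.decode("utf-8") == name)  # round-trip check
```

UTF-8 yields the same bytes as latin-1 for pure ASCII paths, but differs for characters in the 128-255 range, so this is only a starting point for discussion.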
pyFileFixity version 3.1.4 installed with pip. I'm on Python 3.10.12.