Fixed bugs
- hocr-split: Duplicate content in <html> #58
- hocr-pdf: ocr_line does not have to be a span (e.g. a div is also possible) #57
- hocr-check: Fix containment checks and metadata checks, add tests #52 #61 #62
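
To illustrate the hocr-pdf item above, here is a minimal sketch of selecting ocr_line elements by class rather than by tag name, so that both span and div lines are found. It is not the hocr-pdf implementation: the sample markup, the `find_lines` helper, and the use of `xml.etree.ElementTree` are assumptions made for this example, and the snippet omits the XHTML namespace that real hOCR files usually carry.

```python
import xml.etree.ElementTree as ET

# Hypothetical hOCR fragment: one line is a <span>, the other a <div>.
HOCR_SNIPPET = """<div class="ocr_page" title="bbox 0 0 100 100">
  <span class="ocr_line" title="bbox 10 10 90 20">first line</span>
  <div class="ocr_line" title="bbox 10 30 90 40">second line</div>
  <p class="ocr_par" title="bbox 10 50 90 60">not a line</p>
</div>"""


def find_lines(root):
    """Return every element whose class list contains 'ocr_line', whatever its tag."""
    return [
        el for el in root.iter()
        if "ocr_line" in el.get("class", "").split()
    ]


root = ET.fromstring(HOCR_SNIPPET)
lines = find_lines(root)
tags = [el.tag for el in lines]    # tag names of the matched elements
texts = [el.text for el in lines]  # text content of the matched elements
```

Matching on the class attribute instead of a fixed tag name is what lets a div-based ocr_line be treated the same way as a span-based one.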
Ongoing work
- Check handling of non-ASCII characters in hOCR files #53
- Make hocr-tools fit for Python 3 #37
See details: v1.0.0...v1.0.1