Releases: kermitt2/pdfalto
v0.6.0
What's Changed
- Correct word break and space addition in case of character composition by @kermitt2 in #121
- Apple macOS ARM support by @lfoppiano in #154
- add github actions by @lfoppiano in #169
- support for Apple arm and improve the github action build by @lfoppiano in #171
- Fix IWord::colortoString(): Round instead of truncate by @th-we in #158
- Fix #144 by @SchrodingersMind in #151
- Update drawChar() in OutputDev interface to xpdf-4.05 by @flydutch in #172
- Update xpdf 4.05 by @flydutch in #176
- Update of ICU libraries for pdfalto by @AaronNGray in #177
- Update external libraries by @lfoppiano in #180 #181 #193 #194
- Modified nearly all sprintf's to snprintf's in src/. by @AaronNGray in #182
- Add build for external libraries by @lfoppiano in #184
- MacOS/x64 library builds. by @AaronNGray in #188
- Collect and aggregate pdfalto binaries by @lfoppiano in #195
- MacOS/X64 fontconfig path support by @AaronNGray in #196
- Extend backward compatibility up to ubuntu 20.04 (or glibc 3.4.32) by @lfoppiano in #197
- Update ci-build-libs.yml - update to libpng-1.6.47 by @AaronNGray in #198
- Fix problem with macos build and cmake version by @lfoppiano in #199
- Remove casting between TextWord and TextRawWord by @flydutch in #202
- removed forced cast from TextRawWord to TextWord by @flydutch in #203
- Update external libraries and make the build automatic and reproducible by @lfoppiano in #178
- Add library version environment variables to ci-build-libs.yml by @AaronNGray in #205
- Feature/update xpdf 4.05 by @flydutch in #206
- Enable scanning by @lfoppiano in #207
- Fix regression from #182 by @flydutch in #208
- Fix potential memory leaks on XPdf branch by @flydutch in #210
- Feature/update xpdf 4.05 fix regression by @lfoppiano in #212
- Improve line number recognition by @lfoppiano in #214
- Update parameters for dumping / skipping images by @lfoppiano in #219
- Update to xpdf 4.05 by @lfoppiano in #173
- Update documentation by @lfoppiano in #221
- chore: add libfontconfig by @lfoppiano in #222
- Add libfontconfig-dev to install_deps.sh by @th-we in #175
- Automate release by @lfoppiano in #223
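Two of the fixes above are easy to illustrate in isolation: rounding instead of truncating when a color is turned into a string (#158), and bounded formatting with snprintf instead of sprintf (#182). The following is a minimal sketch, not the actual pdfalto code. It assumes color components are doubles in the range 0.0 to 1.0 that are mapped to 0..255 and printed as "#rrggbb"; the helper names toByteTruncated, toByteRounded and colorToHex are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdio>
#include <string>

// Hypothetical helper: map a component in [0.0, 1.0] to 0..255 by truncation.
// The cast drops the fractional part, so 254.745 becomes 254.
int toByteTruncated(double component) {
    assert(!std::isnan(component));
    double clamped = std::min(1.0, std::max(0.0, component));
    return static_cast<int>(clamped * 255.0);
}

// Same mapping, but rounded to the nearest integer (halves round away from zero).
int toByteRounded(double component) {
    assert(!std::isnan(component));
    double clamped = std::min(1.0, std::max(0.0, component));
    return static_cast<int>(std::lround(clamped * 255.0));
}

// Hypothetical formatter (assumed "#rrggbb" output, not confirmed by the notes).
// snprintf never writes more than sizeof(buf) bytes, unlike sprintf.
std::string colorToHex(double r, double g, double b) {
    char buf[8];  // '#' + 6 hex digits + terminating NUL
    std::snprintf(buf, sizeof(buf), "#%02x%02x%02x",
                  toByteRounded(r), toByteRounded(g), toByteRounded(b));
    return std::string(buf);
}
```

With these helpers, a component of 0.999 maps to 254 when truncated but 255 when rounded, which is the kind of off-by-one difference a rounding fix removes.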
New Contributors
- @th-we made their first contribution in #158
- @SchrodingersMind made their first contribution in #151
- @AaronNGray made their first contribution in #177
- @flydutch made their first contribution in various PRs and issues
Full Changelog: 0.4...v0.6.0
Version 0.4
New in version 0.4 (apart from various bug fixes):
- support for the xpdf language support packages for language-specific fonts (Arabic, Chinese-simplified, Japanese, etc.); they are pre-installed locally and portable
- refined line number detection and fixed a bug that could result in random missing numbers in the ALTO output
- update to xpdf-4.03
- fix issue with character spacing due to an invalid rotation condition
- update dependencies and dependency install script
Version 0.3
New in version 0.3:
- line number detection: line numbers (typically added for review in manuscripts/preprints) are specifically identified and no longer mixed with the rest of the text content; they are grouped in a separate block or, optionally, omitted from the ALTO file (noLineNumbers option)
- removal of the -blocks option; block information is always returned to ensure ALTO validation (<TextBlock> element)
- bug fixes on reading order
- fix XMax and YMax possibly being incorrectly set to 0 in the coordinates of blocks having only one line
Version 0.2
New in version 0.2:
- support Unicode composition of characters
- generalize reading order to all blocks (it was limited to the blocks of the first page)
- use subscript/superscript text font style attribute
- use SVG as a format for vectorial images
- propagate unresolved character Unicode values (free Unicode range for embedded fonts) as encoded special characters in ALTO (the so-called "placeholder" approach)
- generate metadata information in a separate XML file (as ALTO schema does not support that)
- use the latest version of xpdf, version 4.00
- add cmake
- ALTO output replaces the custom Xerox XML format
Note: this release was used for Grobid release 0.5.6
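The "placeholder" approach listed for version 0.2 can be sketched as follows. This is only an illustration under stated assumptions: it assumes the "free Unicode range" refers to the Unicode Private Use Areas, and it uses an XML numeric character reference as the encoded form. The encoding pdfalto actually emits is not specified in these notes, and the names isPrivateUse and toCharRef are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// True if the code point lies in one of the three Unicode Private Use ranges:
// U+E000..U+F8FF, U+F0000..U+FFFFD, U+100000..U+10FFFD.
bool isPrivateUse(std::uint32_t cp) {
    return (cp >= 0xE000 && cp <= 0xF8FF) ||
           (cp >= 0xF0000 && cp <= 0xFFFFD) ||
           (cp >= 0x100000 && cp <= 0x10FFFD);
}

// Hypothetical placeholder encoding (an assumption, not pdfalto's documented
// format): render the code point as an XML numeric character reference.
std::string toCharRef(std::uint32_t cp) {
    assert(cp <= 0x10FFFF);  // largest valid Unicode code point
    char buf[16];            // "&#x10FFFF;" is 10 characters plus NUL
    std::snprintf(buf, sizeof(buf), "&#x%X;", static_cast<unsigned>(cp));
    return std::string(buf);
}
```

A caller could test each extracted code point with isPrivateUse and, when it returns true, write toCharRef(cp) into the output instead of dropping the character, so that downstream tools can still see which glyph slot of the embedded font was used.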