Use OCR in custom PDF document parser if needed

Previously, when PDF documents were parsed by Apache Tika, OCR was applied to the images they contained. (Tika uses Tesseract for OCR; Tesseract is not an Apache project.)
Our custom PDF document parser is based on Apache PDFBox and currently does not use OCR.
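
One possible reading of "if needed" is sketched below: run OCR on a page only when it contains images and the text already extracted from it (for example with PDFBox's PDFTextStripper) is empty or nearly empty. The class name OcrDecision, the method needsOcr, and the threshold MIN_TEXT_CHARS are hypothetical and not part of the existing parser; the actual criterion is still to be decided.

```java
/**
 * Minimal sketch of an "is OCR needed?" check for a single PDF page.
 * All names and the threshold are assumptions made for illustration.
 */
public final class OcrDecision {

    // Assumed threshold: a page with fewer visible characters than this
    // is treated as having no usable text layer.
    static final int MIN_TEXT_CHARS = 20;

    private OcrDecision() {
    }

    /**
     * Returns true when the page has at least one image and the text
     * extracted from it contains fewer than MIN_TEXT_CHARS
     * non-whitespace characters.
     *
     * @param extractedText text already extracted from the page, may be null
     * @param imageCount    number of images found on the page
     */
    public static boolean needsOcr(String extractedText, int imageCount) {
        if (imageCount <= 0) {
            return false; // nothing for OCR to work on
        }
        int visible = 0;
        if (extractedText != null) {
            for (int i = 0; i < extractedText.length(); i++) {
                if (!Character.isWhitespace(extractedText.charAt(i))) {
                    visible++;
                }
            }
        }
        return visible < MIN_TEXT_CHARS;
    }
}
```

With a check like this, pages that already have a text layer skip OCR entirely, which keeps parsing time close to the current behavior for ordinary PDFs. Scanned pages would be handed to an OCR engine such as Tesseract.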