Description
Is your feature request related to a problem? Please describe.
When processing PDFs via by_title, a common issue is that words are split across lines by line breaks or hyphenation. For example, the resulting text string contains "powerful capabili- ties of" instead of "powerful capabilities of".
Describe the solution you'd like
When a word is split by a hyphen at a line break, the two fragments should be merged back into a single word in the output text.
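To illustrate the requested behavior, here is a minimal post-processing sketch. It is not part of the library's API: `merge_hyphenated_words` is a hypothetical helper name, and the regex assumes a split word always appears as a hyphen followed by whitespace (a space or a newline) between two word characters.

```python
import re

# Hypothetical helper for illustration only, not an existing library function.
# Assumption: a split word looks like "<word char>-<whitespace><word char>".
_HYPHEN_BREAK = re.compile(r"(\w)-\s+(\w)")


def merge_hyphenated_words(text: str) -> str:
    """Join word fragments separated by a hyphen followed by whitespace."""
    return _HYPHEN_BREAK.sub(r"\1\2", text)


print(merge_hyphenated_words("powerful capabili- ties of"))
# prints: powerful capabilities of
```

One limitation of this approach: a genuine compound that happens to break at its hyphen (for example "well-" at the end of a line and "known" on the next) would also be merged, so a built-in solution would probably need a dictionary check or a similar heuristic.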
Describe alternatives you've considered
No alternative option exists.
Additional context