Is your feature request related to a problem? Please describe.
Words that continue onto the following line are often written as a word fragment followed by a hyphen and one or more whitespace characters. It would be useful to have a function, clean_newline, that joins such fragments back into a single word.
import re


def clean_newline(text: str, pattern: str = r"(\w+)-\s+(\w+)") -> str:
    """
    Remove the hyphen and whitespace between two word fragments in a given text.

    :param text: A string that contains the text to be cleaned
    :type text: str
    :param pattern: A regular expression with two capture groups: the fragment before the hyphen and the fragment after it
    :type pattern: str
    :return: A modified version of the input text where any occurrence of a word followed by a hyphen
        and whitespace, followed by another word, is replaced with the two words concatenated together.
    """
    return re.sub(pattern, r"\1\2", text)
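To illustrate the intended behavior, here is a small self-contained sketch of the proposed function applied to sample input. The sample strings and the stricter `NEWLINE_ONLY_PATTERN` are assumptions made for illustration, not part of the proposal. The default pattern also joins a hyphen followed by a plain space (as in "pre- and post-processing"), so a pattern that requires an actual line break may be a safer choice in some texts.

```python
import re


def clean_newline(text: str, pattern: str = r"(\w+)-\s+(\w+)") -> str:
    # Same logic as the proposed function: drop the hyphen and the
    # whitespace between the two captured word fragments.
    return re.sub(pattern, r"\1\2", text)


# Hypothetical stricter pattern (an assumption, not part of the proposal):
# only join fragments when the hyphen is followed by a real line break.
NEWLINE_ONLY_PATTERN = r"(\w+)-[ \t]*\n\s*(\w+)"

sample = "The docu-\nment describes a seg-\n   mentation step."
print(clean_newline(sample))
# The document describes a segmentation step.

# Caveat: the default pattern also merges a suspended hyphen on one line.
print(clean_newline("pre- and post-processing"))
# preand post-processing

# The stricter pattern leaves that case untouched.
print(clean_newline("pre- and post-processing", pattern=NEWLINE_ONLY_PATTERN))
# pre- and post-processing
```

Neither pattern can tell a line-break hyphen from a genuine compound such as "well-known" split across lines, so that case would come out as "wellknown" either way.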