Description
Dateparser currently does not recognize strings in which the time comes before the date. Documents sometimes put the time first, and in that case parsing returns a null value (None). I suggest incorporating something like the following code, which finds the date and moves it in front of the time so that the existing parser keeps working on time-first input.
import re
import logging
from dateparser import parse

# Numeric dates such as 2/22/2018, 22-02-2018 or 22.02.18
DATE_PATTERN = re.compile(r'\d+[/.-]\d+[/.-]\d{1,4}')

def parse_time_first(dt_input):
    try:
        # Move the date to be in front of the time
        date = DATE_PATTERN.search(dt_input)
        if date:
            rest = (dt_input[:date.start()] + dt_input[date.end():]).strip()
            dt_input = f"{date.group()} {rest}"
        return parse(dt_input)
    except Exception as e:
        logging.warning(f"Error in finding time in {dt_input}: {e}")
        return None

Example:
parse_time_first("10:56:58 PM UTC+2:00 2/22/2018")
datetime.datetime(2019, 2, 22, 22, 56, 58, tzinfo=<StaticTzInfo 'UTC+02:00'>)
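The reordering step can also be checked on its own, without calling dateparser at all. The sketch below is only an illustration of that idea: the helper name move_date_first is hypothetical, and the pattern assumes purely numeric dates separated by '/', '-' or '.', as in the snippet above. Input without such a date is returned unchanged.

```python
import re

# Assumed pattern: numeric dates separated by '/', '-' or '.'
DATE_PATTERN = re.compile(r"\d+[/.-]\d+[/.-]\d{1,4}")

def move_date_first(text):
    """Return text with the first numeric date moved to the front (hypothetical helper)."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return text  # no date found, nothing to reorder
    # Cut the date out by position instead of re.sub, so that '.' in the
    # date is never treated as a regex wildcard.
    rest = (text[:match.start()] + text[match.end():]).strip()
    return f"{match.group()} {rest}".strip()

print(move_date_first("10:56:58 PM UTC+2:00 2/22/2018"))
# 2/22/2018 10:56:58 PM UTC+2:00
```

The reordered string can then be passed to dateparser.parse as usual. Keeping the string manipulation separate from the parsing makes it easy to see whether a failure comes from the regex or from the parser.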