Description
Question/Bug
When using search_dates, it picks up trailing digits from neighboring tokens, forming an invalid token and producing wrong datetime objects. For example:
from dateparser.search import search_dates

sample_string = 'Bubble -58.5 06 Mar 2009 in need of -43.4 30 Oct 1974 also contributed for -17.7 26 Dec 2018 '
en_dates = search_dates(sample_string, languages=['en'], settings={'STRICT_PARSING': True})
Output
[('5 06 Mar 2009 in', datetime.datetime(2009, 3, 5, 0, 0)),
('4 30 Oct 1974', datetime.datetime(1974, 10, 4, 0, 0)),
('7 26 Dec 2018', datetime.datetime(2018, 12, 7, 0, 0))]
It picked up:
- the digit 5 from -58.5 and prepended it to 06 Mar 2009, forming '5 06 Mar 2009 in'
- the digit 4 from -43.4 and prepended it to 30 Oct 1974, forming '4 30 Oct 1974'
- the digit 7 from -17.7 and prepended it to 26 Dec 2018, forming '7 26 Dec 2018'
A neighboring token should either be included whole or excluded entirely. Taking only part of a preceding number produces an invalid token and a wrong datetime object: here the stray digit is parsed as the day, and the real day (06, 30, 26) is dropped.
Expected Output
[('06 Mar 2009 in', datetime.datetime(2009, 3, 6, 0, 0)),
('30 Oct 1974', datetime.datetime(1974, 10, 30, 0, 0)),
('26 Dec 2018', datetime.datetime(2018, 12, 26, 0, 0))]
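To illustrate what "include the whole word or exclude it" could look like, here is a minimal standard-library sketch of whole-token matching. It is not how dateparser works internally and is not a proposed patch; the pattern `DATE_RE` and the helper `find_whole_token_dates` are hypothetical names made up for this example, and it assumes dates of the form "DD Mon YYYY" with English month abbreviations (`%b` depends on the active locale).

```python
import re
from datetime import datetime

# Hypothetical pattern: 1-2 digit day, 3-letter month abbreviation, 4-digit year.
# The lookbehind rejects a day that directly follows a word character or a
# decimal point, so the day can never be the tail of a longer number.
# The lookahead rejects a year that runs into further word characters.
DATE_RE = re.compile(r"(?<![\w.])(\d{1,2}) ([A-Z][a-z]{2}) (\d{4})(?!\w)")


def find_whole_token_dates(text):
    """Return (matched_text, datetime) pairs built only from whole tokens."""
    results = []
    for match in DATE_RE.finditer(text):
        try:
            # %b assumes English month abbreviations under the current locale.
            parsed = datetime.strptime(match.group(0), "%d %b %Y")
        except ValueError:
            # Not a real calendar date (e.g. "31 Feb 2009"); skip it.
            continue
        results.append((match.group(0), parsed))
    return results


sample_string = (
    'Bubble -58.5 06 Mar 2009 in need of -43.4 30 Oct 1974 '
    'also contributed for -17.7 26 Dec 2018 '
)
print(find_whole_token_dates(sample_string))
```

With the sample string from this report, the sketch yields only the three standalone dates and never borrows a digit from the preceding decimal number. Something like '1.06 Mar 2009', where the day is fused to a decimal, is skipped entirely rather than partially matched.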