Parsing relative dates error with Taiwanese Mandarin #913

Open
@kss149

I am having trouble parsing the string "2分鐘前". It means "2 minutes ago" in Mandarin (checked with DeepL, Google Translate, etc.), but dateparser.parse returns None.

>>> dateparser.parse("2分鐘前")
>>> dateparser.parse("2 分鐘前")
datetime.datetime(2021, 4, 28, 20, 45, 19, 266428)

Adding a space between the number and the characters makes it work, but ideally it should parse without the space.
I think the problem may be that I got the string from a Taiwanese version of a website, but I am no expert.
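
As a stopgap until the unspaced form parses, the space can be inserted programmatically before calling the parser. This is only a minimal sketch: `space_after_number` is a hypothetical helper, not part of dateparser, and the `\u4e00-\u9fff` range is an assumption that the text uses common CJK ideographs.

```python
import re

# Hypothetical helper (not part of dateparser): matches a digit that is
# immediately followed by a CJK ideograph in the U+4E00..U+9FFF block.
_DIGIT_THEN_CJK = re.compile(r"(\d)([\u4e00-\u9fff])")


def space_after_number(text: str) -> str:
    """Insert a single space between a digit and a following CJK character."""
    return _DIGIT_THEN_CJK.sub(r"\1 \2", text)


print(space_after_number("2分鐘前"))  # prints "2 分鐘前"
```

The result can then be passed to dateparser.parse, which handles the spaced form in the session above. Strings that already contain the space are left unchanged.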

I translated "2 minutes ago" back to Mandarin and got "2分钟前" (notice that the character in the middle is different), which parses fine with and without a space. It would be great if the ?Taiwanese? version worked as well.

>>> dateparser.parse("2分钟前")
datetime.datetime(2021, 4, 28, 20, 45, 32, 129967)
>>> dateparser.parse("2 分钟前")
datetime.datetime(2021, 4, 28, 20, 45, 35, 659334)

    Labels

    Type: Bug - Language (subtype of bug, related to language data)