Date simplification technique is too simple #214

@stucka

Several transformers try to grab the first hunk of text before a space to determine a date. That's not a great approach if that first hunk of text is too small to be a valid date and also too small to be a good quasi-unique identifier.

In New York, for example, there's an American Airlines entry for "2 /12 /2021" that comes in as simply "2", which could conflict with other bad entries.

If the first hunk is shorter than the shortest plausible date (e.g., 1/1/23, which is six characters), the whole string should probably be passed along for a match instead.

The current line:

value = value.split()[0].replace(",", "").replace(";", "")

could be something like:

patched = value.split()[0].replace(",", "").replace(";", "")
if len(patched) >= 6:
    value = patched
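
For illustration, the same idea could be wrapped in a small helper so each transformer applies it consistently. This is only a sketch: the name `simplify_date` and the `MIN_DATE_LENGTH` constant are hypothetical and not part of the existing code, and the guard for empty input is an extra assumption, since `value.split()[0]` raises an `IndexError` on an empty or whitespace-only string.

```python
# Hypothetical constant: "1/1/23" is six characters, the shortest date considered here.
MIN_DATE_LENGTH = 6


def simplify_date(value: str, min_length: int = MIN_DATE_LENGTH) -> str:
    """Return the first whitespace-delimited hunk if it is long enough to be
    a date; otherwise return the original string unchanged."""
    parts = value.split()
    if not parts:
        # Assumption: empty or whitespace-only input is passed through as-is.
        return value
    patched = parts[0].replace(",", "").replace(";", "")
    if len(patched) >= min_length:
        return patched
    # First hunk is too short to be a date, so keep the whole string for matching.
    return value


print(simplify_date("2 /12 /2021"))          # first hunk "2" is too short
print(simplify_date("2/12/2021, revised"))   # first hunk is a full date
```

With this version, the New York entry keeps its full text rather than collapsing to "2", while well-formed entries are still trimmed to the leading date.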
