Row conditions with accents in the value are not parsable #11026

Open
@Gudsfile

Describe the bug

When a row_condition contains an accented character, for example in the value being compared, validation crashes with a parsing error.
The parser in great_expectations/expectations/row_conditions.py only accepts characters listed in condition_value_chars, and that list is built from pyparsing's alphanums (ASCII letters and digits), which contains no accented characters. I think pyparsing's alphas8bit characters should be added to the list. That is, change:

condition_value_chars = alphanums + punctuation_without_apostrophe + WHITESPACE_CHARS
condition_value = Suppress('"') + Word(f"{condition_value_chars}._").setResultsName(
    "condition_value"
) + Suppress('"') ^ Suppress("'") + Word(f"{condition_value_chars}._").setResultsName(
    "condition_value"
) + Suppress("'")

to

condition_value_chars = alphanums + punctuation_without_apostrophe + WHITESPACE_CHARS + alphas8bit
# …
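
To illustrate the effect of the proposed change, here is a minimal sketch that builds the quoted-value parser with and without alphas8bit. It is not the library's code: PUNCTUATION and WHITESPACE are simplified stand-ins for punctuation_without_apostrophe and WHITESPACE_CHARS (their real definitions are not shown above), and build_condition_value and parses are hypothetical helper names.

```python
from pyparsing import ParseException, Suppress, Word, alphanums, alphas8bit

# Assumption: simplified stand-ins for the module's own constants.
PUNCTUATION = "!#$%&()*+,-./:;<=>?@[\\]^_`{|}~"
WHITESPACE = " \t"


def build_condition_value(extra_chars=""):
    """Build a quoted-value parser; extra_chars widens the allowed set."""
    chars = alphanums + PUNCTUATION + WHITESPACE + extra_chars
    double_quoted = Suppress('"') + Word(chars)("condition_value") + Suppress('"')
    single_quoted = Suppress("'") + Word(chars)("condition_value") + Suppress("'")
    return double_quoted ^ single_quoted


def parses(parser, text):
    """Return True if the whole text is accepted by the parser."""
    try:
        parser.parse_string(text, parse_all=True)
        return True
    except ParseException:
        return False


ascii_only = build_condition_value()             # current behaviour
with_latin1 = build_condition_value(alphas8bit)  # proposed behaviour

value = '"V\u00e0lueWithAccent"'  # "VàlueWithAccent" with a precomposed à
print(parses(ascii_only, value))   # rejected: à is not in alphanums
print(parses(with_latin1, value))  # accepted once alphas8bit is allowed
```

Note that alphas8bit only covers the accented letters of the Latin-1 range (U+00C0 to U+00FF), so values in other scripts would presumably still fail; pyparsing's pyparsing_unicode ranges might be an option if broader coverage is wanted.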

To Reproduce

Add an expectation whose row_condition contains an accented character:

import great_expectations.expectations as gxe

gxe.ExpectColumnDistinctValuesToBeInSet(
    column="column_name",
    value_set=["value_a", "value_b"],
    condition_parser="great_expectations",
    row_condition='col("column_partition")=="VàlueWithAccent"',
)

This results in the following error:

unable to parse condition: col("country_form")=='VàlueWithAccent'
File "…/site-packages/pyparsing/core.py", line 1212, in parse_string
raise exc.with_traceback(None)
pyparsing.exceptions.ParseException: Expected "'", found 'àlueWithAccent' (at char 23), (line:1, col:24)
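
The position reported in the error can be checked by hand. The sketch below does not call pyparsing; it only emulates scanning the quoted value against ASCII letters and digits (the same characters as pyparsing's alphanums) and reports where the scan stops. The helper name first_rejected_index is hypothetical.

```python
import string

# ASCII letters and digits, the same characters as pyparsing's alphanums.
ALPHANUMS = string.ascii_letters + string.digits


def first_rejected_index(condition, value_start, allowed=ALPHANUMS):
    """Return the index of the first character from value_start not in allowed."""
    for i in range(value_start, len(condition)):
        if condition[i] not in allowed:
            return i
    return len(condition)


# The condition from the error message, with à written as an escape.
condition = "col(\"country_form\")=='V\u00e0lueWithAccent'"
value_start = condition.index("'") + 1  # first character inside the quotes
stop = first_rejected_index(condition, value_start)
print(stop)  # 23, matching "at char 23" in the traceback
```

The scan accepts "V" and stops at "à", so the parser then expects the closing quote at that position and fails.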

Expected behavior

The expectation is validated without a parsing error, and the row condition is taken into account.
