Describe the bug
When a row_condition contains an accented character, for example in the value being compared, the expectation fails with a parsing error.
The cause is in great_expectations/expectations/row_conditions.py: the row_condition parser only accepts values made of the characters in condition_value_chars, and that set is built from pyparsing's alphanums, which contains no accented letters. I think pyparsing's alphas8bit characters should be added to the set. That is, change:
condition_value_chars = alphanums + punctuation_without_apostrophe + WHITESPACE_CHARS
condition_value = Suppress('"') + Word(f"{condition_value_chars}._").setResultsName(
    "condition_value"
) + Suppress('"') ^ Suppress("'") + Word(f"{condition_value_chars}._").setResultsName(
    "condition_value"
) + Suppress("'")
to
condition_value_chars = alphanums + punctuation_without_apostrophe + WHITESPACE_CHARS + alphas8bit
# …
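The effect of the proposed change can be sketched in isolation from Great Expectations. This is only an illustration under simplifying assumptions: `quoted_value`, `ascii_only`, and `with_latin1` are hypothetical names, the character set is reduced to `alphanums` plus `._` rather than the full condition_value_chars, and pyparsing 3 method names are assumed.

```python
from pyparsing import ParseException, Suppress, Word, alphanums, alphas8bit


def quoted_value(chars: str):
    """Hypothetical helper: parser for a double-quoted value built from `chars`."""
    return Suppress('"') + Word(chars).set_results_name("condition_value") + Suppress('"')


# Simplified stand-ins for the current and the proposed character sets.
ascii_only = quoted_value(alphanums + "._")
with_latin1 = quoted_value(alphanums + alphas8bit + "._")

sample = '"VàlueWithAccent"'

try:
    ascii_only.parse_string(sample, parse_all=True)
    ascii_parsed = True
except ParseException:
    # Word() stops at "à", so the closing quote is not where the parser expects it.
    ascii_parsed = False

# With alphas8bit included, the whole value is consumed.
parsed_value = with_latin1.parse_string(sample, parse_all=True)["condition_value"]
```

Note that alphas8bit only covers Latin-1 letters, so a value containing a character outside that range (for example "ő") would presumably still be rejected.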
To Reproduce
Add an expectation whose row_condition contains an accented character:
gxe.ExpectColumnDistinctValuesToBeInSet(
    column="column_name",
    value_set=["value_a", "value_b"],
    condition_parser="great_expectations",
    row_condition='col("column_partition")=="VàlueWithAccent"',
)
This results in the following error:
unable to parse condition: col("country_form")=='VàlueWithAccent'
File "…/site-packages/pyparsing/core.py", line 1212, in parse_string
raise exc.with_traceback(None)
pyparsing.exceptions.ParseException: Expected "'", found 'àlueWithAccent' (at char 23), (line:1, col:24)
Expected behavior
The expectation is evaluated without parsing errors, and the row condition is applied.