-
Notifications
You must be signed in to change notification settings - Fork 138
feat: support special characters in column names for filter parsing #557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: staging
Are you sure you want to change the base?
feat: support special characters in column names for filter parsing #557
Conversation
- Enhanced filter regex to handle quoted column names with double quotes and backticks - Added support for column names containing hyphens, spaces, dots, and other special characters - Updated GetFilter() method to extract column names from multiple capture groups - Added comprehensive test suite with 15+ test cases covering various scenarios: - Quoted column names with spaces and special characters - Mixed quoting styles (double quotes and backticks) - Complex filters with AND/OR logical operators - Edge cases and error conditions - Resolves issue with parsing filters for columns like "user-id", "order date", "item.price" Examples now supported: - "user-id" > 5 and `order date` = "2024-01-01" - `item.price` >= 100.50 or "column name" != "value"
|
Hi @GergesHany , |
|
@vaibhav-datazip Done |
… and use regex match indices directly in Condition struct initialization.
|
Hi @vaibhav-datazip, just a quick reminder to review my PR when you get a chance! |
will review your PR soon @GergesHany |
|
LGTM |
@vaibhav-datazip Thanks for the approval! I hope proceed with the merge soon |
types/stream_configured_test.go
Outdated
| { | ||
| name: "three conditions (not supported)", | ||
| filter: "a = 1 and b = 2 and c = 3", | ||
| expectError: true, | ||
| }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this case is already added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove this as you have already added this case above named too many conditions
|
hi @GergesHany feel free to join our slack to take the lead of this pr if you are facing issues |
|
@vaibhav-datazip Done |
thanks @GergesHany , will be reviewing soon |
types/stream_configured_test.go
Outdated
| func TestGetFilterManual(t *testing.T) { | ||
| testCases := []string{ | ||
| `status = "active"`, | ||
| `"user-id" > 5`, | ||
| "`order date` = \"2024-01-01\"", | ||
| `"user-id" > 5 and ` + "`item.price` <= 100", | ||
| `age >= 18 or status != "inactive"`, | ||
| } | ||
|
|
||
| for _, filter := range testCases { | ||
| cs := &ConfiguredStream{ | ||
| StreamMetadata: StreamMetadata{ | ||
| Filter: filter, | ||
| }, | ||
| } | ||
|
|
||
| result, err := cs.GetFilter() | ||
| if err != nil { | ||
| t.Logf("Filter: %s -> ERROR: %v", filter, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have recently merged unit tests into staging you can see that and make changes to make unit tests consistent , we are using testify for assert
types/stream_configured_test.go
Outdated
| func TestGetFilterManual(t *testing.T) { | ||
| testCases := []string{ | ||
| `status = "active"`, | ||
| `"user-id" > 5`, | ||
| `age >= 18 or status != "inactive"`, | ||
| } | ||
|
|
||
| for _, filter := range testCases { | ||
| cs := &ConfiguredStream{ | ||
| StreamMetadata: StreamMetadata{ | ||
| Filter: filter, | ||
| }, | ||
| } | ||
|
|
||
| result, err := cs.GetFilter() | ||
| if err != nil { | ||
| t.Logf("Filter: %s -> ERROR: %v", filter, err) | ||
| } else { | ||
| t.Logf("Filter: %s -> Conditions: %+v, LogicalOp: %s", filter, result.Conditions, result.LogicalOperator) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this manual testing function when you have already added another function which tests get filter ?
| import ( | ||
| "strings" | ||
| "testing" | ||
| ) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use testify for checking the test cases. if you have doubt, you can see the following unit test related files
- olake/types/catalog_test.go
- olake/utils/typeutils/resolver_test.go
- olake/utils/typeutils/flatten_test.go
| }, | ||
| expectError: false, | ||
| }, | ||
| // Supports double-quoted column identifiers (e.g., contains hyphen). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one more test case can be added (if its not already present) , where column name is not passed in quotes
types/stream_configured_test.go
Outdated
| }, | ||
| // Simple comparison without spaces around operator. | ||
| { | ||
| name: "test case from user: a>b", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is test case from user: written in some of the test cases, we can give them better names
types/stream_configured_test.go
Outdated
| { | ||
| name: "triple equals with greater than", | ||
| filter: "a >=== b", | ||
| expectError: true, | ||
| }, | ||
| // Nonsensical operator sequences (four equals with >) should error. | ||
| { | ||
| name: "four equals with greater than", | ||
| filter: "a ====> b", | ||
| expectError: true, | ||
| }, | ||
| // Nonsensical operator sequences (four equals with <) should error. | ||
| { | ||
| name: "four equals with less than", | ||
| filter: "a ====< b", | ||
| expectError: true, | ||
| }, | ||
| // Nonsensical operator sequences mixing equals and not-equals should error. | ||
| { | ||
| name: "triple equals with not equal", | ||
| filter: "a ===!= b", | ||
| expectError: true, | ||
| }, | ||
| // Multiple equals signs (===) are invalid in this grammar. | ||
| { | ||
| name: "multiple equals signs", | ||
| filter: "a === b", | ||
| expectError: true, | ||
| }, | ||
| // Quadruple equals are invalid and should error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for these cases, names don't look good, you can use multiple instead of using triple quadruple and also remove redundant looking cases from here as well
types/stream_configured_test.go
Outdated
| filter: `"col\"name" = "val\"ue"`, | ||
| expectError: true, | ||
| }, | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove unnecessary line breaks
types/stream_configured_test.go
Outdated
| // Missing value after '=' should error. | ||
| { | ||
| name: "empty value after operator", | ||
| filter: "col = ", | ||
| expectError: true, | ||
| }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this kind of test case is already present named missing value
types/stream_configured_test.go
Outdated
| { | ||
| name: "unquoted column with trailing space before op", | ||
| filter: "col = 5", | ||
| expectedFilter: Filter{ | ||
| Conditions: []Condition{ | ||
| {Column: "col", Operator: "=", Value: "5"}, | ||
| }, | ||
| LogicalOperator: "", | ||
| }, | ||
| expectError: false, | ||
| }, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test case is also covered in excessive whitespace
types/stream_configured_test.go
Outdated
| if actualCondition.Column != expectedCondition.Column { | ||
| t.Errorf("Condition[%d] Column mismatch: expected %q, got %q", i, expectedCondition.Column, actualCondition.Column) | ||
| } | ||
|
|
||
| if actualCondition.Operator != expectedCondition.Operator { | ||
| t.Errorf("Condition[%d] Operator mismatch: expected %q, got %q", i, expectedCondition.Operator, actualCondition.Operator) | ||
| } | ||
|
|
||
| if actualCondition.Value != expectedCondition.Value { | ||
| t.Errorf("Condition[%d] Value mismatch: expected %q, got %q", i, expectedCondition.Value, actualCondition.Value) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of these you can use assert from testify library
|
please also check older unresolved comments |
All comments have been resolved. |
Description
Examples now supported:
order date= "2024-01-01"item.price>= 100.50 or "column name" != "value"Fixes #498
Type of change
How Has This Been Tested?
Note
Improves filter parsing to support quoted column names, broader value formats, logical operators, and adds a comprehensive test suite.
types/stream_configured.go):strings.TrimSpace.NULLvalues.and/or).LogicalOperatorhandling.types/stream_configured_test.go):Written by Cursor Bugbot for commit a61b0fc. This will update automatically on new commits. Configure here.