[MINORBUMP] ExpectColumnDistinctValuesToEqualSet with database-pushed comparison
#11616
Merged
Changes from 75 commits
79 commits
3ea549d Optimize expect_column_distinct_values_to_equal_set with database-pus… (NathanFarmer)
28482f9 Fix circular import: move ValidationDependencies to TYPE_CHECKING block (NathanFarmer)
79f2b07 Fix backward compatibility: return observed_value and handle type coe… (NathanFarmer)
b34f285 Fix: cast partial_unexpected_count to int, only include unexpected_co… (NathanFarmer)
65dbe17 Fix type error: handle partial_unexpected_count type safely (NathanFarmer)
3a4eeb9 Add integration tests for column.distinct_values.not_equal_set metric (NathanFarmer)
3378358 Add date comparison tests for column.distinct_values.not_equal_set me… (NathanFarmer)
a3f653c Remove test_dates_with_str_value_set from metric tests (NathanFarmer)
de6d219 Add result format integration tests for ExpectColumnDistinctValuesToE… (NathanFarmer)
08843f7 Merge branch 'develop' into m/gx-2374/distinct-values-equal-set (NathanFarmer)
c150c43 Fix type errors: use result.result instead of to_json_dict (NathanFarmer)
8645225 Fix value_counts comparison: use to_json_dict for proper serialization (NathanFarmer)
1ab245a Fix type errors: compare full result dict instead of nested access (NathanFarmer)
88d0001 Merge branch 'develop' into m/gx-2374/distinct-values-equal-set (NathanFarmer)
62cf62c Remove unnecessary fallback to column.value_counts metric (NathanFarmer)
1857d71 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
6abfb6c Trigger build (NathanFarmer)
7121c57 BREAKING: Remove column.value_counts and column.distinct_values (NathanFarmer)
b4a01ee Update tests for breaking changes in expect_column_distinct_values_to… (NathanFarmer)
92350b3 Change result format: observed_value=None, violations in partial_unex… (NathanFarmer)
114216f Restore renderer to show unexpected and missing values from details (NathanFarmer)
2737215 Revert expectation to original column.value_counts implementation (NathanFarmer)
4dfdf4f Revert test expectations to match original behavior (NathanFarmer)
8a42013 Merge branch 'develop' into m/gx-2374/distinct-values-equal-set (NathanFarmer)
e0eb783 Merge branch 'develop' into m/gx-2374/distinct-values-equal-set (NathanFarmer)
c199bc8 Limit observed_value to 1000 values to prevent 413 payload errors (NathanFarmer)
79b7f7b Limit value_counts to 1000 items to prevent 413 payload errors (NathanFarmer)
c6e78fc Merge branch 'develop' into m/gx-2374/distinct-values-equal-set (NathanFarmer)
e121b93 Use shared MAX_DISTINCT_VALUES constant (500) to limit payload size (NathanFarmer)
c57a279 Implement database-pushdown for distinct values expectations (NathanFarmer)
f334b52 Fix circular import by moving MAX_DISTINCT_VALUES to constants.py (NathanFarmer)
c396e28 Move MAX_RESULT_RECORDS to constants.py for consistency (NathanFarmer)
0198525 Fix mypy errors: update imports and add type ignore comments (NathanFarmer)
6dfaa70 Fix remaining MAX_RESULT_RECORDS imports in test files (NathanFarmer)
0355ce7 Update tests for database-pushdown result format (NathanFarmer)
0163c4b Fix tests: add type coercion to metrics and update equal_set renderer… (NathanFarmer)
4790869 Revert be_in_set and contain_set integration tests to OLD format (NathanFarmer)
398a7a9 Use original ov__/exp__ prefixes in equal_set renderer (NathanFarmer)
8009411 Merge branch 'develop' into m/gx-2374/distinct-values-equal-set (NathanFarmer)
7d2e31f Trigger build (NathanFarmer)
dbfca64 Fix mypy error: add type annotation for coerced_set (NathanFarmer)
6cca763 Add SQL type coercion for string dates to fix BigQuery tests (NathanFarmer)
a860163 Add database-pushdown metrics for distinct values set comparisons (NathanFarmer)
2a0fec2 Refactor distinct values metrics into separate files (NathanFarmer)
a6506e3 Define _SQLALCHEMY_1_4_OR_GREATER locally in each file that uses it (NathanFarmer)
70792ee Rename missing_from_set metrics to missing_from_column (NathanFarmer)
8f6d6a6 Use ScalarValue type alias instead of Any for coercion functions (NathanFarmer)
100658d Merge branch 'm/gx-2374/distinct-values-metrics' into m/gx-2374/disti… (NathanFarmer)
f6b692f fix: remove duplicate metric class definitions causing F811 errors (NathanFarmer)
dedd2ba fix: rename metric references from missing_from_set to missing_from_c… (NathanFarmer)
bad963e fix: remove duplicate type coercion helper functions (NathanFarmer)
4f61284 refactor: remove unnecessary get_validation_dependencies override - k… (NathanFarmer)
e6ca449 Merge branch 'develop' into m/gx-2374/distinct-values-equal-set (NathanFarmer)
d32e57d fix: handle None limit in Spark/SQL implementations to prevent py4j e… (NathanFarmer)
1ad3779 chore: remove unused distinct_values_not_equal_set metric (NathanFarmer)
93dcea6 feat: add missing_count/partial_missing_list and unexpected_count/par… (NathanFarmer)
6031ab5 fix: update integration test result format for equal_set (NathanFarmer)
5cc55fb docs: add partial_missing_list to result format documentation (NathanFarmer)
cce2d28 docs: clarify missing_count and partial_missing_list for distinct val… (NathanFarmer)
f532044 test: add result_format unit tests for equal_set (NathanFarmer)
b6fc1bd Merge branch 'develop' into m/gx-2374/distinct-values-equal-set (NathanFarmer)
f1ea710 feat: respect partial_unexpected_count setting for partial_missing_list (NathanFarmer)
f5100b9 feat: use MAX_DISTINCT_VALUES (500) for partial lists limit in equal_set (NathanFarmer)
80967ab fix: use default partial_unexpected_count of 20, not 500 (NathanFarmer)
beef742 fix: change metric default limit to MAX_DISTINCT_VALUES (500) and rem… (NathanFarmer)
ecd978b feat: default partial_unexpected_count to MAX_DISTINCT_VALUES (500) f… (NathanFarmer)
8df73c4 docs: update partial_unexpected_count default to 500 for distinct val… (NathanFarmer)
ba16e60 fix: add all four new metrics to public API exports (NathanFarmer)
8b927d9 fix: reduce MAX_DISTINCT_VALUES from 500 to 200 to prevent 413 errors (NathanFarmer)
2c88c60 docs: fix partial_missing_list default to 200 for distinct values Exp… (NathanFarmer)
a840e11 fix: actually slice partial lists to partial_unexpected_count (NathanFarmer)
7ac3102 fix: always use MAX_DISTINCT_VALUES for distinct values expectations (NathanFarmer)
81d81c4 fix: change MAX_DISTINCT_VALUES back to 500 (NathanFarmer)
42163a7 Change MAX_DISTINCT_VALUES from 500 to 20 (NathanFarmer)
f6fe7e4 Fix redundant '20 or 20' in docs to just '20' (NathanFarmer)
88a713b Merge branch 'develop' into m/gx-2374/distinct-values-equal-set (NathanFarmer)
597aade Merge branch 'develop' into m/gx-2374/distinct-values-equal-set (NathanFarmer)
120568f Add pandas 3.0 logic (NathanFarmer)
dd37ccd Fix type error (NathanFarmer)
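Several commits in this list refer to "database-pushdown" for the distinct-values set comparison, with separate unexpected and missing-from-column results. The sketch below illustrates that general idea using only the standard-library `sqlite3` module. It is not the implementation from this PR: the function name `distinct_set_differences`, its parameters, and the SQL are assumptions made for illustration, and the real metrics presumably go through SQLAlchemy and handle type coercion, which is omitted here.

```python
import sqlite3


def distinct_set_differences(conn, table, column, expected, limit=20):
    """Compare a column's distinct values with `expected` inside the database.

    Hypothetical helper: only the (limited) differences are fetched, never
    the full list of distinct values. `table` and `column` are interpolated
    into the SQL text, so they must come from trusted code, not user input.
    `limit` must be at least 1 for the success flag to be meaningful.
    """
    expected = list(expected)
    if not expected:
        raise ValueError("this sketch requires a non-empty expected set")

    in_list = ", ".join("?" for _ in expected)
    # Distinct non-null column values that are absent from the expected set.
    unexpected_sql = (
        f'SELECT DISTINCT "{column}" FROM "{table}" '
        f'WHERE "{column}" IS NOT NULL AND "{column}" NOT IN ({in_list}) '
        f"ORDER BY 1 LIMIT ?"
    )
    unexpected = [r[0] for r in conn.execute(unexpected_sql, [*expected, limit])]

    value_rows = ", ".join("(?)" for _ in expected)
    # Expected values that never occur in the column.
    missing_sql = (
        f"WITH expected(v) AS (VALUES {value_rows}) "
        f"SELECT v FROM expected WHERE v NOT IN "
        f'(SELECT "{column}" FROM "{table}" WHERE "{column}" IS NOT NULL) '
        f"ORDER BY 1 LIMIT ?"
    )
    missing = [r[0] for r in conn.execute(missing_sql, [*expected, limit])]

    return {
        "success": not unexpected and not missing,
        "partial_unexpected_list": unexpected,
        "partial_missing_list": missing,
    }
```

Because both queries carry a `LIMIT`, the amount of data returned to the client stays bounded no matter how many distinct values the column holds, which is the point of pushing the comparison down.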
@@ -1 +1,10 @@
from typing import Final

DATAFRAME_REPLACEMENT_STR = "<DATAFRAME>"

# Maximum number of result records to return in expectation results
MAX_RESULT_RECORDS: Final[int] = 200

# Maximum number of distinct values to return in expectation results
# to prevent payload size issues (e.g., HTTP 413 errors with GX Cloud)
MAX_DISTINCT_VALUES: Final[int] = 20
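To make the purpose of `MAX_DISTINCT_VALUES` concrete, here is a minimal sketch of how such a cap might be applied when building a result payload. The helper `build_partial_result` is hypothetical and not part of this PR; the result keys follow the names mentioned in the commit messages (`unexpected_count`, `partial_unexpected_list`, `missing_count`, `partial_missing_list`), and the constant is redefined locally so the example is self-contained.

```python
from typing import Final

# Same value as in the diff; redefined here so the sketch stands alone.
MAX_DISTINCT_VALUES: Final[int] = 20


def build_partial_result(unexpected, missing, limit=MAX_DISTINCT_VALUES):
    """Hypothetical helper: report full counts but only capped value lists."""
    unexpected = sorted(unexpected)
    missing = sorted(missing)
    return {
        # Counts describe the full difference ...
        "unexpected_count": len(unexpected),
        "missing_count": len(missing),
        # ... while the lists are sliced so the payload size stays bounded.
        "partial_unexpected_list": unexpected[:limit],
        "partial_missing_list": missing[:limit],
    }
```

Keeping the counts separate from the truncated lists means a caller can still see how large the mismatch is even when only the first 20 values are returned.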