fix: write_deltalake with mode="overwrite" and schema_mode=None does not overwrite schema metadata
#3747
ACTION NEEDED: delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
write_deltalake with mode="overwrite" and schema_mode=None does not overwrite schema metadata
Force-pushed from 50a7e2d to 3dc1ad5
|
Do we feel covered by existing tests for the following cases:
|
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3747      +/-   ##
==========================================
+ Coverage   75.37%   75.55%   +0.17%
==========================================
  Files         145      145
  Lines       43946    44424     +478
  Branches    43946    44424     +478
==========================================
+ Hits        33126    33565     +439
+ Misses       9217     9215       -2
- Partials     1603     1644      +41
|
I believe this should already be caught by existing tests, and it should also raise at the beginning of schema checking.
I believe this should already be caught by existing tests, and it should also raise at the beginning of schema checking.
I guess it depends on the type of change. If we go from not nullable to nullable with schema_mode=None, then it's incorrect. If, however, our data gets cast from nullable to not nullable while the table_schema remains the same, that's logically correct, but the data should ideally not be changed because the cast affects the Parquet schema.
|
Can you expand more on what that means and any implications for this PR? I can't tell if what you said is just regarding testing or regarding the behavior this PR should change. |
My first two points were rather that this shouldn't require changing existing tests, since that's already covered. For the last remark, I feel we need to add a test to be sure.
|
Added 2 test cases - please take a look. |
Signed-off-by: Frank Portman <[email protected]>
Force-pushed from 8380a5f to a982489
|
Also added a more specific test for |
|
@rtyler you may want to squash my commits and give it the message from the PR title. That was my assumption as to what would happen when I gave relatively uninformative intra-PR commits, but it seems like they all landed as is 😄 . Thanks for taking a look! |
Description
write_deltalake with mode="overwrite" and schema_mode=None was overwriting the nullability constraints of various columns in the schema, instead of accepting the existing schema fully (or failing on drift, if there was actually schema drift besides column constraints).
Upon reviewing the merge operation, it turned out that it worked as intended, but this PR still introduces a test to ensure the schema behavior is consistent between the two.
Related Issue(s)
Closes #3744
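For readers skimming the thread, the intended rule from the description can be modeled with a small, dependency-free sketch. This is a toy model only, not delta-rs code: the `Field` class and the `resolve_overwrite_schema` helper are hypothetical names invented for illustration, and the type names are plain strings rather than real Delta or Arrow types. It assumes that with schema_mode=None an overwrite keeps the existing table schema (including nullability) when only column constraints differ, and fails on any other drift.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field (not a delta-rs type)."""
    name: str
    dtype: str
    nullable: bool = True


def resolve_overwrite_schema(table_schema, data_schema, schema_mode=None):
    """Toy model of which schema a mode="overwrite" write should end up with.

    - schema_mode="overwrite": the incoming data schema replaces the table schema.
    - schema_mode=None: the existing table schema is kept as-is, provided the
      incoming data differs only in nullability; any other drift raises.
    """
    if schema_mode == "overwrite":
        return list(data_schema)
    if schema_mode is not None:
        raise NotImplementedError(f"schema_mode={schema_mode!r} is not modeled here")

    # Compare names and types only; nullability is deliberately ignored.
    existing = [(f.name, f.dtype) for f in table_schema]
    incoming = [(f.name, f.dtype) for f in data_schema]
    if existing != incoming:
        raise ValueError("schema drift beyond column constraints")
    return list(table_schema)


table = [Field("id", "long", nullable=False), Field("value", "string")]
data = [Field("id", "long", nullable=True), Field("value", "string")]

# Only nullability differs, so the existing table schema is preserved.
kept = resolve_overwrite_schema(table, data)
# With schema_mode="overwrite", the data schema wins instead.
replaced = resolve_overwrite_schema(table, data, schema_mode="overwrite")
```

The sketch compares only names and types on purpose, so a nullability difference alone never counts as drift. The real implementation may apply further checks, such as those raised in the review discussion about casting data between nullable and non-nullable columns.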
Documentation
N/A