Closed
Labels
binding/rust (Issues for the Rust crate), bug (Something isn't working), mre-needed (Whether an MRE needs to be provided)
Description
Environment
Delta-rs version: 0.22.2
Binding: python 0.22.2
Environment: python 3.13.0
- OS: macOS 13.7.1 (22H221) / Kernel Version: Darwin 22.6.0
- Other:
Bug
What happened:
When compacting a delta table of about 273 files (380MB) partitioned on a field 'year_month' (3 partitions), the listing of the table files seems invalid:
- new parquet files are correctly added (2 files by partition -> 6 files)
- previous files seem correctly marked as 'remove' (273 'remove' actions in the commit log of the 'OPTIMIZE' operation)
- `dt.files()` lists 205 files, which I don't think is expected
- `dt.get_add_actions()` also lists 205 files, which I'm quite sure is not expected
- when vacuum (with proper params) is run on the table, it seems to rely on the listed files and keeps 205 files
Am I missing something?
Log file of the 'OPTIMIZE' commit:
00000000000000000274.json
Path column of the get_add_actions():
get_add_actions.json
What you expected to happen:
- `dt.files()` should list 6 files?
- `dt.get_add_actions()` should list 6 files
- vacuum should leave only those 6 files untouched
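The expected count can be cross-checked without going through the library by replaying the JSON commits in `_delta_log` by hand. The sketch below is only an illustration, not how delta-rs computes its listing: the helper name `replay_active_files` is made up for this example, it ignores checkpoint files, and it assumes every JSON commit from version 0 onward is still present in the log directory. It relies only on the fact that each commit is newline-delimited JSON whose `add` and `remove` actions carry a `path` field.

```python
import json
from pathlib import Path


def replay_active_files(log_dir):
    """Replay add/remove actions from JSON commits and return the live file paths.

    Hypothetical helper for illustration: checkpoints are ignored, so the log
    directory must still contain every JSON commit starting at version 0.
    """
    active = set()
    # Commit names are zero-padded version numbers, so sorting by name
    # gives version order. Non-commit files (e.g. _last_checkpoint) are skipped.
    commits = sorted(
        p for p in Path(log_dir).glob("*.json") if p.stem.isdigit()
    )
    for commit in commits:
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                active.add(action["add"]["path"])
            elif "remove" in action:
                active.discard(action["remove"]["path"])
    return active
```

Calling `len(replay_active_files("<table>/_delta_log"))` on the table from this report (the path is a placeholder) and comparing the result with `set(dt.files())` would show whether the log itself resolves to 6 files. Paths are compared as raw strings here, so any encoding difference between the log entries and the library output would need to be normalized first.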
How to reproduce it:
I made a repo with the code used for the test. Use the branch deltars-issue-sample: [email protected]:sebvey/delta-optim.git
I made the README.md as clear as possible.