Checkpoints not used by write_deltalake? #3410

Open
@rtyler

Description

Discussed in #2555

Originally posted by VLomonovskis May 30, 2024
Hello.

I have several processes that use the same code and append data to the same Delta table. These processes run in parallel. I append data using write_deltalake and use the rust engine to merge schemas.

As the processes add data, performance degrades and each upload takes longer. As I understand it, this happens because the number of transaction log files increases. However, when I create a checkpoint (using delta_table.checkpoint()), performance does not improve, and it looks like write_deltalake still reads all the log files from before the checkpoint. Can this behaviour be changed?

I did see discussions about checkpoints, but they were about checkpoint creation. In my case, checkpoints are not used even when they exist.
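
To check how much of the log a reader should need to replay once a checkpoint exists, here is a minimal sketch using only the Python standard library. It assumes the table lives on the local filesystem and follows the Delta protocol layout (a `_delta_log` directory with 20-digit zero-padded `.json` commit files and an optional `_last_checkpoint` pointer file). The helper name `commits_after_checkpoint` is hypothetical and not part of the deltalake package; it does not measure what write_deltalake actually reads.

```python
import json
import re
from pathlib import Path

# Commit files use a zero-padded 20-digit version, e.g. 00000000000000000012.json
COMMIT_RE = re.compile(r"^(\d{20})\.json$")


def commits_after_checkpoint(table_path):
    """Return (checkpoint_version, commit versions newer than that checkpoint).

    Hypothetical helper for illustration only.
    checkpoint_version is None when no _last_checkpoint file exists,
    in which case every commit version is returned.
    """
    log_dir = Path(table_path) / "_delta_log"
    last_checkpoint = log_dir / "_last_checkpoint"

    checkpoint_version = None
    if last_checkpoint.exists():
        # _last_checkpoint is a small JSON document with a "version" field
        checkpoint_version = json.loads(last_checkpoint.read_text())["version"]

    # Collect the versions of all JSON commit files in the log directory
    versions = sorted(
        int(m.group(1))
        for p in log_dir.iterdir()
        if (m := COMMIT_RE.match(p.name))
    )
    if checkpoint_version is not None:
        versions = [v for v in versions if v > checkpoint_version]
    return checkpoint_version, versions
```

If the second element of the result stays short after a checkpoint is written while write times keep growing, that would be consistent with the writer replaying commits from before the checkpoint rather than starting from it.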

Labels

binding/rust (Issues for the Rust crate), bug (Something isn't working)
