post_commithook_properties checkpoint flag is not honored #3780

@havrylenkok

Environment

Delta-rs version: 1.1.4

Binding:

Environment:

  • Cloud provider: AWS
  • OS:
  • Other: python 3.11, table is written to S3 with DDB locking

Bug

What happened:

write_deltalake(
    dt,
    arrow_table,
    mode="append",
    schema_mode=None,
    storage_options=storage_options,
    writer_properties=writer_properties,
    post_commithook_properties=PostCommitHookProperties(create_checkpoint=False),
    partition_by=["partition_date"],
)

This call still creates a checkpoint every 100 commits, even though create_checkpoint=False is passed.

What you expected to happen:

No automatic checkpoints are created.

How to reproduce it:

Run the snippet repeatedly against the same table until it has 101 commits.
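A minimal local sketch of that loop, in case it helps with triage. It assumes the behavior is not specific to S3 or DDB locking (not verified here), and the local table path, the `id` column, and the constant `partition_date` value are placeholders rather than anything from the real table. `list_checkpoint_files` is a hypothetical helper that only inspects file names in `_delta_log`.

```python
from pathlib import Path


def list_checkpoint_files(log_dir):
    """Return sorted names of checkpoint-related files in a _delta_log directory."""
    names = [p.name for p in Path(log_dir).iterdir() if p.is_file()]
    # Checkpoint files contain ".checkpoint." in their name; _last_checkpoint
    # is the pointer file that readers use to find the latest checkpoint.
    return sorted(n for n in names if ".checkpoint." in n or n == "_last_checkpoint")


def reproduce(table_path, n_commits=101):
    """Append n_commits single-row batches with create_checkpoint=False."""
    # Imported lazily so the helper above works without deltalake installed.
    import pyarrow as pa
    from deltalake import PostCommitHookProperties, write_deltalake

    for i in range(n_commits):
        # Placeholder data: one row per commit, constant partition value.
        batch = pa.table({"id": [i], "partition_date": ["2025-01-01"]})
        write_deltalake(
            table_path,
            batch,
            mode="append",
            post_commithook_properties=PostCommitHookProperties(create_checkpoint=False),
            partition_by=["partition_date"],
        )
    # If the flag were honored, this list would be expected to be empty.
    return list_checkpoint_files(Path(table_path) / "_delta_log")
```

Calling `reproduce("/tmp/checkpoint_repro")` on an empty directory and getting a non-empty list back would confirm the report outside of S3.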

More details:

Docs for these properties: https://delta-io.github.io/delta-rs/api/transaction/#deltalake.PostCommitHookProperties

Reading a table created with deltalake in S3 via Athena often fails because this library and Trino disagree on the contents of the last checkpoint file (see trinodb/trino#18760).

I'm trying to disable all automatic checkpoint creation so that, when I create checkpoints manually via table.create_checkpoint, I can fix the file in S3 immediately and avoid breaking queries.
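For context, the manual flow I have in mind looks roughly like the sketch below. `checkpoint_and_fix` and the `fix_last_checkpoint` callback are hypothetical names; the callback stands in for whatever rewrites the file in S3, which is not shown here. The only library call used is `DeltaTable.create_checkpoint()`.

```python
def checkpoint_and_fix(table, fix_last_checkpoint):
    """Create a checkpoint, then immediately run a caller-supplied fix-up step.

    `table` is expected to be a deltalake.DeltaTable (anything exposing
    create_checkpoint() works). `fix_last_checkpoint` is a hypothetical
    callback that rewrites the last checkpoint file so other readers accept it.
    """
    table.create_checkpoint()
    # Run the fix right away to keep the window with the unfixed file short.
    return fix_last_checkpoint(table)


# Usage sketch (the URI is a placeholder, my_fix is user-provided):
# from deltalake import DeltaTable
# dt = DeltaTable("s3://bucket/table", storage_options=storage_options)
# checkpoint_and_fix(dt, my_fix)
```

This only works if nothing else writes a checkpoint between manual runs, which is why the automatic one needs to be off.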

Labels: bug