Table merge crashes without error #3094

@victor-ab

Description

Delta-rs version:
0.22.3

Binding:
Python. I am using Polars, whose write_delta(mode="merge") calls table.merge(data, **delta_merge_options) internally.

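from deltalake import BloomFilterProperties, ColumnProperties, WriterProperties

writer_props = WriterProperties(
    compression="UNCOMPRESSED",
    column_properties={
        # Bloom filter and dictionary encoding on the merge key
        "file_hash": ColumnProperties(
            bloom_filter_properties=BloomFilterProperties(
                set_bloom_filter_enabled=True,
                fpp=0.01,
            ),
            dictionary_enabled=True,
        ),
        # Disable column statistics for file_content
        "file_content": ColumnProperties(max_statistics_size=0),
    },
    statistics_truncate_length=200,
)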

# Initial load: create (or overwrite) the table
df.write_delta(
    's3://mytable',
    mode="overwrite",
    storage_options=storage_options,
    delta_write_options={
        "schema_mode": "overwrite",
        "writer_properties": writer_props,
        "configuration": {"delta.logRetentionDuration": "interval 1 second"},
    },
)

# Subsequent load: merge on file_hash (this is the call that crashes)
(
    df.write_delta(
        's3://mytable',
        mode="merge",
        storage_options=storage_options,
        delta_merge_options={
            "predicate": "target.file_hash = source.file_hash",
            "source_alias": "source",
            "target_alias": "target",
            "writer_properties": writer_props,
        },
    )
    .when_matched_update(updates={"updated_at": "source.updated_at"})
    .when_not_matched_insert_all()
    .execute()
)

Environment:

  • Cloud provider: S3
  • OS: Windows WSL
  • Other: Polars 1.18.0

Bug

[2025-01-01T20:09:18Z DEBUG deltalake_aws::credentials] Located cached credentials
[2025-01-01T20:09:18Z DEBUG deltalake_aws::credentials] Cached credentials are still valid, returning
[2025-01-01T20:09:18Z DEBUG hyper_util::client::legacy::pool] reuse idle connection for ("https", s3.dualstack.us-east-2.amazonaws.com)
[2025-01-01T20:09:18Z DEBUG hyper_util::client::legacy::pool] pooling idle connection for ("https", s3.dualstack.us-east-2.amazonaws.com)
[2025-01-01T20:09:39Z DEBUG hyper_util::client::legacy::client] client connection error: connection closed before message completed
[2025-01-01T20:09:39Z DEBUG hyper_util::client::legacy::client] client connection error: connection closed before message completed
[2025-01-01T20:09:40Z DEBUG hyper_util::client::legacy::client] client connection error: connection closed before message completed
[2025-01-01T20:10:14Z DEBUG hyper_util::client::legacy::pool] pooling idle connection for ("https", s3.dualstack.us-east-2.amazonaws.com)
[2025-01-01T20:10:16Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:16Z DEBUG deltalake_core::operations::writer] Writing file with estimated size 135682254 to disk.
[2025-01-01T20:10:16Z DEBUG deltalake_aws::credentials] AWSForObjectStore is unlocking..
[2025-01-01T20:10:16Z DEBUG deltalake_aws::credentials] Located cached credentials
[2025-01-01T20:10:16Z DEBUG deltalake_aws::credentials] Cached credentials are still valid, returning
[2025-01-01T20:10:16Z DEBUG hyper_util::client::legacy::pool] reuse idle connection for ("https", s3.dualstack.us-east-2.amazonaws.com)
[2025-01-01T20:10:27Z DEBUG hyper_util::client::legacy::pool] pooling idle connection for ("https", s3.dualstack.us-east-2.amazonaws.com)
[2025-01-01T20:10:27Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:27Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:27Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:27Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:27Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:27Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:27Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:27Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:27Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:27Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:27Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:27Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:27Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:27Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:27Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:28Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:29Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:29Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:29Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:29Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:29Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:29Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:29Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:29Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:29Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:29Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:29Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:30Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:31Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:32Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:32Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:33Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:34Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:35Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:35Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:36Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:37Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:37Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:37Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:38Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:38Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:38Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:39Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:39Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:41Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
[2025-01-01T20:10:41Z DEBUG deltalake_core::operations::write] write_execution_plan_with_predicate did not send any batches, no sender.
DEBUG Command exited with signal: Some(9)
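
Since the log above is dominated by one repeated message, a small tally can make it easier to read. The following is only a sketch: it assumes every entry follows the bracketed "[timestamp LEVEL module] message" layout seen above, and the names LOG_LINE and summarize are made up for illustration. It counts identical messages per module and reports the longest pause between consecutive entries.

```python
import re
from collections import Counter
from datetime import datetime

# Matches entries such as:
# [2025-01-01T20:10:16Z DEBUG deltalake_core::operations::write] some message
LOG_LINE = re.compile(r"^\[(\S+)\s+(\w+)\s+([\w:]+)\]\s+(.*)$")


def summarize(lines):
    """Return (message counts, largest gap in seconds between consecutive entries)."""
    counts = Counter()
    previous = None
    largest_gap = 0.0
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that are not in the bracketed format
        timestamp, _level, module, message = match.groups()
        counts[(module, message)] += 1
        current = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
        if previous is not None:
            largest_gap = max(largest_gap, (current - previous).total_seconds())
        previous = current
    return counts, largest_gap


# Usage (debug.log is a hypothetical file holding the output above):
# with open("debug.log") as handle:
#     counts, gap = summarize(handle)
# for (module, message), n in counts.most_common(5):
#     print(n, module, message)
```

Lines that do not match the pattern, such as the final "Command exited with signal" line, are skipped rather than counted.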

What happened:
The merge crashes without raising any error. The process simply exits with signal 9, as shown on the last line of the log.
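
For context, signal 9 on Linux is SIGKILL, which a process cannot catch, so no Python traceback is produced. One common cause is the kernel's out-of-memory killer, but that is an assumption here and nothing in this report confirms it. A minimal sketch for running the job as a child process and decoding how it ended (the helper names and merge_job.py are hypothetical):

```python
import signal
import subprocess


def describe_exit(returncode):
    """Translate a subprocess return code into a readable description (POSIX)."""
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


def run_and_describe(argv):
    """Run a command, wait for it to finish, and describe how it ended."""
    completed = subprocess.run(argv)
    return describe_exit(completed.returncode)


# Usage (merge_job.py is a hypothetical script containing the merge above):
# print(run_and_describe(["python", "merge_job.py"]))
```

If the description names SIGKILL, checking the kernel log of the WSL instance for out-of-memory messages would be a reasonable next step.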

What you expected to happen:
I expected it to execute the merge

How to reproduce it:
I am trying to reproduce it with dummy data and will share a script if I manage to.
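
As a starting point for such a reproduction, here is a sketch of a dummy-data generator using only the standard library. The column names file_hash, file_content and updated_at come from the snippets above; the column types (hex string, bytes, UTC timestamp), the row sizes and the helper names make_rows and make_source are assumptions, since the real schema is not shown.

```python
import hashlib
import random
from datetime import datetime, timezone


def make_rows(n_rows, content_bytes, seed):
    """Build n_rows dicts with pseudo-random content and a SHA-256 file_hash."""
    rng = random.Random(seed)  # seeded so content and hashes are repeatable
    stamp = datetime.now(timezone.utc)
    rows = []
    for _ in range(n_rows):
        content = rng.randbytes(content_bytes)  # requires Python 3.9+
        rows.append(
            {
                "file_hash": hashlib.sha256(content).hexdigest(),
                "file_content": content,
                "updated_at": stamp,
            }
        )
    return rows


def make_source(target_rows, n_matching, n_new, content_bytes, seed):
    """Build a merge source: some rows that match the target plus some new ones."""
    stamp = datetime.now(timezone.utc)
    # Reuse existing hashes so the merge predicate matches these rows.
    matching = [{**row, "updated_at": stamp} for row in target_rows[:n_matching]]
    # Use a different seed than the target so the new rows do not collide.
    return matching + make_rows(n_new, content_bytes, seed)


# Usage: target for the initial overwrite, source for the merge.
# target = make_rows(n_rows=1000, content_bytes=100_000, seed=1)
# source = make_source(target, n_matching=200, n_new=800, content_bytes=100_000, seed=2)
```

Each list of dicts can be passed to pl.DataFrame(...) and written with the write_delta calls above. Increasing content_bytes and n_rows should show whether the crash depends on data volume.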


Labels: bug
