BUG: snapshot summary doesn't consider previous summary #2390

@dentiny

Description

Apache Iceberg Rust version

976b689

Describe the bug

Hi team, IIRC, the current snapshot summary only considers the latest committed transaction and discards the previous ones.

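  • Within a SnapshotProducer, snapshot_id is the ID of the snapshot being created; it is initialized when the producer is constructed
  • When we try to get the previous snapshot summary (to merge with the current one), the lookup always returns None here, presumably because it uses that new snapshot_id, which is not in the table metadata yet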
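
To illustrate the two points above, here is a minimal, self-contained sketch. It does not use the real iceberg-rust types: `Metadata`, `previous_summary_buggy`, and `previous_summary_fixed` are hypothetical stand-ins, and the "fixed" variant is only one plausible direction (look up the summary of the table's current snapshot rather than the one being created), not necessarily how the project will resolve it.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for table metadata:
/// maps each committed snapshot id to its summary properties.
struct Metadata {
    current_snapshot_id: Option<i64>,
    summaries: HashMap<i64, HashMap<String, String>>,
}

/// Mirrors the behavior described in this issue: the lookup key is the id of
/// the snapshot that is about to be created, which is not in the metadata yet,
/// so the result is always `None`.
fn previous_summary_buggy(
    meta: &Metadata,
    new_snapshot_id: i64,
) -> Option<&HashMap<String, String>> {
    meta.summaries.get(&new_snapshot_id)
}

/// Assumed fix: look up the summary of the snapshot the new one builds on,
/// i.e. the table's current snapshot at the time the producer runs.
fn previous_summary_fixed(meta: &Metadata) -> Option<&HashMap<String, String>> {
    meta.current_snapshot_id
        .and_then(|id| meta.summaries.get(&id))
}
```

With one committed snapshot (id 1) and a producer creating snapshot 2, `previous_summary_buggy(&meta, 2)` yields `None`, while `previous_summary_fixed(&meta)` yields the summary of snapshot 1.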

To Reproduce

    #[tokio::test]
    async fn test_transaction_snapshot_summary() {
        let catalog = new_memory_catalog().await;
        let table = make_v3_minimal_table_in_catalog(&catalog).await;

        let mut file_seq = 0u32;
        let mut append_file = |table: &crate::table::Table, record_count: u64, file_size: u64| {
            file_seq += 1;
            let file = DataFileBuilder::default()
                .content(DataContentType::Data)
                .file_path(format!("test/{file_seq}.parquet"))
                .file_format(DataFileFormat::Parquet)
                .file_size_in_bytes(file_size)
                .record_count(record_count)
                .partition(Struct::from_iter([Some(Literal::long(1))]))
                .partition_spec_id(0)
                .build()
                .unwrap();
            let tx = Transaction::new(table);
            tx.fast_append()
                .add_data_files(vec![file])
                .apply(tx)
                .unwrap()
        };

        let table = append_file(&table, /*record_count=*/ 10, /*file_size=*/ 100)
            .commit(&catalog)
            .await
            .unwrap();
        let table = append_file(&table, /*record_count=*/ 20, /*file_size=*/ 200)
            .commit(&catalog)
            .await
            .unwrap();

        let summary = &table
            .metadata()
            .current_snapshot()
            .unwrap()
            .summary()
            .additional_properties;

        assert_eq!(summary.get("total-records").unwrap(), "30");
        assert_eq!(summary.get("total-data-files").unwrap(), "2");
        assert_eq!(summary.get("total-files-size").unwrap(), "300");
    }

The test fails with:

---- transaction::test::test_transaction_snapshot_summary stdout ----

thread 'transaction::test::test_transaction_snapshot_summary' (87414) panicked at crates/iceberg/src/transaction/mod.rs:636:9:
assertion `left == right` failed
  left: "20"
 right: "30"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
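
The "20" versus "30" mismatch is what one would expect if the running total is computed without a previous summary. A rough sketch of the accumulation, assuming append-only commits and the `added-records` / `total-records` summary keys from the Iceberg spec; `update_total` is a hypothetical helper, not a function in iceberg-rust:

```rust
use std::collections::HashMap;

/// Hypothetical helper: carry a running total forward for an append-only
/// commit. The new total is the previous snapshot's total (0 if there is no
/// previous summary or the value is missing/unparseable) plus the amount
/// added by this commit.
fn update_total(
    summary: &mut HashMap<String, String>,
    previous: Option<&HashMap<String, String>>,
    total_key: &str,
    added_key: &str,
) {
    let prev_total: u64 = previous
        .and_then(|p| p.get(total_key))
        .and_then(|v| v.parse().ok())
        .unwrap_or(0);
    let added: u64 = summary
        .get(added_key)
        .and_then(|v| v.parse().ok())
        .unwrap_or(0);
    summary.insert(total_key.to_string(), (prev_total + added).to_string());
}
```

For the second commit in the test, passing the first snapshot's summary (total 10) with 20 added records gives a total of 30; passing `None` as the previous summary gives 20, which matches the failing assertion.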

Expected behavior

No response

Willingness to contribute

I can contribute a fix for this bug independently
