Apache Iceberg Rust version
976b689
Describe the bug
Hi team, IIRC, the current snapshot summary only reflects the latest committed transaction and discards the totals from previous ones.
- Within a SnapshotProducer, snapshot_id is the id of the snapshot to be created, and it is initialized when the producer is constructed.
- When we try to get the previous snapshot summary (to merge with the current one), the lookup always returns None here.
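The suspected failure mode described in the two points above can be sketched in isolation. This is only an illustration under stated assumptions: `TableMetadata`, `Summary`, `previous_summary_buggy`, `previous_summary_fixed`, and `total_records` are hypothetical stand-ins, not the real iceberg-rust types or functions, and the actual lookup in SnapshotProducer may differ in detail.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins; NOT the real iceberg-rust types.
struct Summary {
    total_records: u64,
}

struct TableMetadata {
    current_snapshot_id: Option<i64>,
    snapshots: HashMap<i64, Summary>,
}

/// Suspected buggy lookup: it uses the id reserved for the snapshot that is
/// about to be created. That snapshot is not in the metadata yet, so the
/// result is always None.
fn previous_summary_buggy(meta: &TableMetadata, new_snapshot_id: i64) -> Option<&Summary> {
    meta.snapshots.get(&new_snapshot_id)
}

/// Possible fix: look up the table's current snapshot, which becomes the
/// parent of the snapshot being created.
fn previous_summary_fixed(meta: &TableMetadata) -> Option<&Summary> {
    meta.current_snapshot_id
        .and_then(|id| meta.snapshots.get(&id))
}

/// Merge the previous total (0 if there is none) with the newly added records.
fn total_records(previous: Option<&Summary>, added: u64) -> u64 {
    previous.map_or(0, |s| s.total_records) + added
}

fn main() {
    // State after the first commit: snapshot 1 holds 10 records.
    let mut meta = TableMetadata {
        current_snapshot_id: Some(1),
        snapshots: HashMap::new(),
    };
    meta.snapshots.insert(1, Summary { total_records: 10 });

    // Second commit adds 20 records; id 2 is reserved for the new snapshot.
    let new_snapshot_id = 2;
    let buggy = total_records(previous_summary_buggy(&meta, new_snapshot_id), 20);
    let fixed = total_records(previous_summary_fixed(&meta), 20);

    println!("buggy = {buggy}, fixed = {fixed}"); // prints "buggy = 20, fixed = 30"
}
```

In this model the buggy path yields 20, matching the failure reported below, while resolving the parent through the current snapshot yields the expected 30.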
To Reproduce
#[tokio::test]
async fn test_transaction_snapshot_summary() {
    let catalog = new_memory_catalog().await;
    let table = make_v3_minimal_table_in_catalog(&catalog).await;

    let mut file_seq = 0u32;
    let mut append_file = |table: &crate::table::Table, record_count: u64, file_size: u64| {
        file_seq += 1;
        let file = DataFileBuilder::default()
            .content(DataContentType::Data)
            .file_path(format!("test/{file_seq}.parquet"))
            .file_format(DataFileFormat::Parquet)
            .file_size_in_bytes(file_size)
            .record_count(record_count)
            .partition(Struct::from_iter([Some(Literal::long(1))]))
            .partition_spec_id(0)
            .build()
            .unwrap();
        let tx = Transaction::new(table);
        tx.fast_append()
            .add_data_files(vec![file])
            .apply(tx)
            .unwrap()
    };

    let table = append_file(&table, /*record_count=*/ 10, /*file_size=*/ 100)
        .commit(&catalog)
        .await
        .unwrap();
    let table = append_file(&table, /*record_count=*/ 20, /*file_size=*/ 200)
        .commit(&catalog)
        .await
        .unwrap();

    let summary = &table
        .metadata()
        .current_snapshot()
        .unwrap()
        .summary()
        .additional_properties;
    assert_eq!(summary.get("total-records").unwrap(), "30");
    assert_eq!(summary.get("total-data-files").unwrap(), "2");
    assert_eq!(summary.get("total-files-size").unwrap(), "300");
}
This fails with:
---- transaction::test::test_transaction_snapshot_summary stdout ----
thread 'transaction::test::test_transaction_snapshot_summary' (87414) panicked at crates/iceberg/src/transaction/mod.rs:636:9:
assertion `left == right` failed
  left: "20"
 right: "30"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Expected behavior
No response
Willingness to contribute
I can contribute a fix for this bug independently