
feat: support incremental scan between 2 snapshots #13

Merged

Conversation

@xxchan xxchan changed the title . incremental scan Mar 2, 2025
@xxchan xxchan force-pushed the xxchan/incremental branch 2 times, most recently from d4d27e6 to a1c426e Compare March 2, 2025 15:21
@xxchan xxchan changed the base branch from archive_dev_rebase_main_20241230 to dev_rebase_main_20250307 March 13, 2025 06:25
@xxchan xxchan force-pushed the xxchan/incremental branch 3 times, most recently from 3636dc3 to a66db13 Compare March 13, 2025 06:57
@xxchan xxchan marked this pull request as ready for review March 13, 2025 07:00
@xxchan xxchan changed the title incremental scan feat: support incremental scan between 2 snapshots Mar 13, 2025
@xxchan xxchan force-pushed the xxchan/incremental branch 2 times, most recently from 8e6aebb to c8b4714 Compare March 16, 2025 13:23
.
Signed-off-by: xxchan <[email protected]>
@xxchan xxchan force-pushed the xxchan/incremental branch from c8b4714 to 3c0e53e Compare March 16, 2025 14:55
@xxchan xxchan requested review from chenzl25 and ZENOTME March 17, 2025 02:50
@chenzl25 chenzl25 requested a review from Li0k March 17, 2025 03:04
Comment on lines 1331 to 1342
if !snapshot_ids.contains(&manifest_file.added_snapshot_id) {
    continue;
}
let manifest = object_cache.get_manifest(manifest_file).await?;
let entries = manifest.entries().iter().filter(|entry| {
    matches!(entry.status(), ManifestStatus::Added)
        && (
            // Is it possible that the snapshot id here is not contained?
            entry.snapshot_id().is_none()
                || snapshot_ids.contains(&entry.snapshot_id().unwrap())
        )
});
Collaborator
@chenzl25 chenzl25 Mar 17, 2025

I think simply checking for manifest status Added is not enough, since compaction will also add a manifest entry with Added, but we don't want to consume a compaction snapshot. We should detect whether a snapshot is a compaction snapshot, and skip it if so. To detect a compaction snapshot, I think we can just check whether any manifest entry has ManifestStatus::Deleted.
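As a sketch of the check suggested here (using simplified stand-in types for illustration, not the crate's real ManifestEntry API):

```rust
// Hypothetical, simplified stand-ins for the crate's real types.
#[derive(PartialEq)]
enum ManifestStatus {
    Existing,
    Added,
    Deleted,
}

struct ManifestEntry {
    status: ManifestStatus,
}

/// Heuristic from the review comment: a compaction (rewrite) snapshot
/// deletes the files it rewrote, so the presence of any `Deleted`
/// manifest entry marks the snapshot as one to skip.
fn is_compaction_snapshot(entries: &[ManifestEntry]) -> bool {
    entries.iter().any(|e| e.status == ManifestStatus::Deleted)
}

fn main() {
    let append = [ManifestEntry { status: ManifestStatus::Added }];
    let compaction = [
        ManifestEntry { status: ManifestStatus::Added },
        ManifestEntry { status: ManifestStatus::Deleted },
    ];
    assert!(!is_compaction_snapshot(&append));
    assert!(is_compaction_snapshot(&compaction));
}
```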

Member Author

We have filtered Append snapshots above:

let append_snapshots =
    ancestors_between(table_metadata, latest_snapshot_id, oldest_snapshot_id)
        .filter(|snapshot| matches!(snapshot.summary().operation, Operation::Append));

The definition of snapshot operation is:
(so compaction snapshot should be Replace)

/// The operation field is used by some operations, like snapshot expiration, to skip processing certain snapshots.
pub enum Operation {
    /// Only data files were added and no files were removed.
    Append,
    /// Data and delete files were added and removed without changing table data;
    /// i.e., compaction, changing the data file format, or relocating data files.
    Replace,
    /// Data and delete files were added and removed in a logical overwrite operation.
    Overwrite,
    /// Data files were removed and their contents logically deleted and/or delete files were added to delete rows.
    Delete,
}

Collaborator
cc @Li0k

Member Author
@xxchan xxchan Mar 17, 2025

Found a problem: it seems delete files are also included in an Append snapshot. 🤔

Encountered an entry for a delete file in a data file manifest: 

ManifestEntry { status: Added, snapshot_id: Some(4746599406424318107), sequence_number: Some(4), file_sequence_number: Some(4), data_file: DataFile { content: EqualityDeletes, file_path: "s3://hummock001/iceberg_connection/public/t/data/11-00077-eq-del-0195a491-109c-75b1-a93c-e29d371c2d0b.parquet", file_format: Parquet, partition: Struct { fields: [], null_bitmap: BitVec<usize, bitvec::order::Lsb0> { addr: 0x8, head: 000000, bits: 0, capacity: 0 } [] }, record_count: 2, file_size_in_bytes: 680, column_sizes: {1: 82}, value_counts: {1: 2}, null_value_counts: {1: 0}, nan_value_counts: {}, lower_bounds: {1: Datum { type: Long, literal: Long(524235685507891200) }}, upper_bounds: {1: Datum { type: Long, literal: Long(524235873517572096) }}, key_metadata: None, split_offsets: [4], equality_ids: [1], sort_order_id: None } }

The Java code uses an ignoreDeleted filter to further narrow the manifest entries:
https://github.com/apache/iceberg/blob/6ec3de390d3fa6e797c6975b1eaaea41719db0fe/core/src/main/java/org/apache/iceberg/BaseIncrementalAppendScan.java#L86
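A rough Rust analogue of that entry-level filtering, with hypothetical simplified types rather than the crate's real ones, could drop delete-file entries by their content type:

```rust
// Hypothetical, simplified stand-ins for the crate's real types.
#[derive(PartialEq)]
enum DataContentType {
    Data,
    PositionDeletes,
    EqualityDeletes,
}

struct DataFile {
    content: DataContentType,
}

struct ManifestEntry {
    data_file: DataFile,
}

/// Rough analogue of the filtering referenced in Java's
/// BaseIncrementalAppendScan: keep only plain data files and drop
/// position/equality delete files from the incremental append scan.
fn data_files_only(entries: Vec<ManifestEntry>) -> Vec<ManifestEntry> {
    entries
        .into_iter()
        .filter(|e| e.data_file.content == DataContentType::Data)
        .collect()
}

fn main() {
    let entries = vec![
        ManifestEntry { data_file: DataFile { content: DataContentType::Data } },
        ManifestEntry { data_file: DataFile { content: DataContentType::EqualityDeletes } },
    ];
    // The equality-delete entry (like the one in the log above) is dropped.
    assert_eq!(data_files_only(entries).len(), 1);
}
```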

Collaborator

/// Only data files were added and no files were removed.
Append,

Indeed, by definition there should be no delete files in an append snapshot. However, RisingWave's iceberg sink includes delete files in the append snapshot. Shall we change the snapshot operation to Overwrite when the commit is not append-only? But once we change it to Overwrite, it means we can't read any incremental data between the snapshots. It seems we should add the added data files of Overwrite snapshots to the incremental read as well.
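A minimal sketch of the relaxed snapshot filter this suggests, reusing the Operation enum quoted earlier plus a hypothetical Snapshot stand-in (not the crate's real type):

```rust
// `Operation` as quoted in the thread; `Snapshot` is a simplified stand-in.
enum Operation {
    Append,
    Replace,
    Overwrite,
    Delete,
}

struct Snapshot {
    operation: Operation,
}

/// Keep snapshots whose added data files should feed the incremental
/// scan: plain appends, plus overwrites (whose delete files are ignored).
/// Replace (compaction) and Delete snapshots are still skipped.
fn is_incremental_source(snapshot: &Snapshot) -> bool {
    matches!(snapshot.operation, Operation::Append | Operation::Overwrite)
}

fn main() {
    assert!(is_incremental_source(&Snapshot { operation: Operation::Append }));
    assert!(is_incremental_source(&Snapshot { operation: Operation::Overwrite }));
    assert!(!is_incremental_source(&Snapshot { operation: Operation::Replace }));
    assert!(!is_incremental_source(&Snapshot { operation: Operation::Delete }));
}
```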

Member Author

Some facts I found:

(image attachment)

Collaborator

cc @ZENOTME If we commit delete files to Iceberg, it seems we should change the snapshot operation to Overwrite instead of Append.

Collaborator

Oh, it's used in SparkChangelogTable https://iceberg.apache.org/docs/latest/spark-procedures/?h=changes#carry-over-rows

I think we should handle Overwrite as well. Ignoring delete files is acceptable.

Member Author

+1. Kind of like "force-append-only".

Collaborator
@chenzl25 chenzl25 left a comment

LGTM!

@xxchan xxchan merged commit 4373389 into risingwavelabs:dev_rebase_main_20250307 Mar 18, 2025
11 checks passed
manifest_entry_data_ctx_tx
    .clone()
    .send(manifest_entry_context)
    .await


I was looking over the code, and I'm wondering if the logic could get stuck here when the channel's buffer size is smaller than the number of entries.
It seems like the channel could block once the buffer fills up, and since the consumers only start after all entries are sent, it might cause a hang.
I could be missing something, though; do you think this could be an issue, or is there a detail I'm not seeing? I'd really appreciate your thoughts!
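The PR uses an async channel, but the failure mode described here can be reproduced and avoided with a plain std bounded channel. This sketch (my own illustration, not the PR's code) shows why starting the consumer before the producer finishes sending prevents the hang:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Send 1..=n through a bounded channel of capacity `cap`, with the
/// consumer running *before* the producer finishes, and return the sum.
fn producer_consumer_sum(n: u32, cap: usize) -> u32 {
    let (tx, rx) = sync_channel::<u32>(cap);
    // Start the consumer first. If instead all sends completed before any
    // receive, a capacity smaller than `n` would make `send` block forever
    // once the buffer fills -- the hang described in the comment above.
    let consumer = thread::spawn(move || rx.iter().sum::<u32>());
    for i in 1..=n {
        tx.send(i).unwrap(); // blocks only briefly: the consumer drains
    }
    drop(tx); // close the channel so the consumer's iterator terminates
    consumer.join().unwrap()
}

fn main() {
    // Capacity 1 is smaller than the number of items, yet no deadlock,
    // because the consumer runs concurrently with the sends.
    assert_eq!(producer_consumer_sum(4, 1), 10);
}
```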

Member Author

Hi, curious how you found this PR 👀


I came across your PR while browsing through this repo, and it caught my attention because I'm looking for a way to implement incremental ingestion for an Iceberg table using Rust. Great work on it!

Member Author
@xxchan xxchan Mar 20, 2025

Thanks, I think your insight is correct. I've refined the implementation in #36.

xxchan added a commit that referenced this pull request Mar 21, 2025
* fixup! feat: support incremental scan between 2 snapshots (#13)

refactor: refactor incremental scan

* clippy

Signed-off-by: xxchan <[email protected]>

* add some check

Signed-off-by: xxchan <[email protected]>

---------

Signed-off-by: xxchan <[email protected]>