[Bug]: Sweep phase deletes live files when multiple content IDs share the same base location

### What happened

When using a time-based cutoff (`-c P3D`), the sweep phase can delete files that are still referenced by live snapshots. This happens when the same Iceberg table base location is associated with multiple content IDs in the Nessie commit history.

The sweep processes each content ID independently. When an older content ID and a newer content ID share the same S3 base location, the older content ID's sweep builds a bloom filter containing only the files of its own snapshots. It then lists all files in the shared base location and marks everything not in its bloom filter as expired, including files that belong to the newer (live) content ID's snapshots.

When `deferred-deletes` runs, it deletes the union of all deferred files, causing data loss.
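
To make the failure mode concrete, here is a minimal model of the behavior described above. It is only a sketch, not Nessie's code: plain Python sets stand in for the bloom filters, and the function name, content IDs, and file paths are all hypothetical.

```python
def sweep_per_content_id(live_files_by_content_id, files_in_location):
    """Simplified model: every content ID is swept on its own.

    live_files_by_content_id: hypothetical content ID -> set of live files
    files_in_location: listing of the shared base location
    """
    deferred = set()
    for content_id, live in live_files_by_content_id.items():
        # Anything in the shared location that this one content ID does not
        # reference is treated as expired, even if another content ID needs it.
        deferred |= files_in_location - live
    return deferred


# Two content IDs that point at the same base location (illustrative data).
live = {
    "cid-old": {"data/a.parquet"},
    "cid-new": {"data/b.parquet", "data/c.parquet"},
}
listing = {"data/a.parquet", "data/b.parquet", "data/c.parquet"}

deferred = sweep_per_content_id(live, listing)
print(sorted(deferred))
```

In this model the union of the per-content-ID results covers every file in the location, which is consistent with the observed outcome once `deferred-deletes` runs.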

### How to reproduce it


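1. Have Iceberg tables managed by Nessie with active Flink streaming jobs continuously committing snapshots, where the Nessie commit history associates more than one content ID with the same table base location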
2. Run `nessie-gc.jar mark-live` with `-c P3D` (3-day cutoff)
3. Run `nessie-gc.jar sweep --defer-deletes`
4. Run `nessie-gc.jar deferred-deletes`

### Nessie server type (docker/uber-jar/built from source) and version

nessie-gc 0.107.5

### Client type (Ex: UI/Spark/pynessie ...) and version

_No response_

### Impact

After `deferred-deletes` completed:
- All Iceberg tables became unreadable (`NoSuchKeyException` on data files and metadata)
- Flink streaming jobs failed with file-not-found errors
- Downstream query engines (Doris) returned S3 404 errors on all tables

### Expected behavior

The sweep phase should aggregate live files across ALL content IDs that share a base location before determining which files are orphaned. A file should only be considered expired if it is not referenced by ANY live content version for that base location.
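
The expected behavior could be sketched as follows. This is an illustrative model under the same simplifications (sets instead of bloom filters) and is not a proposed patch; `sweep_per_base_location`, the mappings, and the paths are hypothetical names.

```python
from collections import defaultdict


def sweep_per_base_location(live_files, base_location_of, listing_of):
    """Simplified model: aggregate live files per base location first.

    live_files: hypothetical content ID -> set of live files
    base_location_of: content ID -> base location
    listing_of: base location -> set of files found in storage
    """
    live_by_location = defaultdict(set)
    for content_id, files in live_files.items():
        # Union the live files of every content ID sharing this location.
        live_by_location[base_location_of[content_id]] |= files
    # A file is expired only if no live content version references it.
    return {
        location: listing_of[location] - live
        for location, live in live_by_location.items()
    }


base = "s3://bucket/tbl"
live_files = {
    "cid-old": {f"{base}/data/a.parquet"},
    "cid-new": {f"{base}/data/b.parquet", f"{base}/data/c.parquet"},
}
base_location_of = {"cid-old": base, "cid-new": base}
listing_of = {
    base: {
        f"{base}/data/a.parquet",
        f"{base}/data/b.parquet",
        f"{base}/data/c.parquet",
        f"{base}/data/orphan.parquet",
    }
}

expired = sweep_per_base_location(live_files, base_location_of, listing_of)
print(expired)
```

With the same two content IDs as inputs, only the file that no content ID references ends up in the expired set.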

### Workaround

Using `-c NONE` (default, no cutoff) avoids the issue because all commits are considered live, so all content versions' files are protected. However, this means old snapshots and their data files are never cleaned up, which defeats the purpose of running GC.

### Additional finding

The data loss only affected tables with **upsert (equality delete / copy-on-write) enabled**. Append-only tables were unaffected.

This is consistent with the root cause: upsert tables rewrite data files on updates, so old snapshots reference different files than new snapshots for the same base location. When the sweep processes an old content ID, its bloom filter does not contain the rewritten files from newer snapshots, so it marks them as expired.

Append-only tables are safe because each snapshot only adds files — older snapshots reference a subset of the current files, so no content ID's sweep will mark current files as expired.

### Environment

- nessie-gc 0.107.5
- Iceberg tables with S3FileIO
- Flink streaming jobs continuously writing to tables
- PostgreSQL as JDBC backend for GC live-content-sets
- Tested on AWS S3

[Bug]: Sweep phase deletes live files when multiple content IDs share the same base location #12363