Conversation

@Youngwb Youngwb commented Dec 27, 2025

Why I'm doing:

#66944

What I'm doing:

This pull request introduces support for delete operations on Iceberg tables by adding a new IcebergDeleteSink class, its associated serialization logic, and comprehensive unit tests. The main focus is on enabling the planner to write position delete files to Iceberg tables, ensuring correct tuple validation and integration with the existing data sink infrastructure.

Iceberg Delete Sink Implementation:

  • Added a new IcebergDeleteSink class in fe/fe-core/src/main/java/com/starrocks/planner/IcebergDeleteSink.java to support delete operations for Iceberg tables, including tuple validation, configuration handling, and Thrift serialization.
  • Updated the TDataSinkType enum in gensrc/thrift/DataSinks.thrift to include the new ICEBERG_DELETE_SINK type for proper Thrift serialization and planner integration.

Testing and Validation:

  • Added a comprehensive test suite in fe/fe-core/src/test/java/com/starrocks/planner/IcebergDeleteSinkTest.java to verify tuple validation, Thrift serialization, and explain string output for the new sink.
    Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
    • This pr needs auto generate documentation
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 4.0
    • 3.5
    • 3.4
    • 3.3

Note

Iceberg delete sink

  • Adds IcebergDeleteSink (fe/fe-core/.../IcebergDeleteSink.java) to write Iceberg position delete files; validates tuple has _file (VARCHAR) and _pos (BIGINT), sets locations, compression, target file size, cloud config; provides explain output and Thrift serialization via TDataSinkType.ICEBERG_DELETE_SINK into TIcebergTableSink (uses data_location, file_format=parquet).

Thrift and tests

  • Extends gensrc/thrift/DataSinks.thrift with ICEBERG_DELETE_SINK.
  • Adds unit tests (IcebergDeleteSinkTest) covering tuple validation errors, Thrift serialization fields, and getExplainString().
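The tuple validation described above can be sketched as follows. This is a minimal, self-contained illustration of the check the Note describes (a position delete tuple must carry a VARCHAR `_file` column and a BIGINT `_pos` column); the `Slot`/`Type` stand-ins are simplified placeholders, not the actual StarRocks planner classes.

```java
import java.util.List;

public class PositionDeleteTupleCheck {
    // Iceberg metadata column names used by position delete files.
    static final String FILE_PATH = "_file";    // path of the data file
    static final String ROW_POSITION = "_pos";  // row ordinal inside that file

    enum Type { VARCHAR, BIGINT, INT }

    // Simplified stand-in for a planner SlotDescriptor.
    record Slot(String name, Type type) {}

    // Validate that the output tuple carries the two columns a position
    // delete file requires, with the expected types.
    static void validate(List<Slot> slots) {
        boolean hasFile = false;
        boolean hasPos = false;
        for (Slot slot : slots) {
            if (FILE_PATH.equals(slot.name())) {
                if (slot.type() != Type.VARCHAR) {
                    throw new IllegalStateException("_file column must be VARCHAR");
                }
                hasFile = true;
            } else if (ROW_POSITION.equals(slot.name())) {
                if (slot.type() != Type.BIGINT) {
                    throw new IllegalStateException("_pos column must be BIGINT");
                }
                hasPos = true;
            }
        }
        if (!hasFile || !hasPos) {
            throw new IllegalStateException("delete tuple must contain _file and _pos");
        }
    }

    public static void main(String[] args) {
        validate(List.of(new Slot("_file", Type.VARCHAR), new Slot("_pos", Type.BIGINT)));
        System.out.println("valid tuple accepted");
    }
}
```

A tuple missing either column, or carrying it with the wrong type, is rejected before the sink is serialized to Thrift.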

Written by Cursor Bugbot for commit f63c4c7. This will update automatically on new commits.

@wanpengfei-git wanpengfei-git requested a review from a team December 27, 2025 09:59
@Youngwb Youngwb changed the title [Feature] Add IcebergDeleteSink to support delete operations on Iceberg tables [Enhancement] Add IcebergDeleteSink to support delete operations on Iceberg tables Dec 27, 2025
@sonarqubecloud

Quality Gate failed

Failed conditions
5.9% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

@github-actions

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

@github-actions

[FE Incremental Coverage Report]

pass : 54 / 59 (91.53%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
| --- | --- | --- | --- | --- |
| 🔵 com/starrocks/planner/IcebergDeleteSink.java | 54 | 59 | 91.53% | [95, 100, 146, 151, 156] |

@github-actions

[BE Incremental Coverage Report]

pass : 0 / 0 (0%)

@alvin-celerdata

@cursor review

```java
tIcebergTableSink.setCompression_type(compression);
tIcebergTableSink.setTarget_max_file_size(targetMaxFileSize);
com.starrocks.thrift.TCloudConfiguration tCloudConfiguration = new com.starrocks.thrift.TCloudConfiguration();
cloudConfiguration.toThrift(tCloudConfiguration);
```

NullPointerException when toThrift() called before init()

The cloudConfiguration field is only initialized in the init() method, not in the constructor. However, toThrift() uses cloudConfiguration.toThrift(tCloudConfiguration) without any null check. If toThrift() is called before init(), this will throw a NullPointerException. This is inconsistent with the similar IcebergTableSink class, which initializes cloudConfiguration directly in its constructor as a final field, making it safe to call toThrift() immediately after construction.
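The initialization-order hazard can be reduced to a small sketch. The class and field names below are simplified stand-ins for the real StarRocks types, contrasting the fragile init()-assigned field with the constructor-initialized final field used by `IcebergTableSink`.

```java
public class InitOrderDemo {
    static class CloudConfiguration {
        void toThrift(StringBuilder out) { out.append("cloud-config"); }
    }

    // Fragile variant: the field is assigned only in init(), mirroring the
    // pattern flagged in the review comment.
    static class FragileSink {
        private CloudConfiguration cloudConfiguration; // null until init()

        void init() { cloudConfiguration = new CloudConfiguration(); }

        String toThrift() {
            StringBuilder out = new StringBuilder();
            cloudConfiguration.toThrift(out); // NPE if init() was never called
            return out.toString();
        }
    }

    // Safer variant: a final field initialized in the constructor, so
    // toThrift() is safe immediately after construction.
    static class SafeSink {
        private final CloudConfiguration cloudConfiguration = new CloudConfiguration();

        String toThrift() {
            StringBuilder out = new StringBuilder();
            cloudConfiguration.toThrift(out);
            return out.toString();
        }
    }

    public static void main(String[] args) {
        try {
            new FragileSink().toThrift();
        } catch (NullPointerException e) {
            System.out.println("fragile: NPE before init()");
        }
        System.out.println("safe: " + new SafeSink().toThrift());
    }
}
```

Making the field final and constructor-initialized removes the implicit "must call init() first" contract from the class's API.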


```java
if (IcebergTable.FILE_PATH.equals(colName)) {
    hasFilePathColumn = true;
    if (!slot.getType().equals(VarcharType.VARCHAR)) {
        throw new StarRocksConnectorException("_file column must be type of VARCHAR");
```

Type validation too strict, rejects valid VARCHAR lengths

The type validation uses equals() to compare types, but ScalarType.equals() for VARCHAR types also compares the length field. Since VarcharType.VARCHAR has len=-1 (wildcard), any column with a specific length like VARCHAR(255) would fail validation with the error "_file column must be type of VARCHAR" even though it is a valid VARCHAR. The codebase provides isVarchar() and matchesType() methods for flexible type checking that ignore length differences. The same issue applies to the BIGINT check, though BIGINT doesn't have length variants so it's less likely to manifest there.
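The failure mode can be shown with a reduced model. The `Varchar` record below is a simplified stand-in for a parameterized scalar type (where `len == -1` denotes the wildcard `VARCHAR`), not the actual StarRocks `ScalarType`; it only illustrates why strict `equals()` rejects columns that a length-insensitive check would accept.

```java
public class VarcharCheckDemo {
    // len == -1 models the wildcard VARCHAR with no declared length.
    record Varchar(int len) {
        static final Varchar WILDCARD = new Varchar(-1);

        // Length-insensitive check, analogous to isVarchar()/matchesType():
        // any Varchar instance qualifies regardless of its length.
        boolean isVarchar() { return true; }
    }

    public static void main(String[] args) {
        Varchar column = new Varchar(255); // e.g. a VARCHAR(255) column

        // Strict comparison: record equals() compares len, so 255 != -1
        // fails even though the column is a perfectly valid VARCHAR.
        System.out.println("equals: " + column.equals(Varchar.WILDCARD));

        // Length-insensitive check accepts it.
        System.out.println("isVarchar: " + column.isVarchar());
    }
}
```

Comparing with a length-insensitive predicate instead of `equals()` keeps the validation intent ("this must be a VARCHAR") without tying it to one specific declared length.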


```java
boolean hasPosColumn = false;

for (SlotDescriptor slot : desc.getSlots()) {
    if (slot.getColumn() != null) {
```

Suggested change

```diff
-if (slot.getColumn() != null) {
+if (slot.getColumn() == null) {
+    continue;
+}
```

This guard clause reduces the nesting depth, which makes the code easier to read.

Comment on lines +96 to +98
```java
    }
} else if (IcebergTable.ROW_POSITION.equals(colName)) {
    hasPosColumn = true;
```

Suggested change

```diff
-    }
-} else if (IcebergTable.ROW_POSITION.equals(colName)) {
-    hasPosColumn = true;
+    }
+    continue;
+}
+if (IcebergTable.ROW_POSITION.equals(colName)) {
+    hasPosColumn = true;
```
