@Youngwb
Contributor

@Youngwb Youngwb commented Dec 26, 2025

Why I'm doing:

#66944

What I'm doing:

This pull request adds support for DELETE operations on Iceberg tables in StarRocks. It introduces a new planning and execution path for Iceberg DELETEs, including a dedicated data sink, planner logic, and analyzer logic. The changes ensure that DELETE statements on Iceberg tables are properly transformed, validated, and executed, with appropriate support for partitioned tables and required metadata columns.

Key changes include:

Iceberg DELETE support in planner and execution:

  • Added a new IcebergDeleteSink class to handle delete operations for Iceberg tables, validating required columns and configuring the data sink for position deletes. (IcebergDeleteSink.java)
  • Updated DeletePlanner to detect Iceberg tables and set up the new sink, including logic to create the necessary tuple descriptor and handle physical property requirements (e.g., shuffle on partition columns).
  • Extended the supported operations for IcebergTable to include DELETE.

Iceberg DELETE support in analyzer:

  • Added logic in DeleteAnalyzer to rewrite Iceberg DELETE statements into a SELECT on the table's metadata columns (_file, _pos) plus its partition columns, and to enforce restrictions: a WHERE clause is required, and partition, USING, and CTE clauses are rejected.
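
As a rough illustration of the rewrite described above, the sketch below builds the SELECT text from a table name, its partition columns, and the WHERE predicate. It is not the DeleteAnalyzer code: the class name, the method name, and the string-based representation are assumptions made for this example, and the real analyzer presumably works on statement and expression objects rather than SQL strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of StarRocks.
final class IcebergDeleteRewriteSketch {
    private IcebergDeleteRewriteSketch() {}

    // Builds "SELECT _file, _pos, <partition cols> FROM <table> WHERE <predicate>".
    static String rewrite(String table, List<String> partitionColumns, String wherePredicate) {
        // The PR description states that a WHERE clause is required.
        if (wherePredicate == null || wherePredicate.trim().isEmpty()) {
            throw new IllegalArgumentException("DELETE on an Iceberg table requires a WHERE clause");
        }
        List<String> selectList = new ArrayList<>();
        selectList.add("_file");              // path of the data file
        selectList.add("_pos");               // row position within the file
        selectList.addAll(partitionColumns);  // empty for unpartitioned tables
        return "SELECT " + String.join(", ", selectList)
                + " FROM " + table
                + " WHERE " + wherePredicate;
    }
}
```

Under these assumptions, a statement such as `DELETE FROM db.t WHERE id = 1` on a table partitioned by `dt` would map to a SELECT of `_file`, `_pos`, and `dt` with the same predicate.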

Test infrastructure updates:

  • Updated test mocks to include the required metadata columns (_file, _pos) in Iceberg table schemas for unit testing.

These changes collectively enable StarRocks to plan and execute DELETE statements on Iceberg tables using position deletes, with proper validation and partition-aware planning.

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
    • This pr needs auto generate documentation
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 4.0
    • 3.5
    • 3.4
    • 3.3

Note

Enables DELETE on Iceberg tables using position deletes with planner, analyzer, and sink support.

  • New IcebergDeleteSink writes Parquet position delete files; validates _file (VARCHAR) and _pos (BIGINT); serializes via Thrift (ICEBERG_DELETE_SINK, TIcebergTableSink with data_location)
  • DeleteAnalyzer rewrites Iceberg DELETE to SELECT _file, _pos, [partition cols] FROM table WHERE ...; enforces required WHERE and disallows partitions/USING/CTEs
  • DeletePlanner detects IcebergTable, builds shuffle property on partition columns, materializes required columns, and sets the new sink
  • IcebergTable now advertises DELETE in getSupportedOperations
  • Tests: new unit tests for analyzer, planner, and sink; Iceberg mocks extended with hidden _file/_pos; minor test adjustments
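
The column check mentioned in the first bullet can be pictured with the minimal sketch below. It assumes a simplified schema represented as a map from column name to type name; the class, method, and exception choices are hypothetical and do not reflect the actual IcebergDeleteSink, which validates against StarRocks tuple and slot descriptors.

```java
import java.util.Map;

// Hypothetical stand-in for the sink's validation step; not StarRocks code.
final class PositionDeleteColumnCheckSketch {
    static final String FILE_COLUMN = "_file";
    static final String POS_COLUMN = "_pos";

    private PositionDeleteColumnCheckSketch() {}

    // Throws if either required metadata column is missing or has an unexpected type.
    static void validate(Map<String, String> columnTypes) {
        requireType(columnTypes, FILE_COLUMN, "VARCHAR");
        requireType(columnTypes, POS_COLUMN, "BIGINT");
    }

    private static void requireType(Map<String, String> columnTypes, String name, String expected) {
        String actual = columnTypes.get(name);
        if (actual == null) {
            throw new IllegalStateException("missing required column " + name);
        }
        if (!expected.equalsIgnoreCase(actual)) {
            throw new IllegalStateException(
                    "column " + name + " must be " + expected + " but was " + actual);
        }
    }
}
```

Failing fast here keeps a malformed plan from reaching the writer, where a wrong `_pos` type would otherwise surface only when the position delete file is produced.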

Written by Cursor Bugbot for commit 7224bac.

@Youngwb Youngwb requested review from a team as code owners December 26, 2025 06:21
@wanpengfei-git wanpengfei-git requested a review from a team December 26, 2025 06:21
@github-actions

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

@github-actions

[FE Incremental Coverage Report]

pass : 127 / 134 (94.78%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
| --- | --- | --- | --- | --- |
| 🔵 com/starrocks/planner/IcebergDeleteSink.java | 54 | 59 | 91.53% | [95, 100, 146, 151, 156] |
| 🔵 com/starrocks/sql/DeletePlanner.java | 42 | 44 | 95.45% | [143, 290] |
| 🔵 com/starrocks/sql/analyzer/DeleteAnalyzer.java | 31 | 31 | 100.00% | [] |

@github-actions

[BE Incremental Coverage Report]

pass : 0 / 0 (0%)

@alvin-celerdata
Contributor

@cursor review

execPlan.getFragments().get(0).setPipelineDop(1);
}
}

Wrong sink flag set for Iceberg delete operations

The configurePipelineSink method unconditionally calls sinkFragment.setHasOlapTableSink() for all table types, including Iceberg tables. For Iceberg delete operations, this should call setHasIcebergTableSink() instead. The InsertPlanner correctly distinguishes between table types and sets the appropriate flag (setHasIcebergTableSink() for Iceberg tables). This mismatch could cause incorrect behavior in downstream code that checks which sink type is present using hasIcebergTableSink() vs hasOlapTableSink().
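
A minimal sketch of the suggested fix follows. The `Fragment` class and `TableKind` enum are simplified stand-ins invented for this example, not the StarRocks plan fragment or table classes; only the flag method names are taken from the comment above.

```java
// Simplified, hypothetical model of the flag selection; not StarRocks code.
final class SinkFlagSketch {
    enum TableKind { OLAP, ICEBERG }

    // Stand-in for a plan fragment that records which sink type it carries.
    static final class Fragment {
        private boolean olapTableSink;
        private boolean icebergTableSink;

        void setHasOlapTableSink() { olapTableSink = true; }
        void setHasIcebergTableSink() { icebergTableSink = true; }
        boolean hasOlapTableSink() { return olapTableSink; }
        boolean hasIcebergTableSink() { return icebergTableSink; }
    }

    private SinkFlagSketch() {}

    // Chooses the flag from the target table kind instead of always marking an OLAP sink.
    static void configurePipelineSink(Fragment sinkFragment, TableKind kind) {
        if (kind == TableKind.ICEBERG) {
            sinkFragment.setHasIcebergTableSink();
        } else {
            sinkFragment.setHasOlapTableSink();
        }
    }
}
```

With this branching, downstream code that checks `hasIcebergTableSink()` sees the Iceberg delete fragment, and `hasOlapTableSink()` stays false for it.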


* - _file (STRING): Path of the data file
* - _pos (BIGINT): Row position within the file
*/
public class IcebergDeleteSink extends DataSink {
Contributor

Actually, you could split DeleteSink into a separate PR and then follow it with a delete plan PR.
