@kokila-19

What changes were proposed in this pull request?

Implemented natural ordering (WRITE ORDERED BY) support in the ALTER TABLE DDL for Iceberg tables.

Why are the changes needed?

Existing tables can be converted to naturally ordered tables with an ALTER TABLE command. Note that only data inserted after the ALTER statement is ordered; existing data is not rewritten.

Does this PR introduce any user-facing change?

Yes, a new syntax is supported.

SYNTAX:
ALTER TABLE table_name SET WRITE ORDERED BY column_name sort_direction NULLS FIRST/LAST, ...

EXAMPLE:
ALTER TABLE table_order SET WRITE ORDERED BY id desc nulls first, name asc nulls last;

How was this patch tested?

qtest

sonarqubecloud bot commented Jan 3, 2026

@kokila-19 kokila-19 marked this pull request as ready for review January 4, 2026 03:03
* @return List of SortFieldDesc, or null if parsing fails or JSON is empty
*/
protected List<SortFieldDesc> parseSortFieldsJSON(String sortOrderJSONString) {
if (Strings.isNullOrEmpty(sortOrderJSONString) || isZOrderJSON(sortOrderJSONString)) {

try {
SortFields sortFields = JSON_OBJECT_MAPPER.reader().readValue(sortOrderJSONString, SortFields.class);
if (sortFields != null && !sortFields.getSortFields().isEmpty()) {

why do we need .isEmpty() check?


public class HiveIcebergMetaHook extends BaseHiveIcebergMetaHook {
private static final Logger LOG = LoggerFactory.getLogger(HiveIcebergMetaHook.class);
private static final ObjectMapper JSON_OBJECT_MAPPER = new ObjectMapper();

why do we need this? BaseHiveIcebergMetaHook already defines it

* SortOrder JSON and keep it in DEFAULT_SORT_ORDER for Iceberg to use it.
*/
- private void setSortOrder(org.apache.hadoop.hive.metastore.api.Table hmsTable, Schema schema,
+ protected void setSortOrder(org.apache.hadoop.hive.metastore.api.Table hmsTable, Schema schema,

why protected?


// Check if we are setting regular sort order as it needs conversion from Hive JSON to Iceberg SortOrder
if (propertiesToSet.contains(TableProperties.DEFAULT_SORT_ORDER)) {
// If the HMS table has Hive SortFields JSON in default-sort-order

Maybe extract this logic into a separate method. Also consider making propertiesToSet a Set, since it is queried with contains.
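
A minimal sketch of the Set suggestion, not the PR's code: `PropertyKeySetSketch` and `parseKeys` are hypothetical names, and the comma-separated input format is an assumption. Collecting the keys into a `LinkedHashSet` keeps their order while making `contains` a constant-time lookup instead of a list scan.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the Hive code base.
public class PropertyKeySetSketch {

  // Splits a comma-separated key list into an insertion-ordered Set,
  // trimming whitespace and dropping empty entries and duplicates.
  static Set<String> parseKeys(String csv) {
    Set<String> keys = new LinkedHashSet<>();
    for (String part : csv.split(",")) {
      String trimmed = part.trim();
      if (!trimmed.isEmpty()) {
        keys.add(trimmed);
      }
    }
    return keys;
  }
}
```

With this shape, a check such as `keys.contains("default-sort-order")` no longer depends on the number of properties being set.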

/**
* Converts Hive SortDirection to Iceberg SortDirection.
*/
protected static org.apache.iceberg.SortDirection convertSortDirection(SortFieldDesc.SortDirection direction) {

protected?

}

// Set other properties excluding default-sort-order which is already processed)
propertiesToSet.stream()

@deniskuzZ deniskuzZ Jan 5, 2026

why add filter here and not handle DEFAULT_SORT_ORDER during propertiesToSet iteration?

Map<String, Consumer<String>> handlers = new HashMap<>();
handlers.put(
    TableProperties.DEFAULT_SORT_ORDER,
    k -> handleDefaultSortOrder(hmsTable, hmsTableParameters)
);

splitter.splitToList(contextProperties.get(SET_PROPERTIES)).forEach(key ->
    handlers.getOrDefault(
        key,
        k -> update.set(k, hmsTableParameters.get(k))
    ).accept(key)
);
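
The snippet above relies on names from the surrounding class (`splitter`, `update`, `hmsTable`, `contextProperties`). As a self-contained sketch of the same dispatch idea, assuming plain maps in place of the Iceberg and HMS objects: `PropertyDispatchSketch`, `applyProperties`, the `"converted:"` prefix, and the literal key string are all hypothetical stand-ins, not the PR's implementation.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical, simplified model of per-key handler dispatch.
public class PropertyDispatchSketch {

  // Placeholder for TableProperties.DEFAULT_SORT_ORDER; the literal is an assumption.
  static final String DEFAULT_SORT_ORDER = "default-sort-order";

  // Applies each key once: keys with a registered handler get special
  // treatment, every other key is copied through unchanged.
  static Map<String, String> applyProperties(List<String> keys, Map<String, String> hmsParams) {
    Map<String, String> applied = new LinkedHashMap<>();

    Map<String, Consumer<String>> handlers = new HashMap<>();
    // Stand-in for the Hive JSON to Iceberg SortOrder conversion.
    handlers.put(DEFAULT_SORT_ORDER, k -> applied.put(k, "converted:" + hmsParams.get(k)));

    // Default behaviour for keys without a dedicated handler.
    Consumer<String> passThrough = k -> applied.put(k, hmsParams.get(k));

    keys.forEach(k -> handlers.getOrDefault(k, passThrough).accept(k));
    return applied;
  }
}
```

The point of the pattern is that the special case lives in one map entry, so the iteration needs neither a preceding `contains` check nor a later filter.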

- // Regular ORDERED BY - to be implemented in future commit
- throw new SemanticException("Regular ORDERED BY is not yet supported. Only ZORDER is supported.");
+ // Handle regular ORDERED BY
+ handleRegularOrder(tableName, partitionSpec, orderNode);

@deniskuzZ deniskuzZ Jan 5, 2026

SortFields sortFields = new SortFields(sortFieldDescList);
String sortOrderJson;
try {
sortOrderJson = JSON_OBJECT_MAPPER.writeValueAsString(sortFields);

Don't we already have code that handles natural order during create? Can we reuse it?
