[KYUUBI #7335] DynamicPartitionDataSingleWriter needs sort before write #7414

Closed
maomaodev wants to merge 1 commit into apache:master from maomaodev:kyuubi-7335

Conversation

@maomaodev
Contributor

Why are the changes needed?

Fixes #7335. DynamicPartitionDataSingleWriter requires the records it writes to be sorted on the partition and/or bucket column(s) before writing. See FileFormatDataWriter.scala.
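
To illustrate why the ordering matters, here is a minimal toy model in plain Scala. It is not Spark's implementation: `SingleWriterModel` and `SortBeforeWriteDemo` are hypothetical names, and the model only assumes that a single-writer keeps one partition open at a time and switches whenever the partition value of the incoming row changes.

```scala
import scala.collection.mutable

// Toy stand-in for a single-writer (hypothetical, not Spark code):
// it keeps exactly one partition "file" open and starts a new one
// whenever the partition value of the incoming row changes.
final class SingleWriterModel {
  private var current: Option[String] = None
  private val opened = mutable.ArrayBuffer.empty[String]

  def write(partition: String): Unit = {
    if (!current.contains(partition)) {
      // Partition changed: the previous file is considered closed, a new one opens.
      current = Some(partition)
      opened += partition
    }
  }

  // Number of times any partition file was opened.
  def filesOpened: Int = opened.size

  // Partitions opened more than once, i.e. their rows did not arrive contiguously.
  def reopened: Set[String] =
    opened.groupBy(identity).collect { case (p, hits) if hits.size > 1 => p }.toSet
}

object SortBeforeWriteDemo {
  // Feed the partition value of each row to a fresh writer model.
  def run(rows: Seq[String]): SingleWriterModel = {
    val writer = new SingleWriterModel
    rows.foreach(writer.write)
    writer
  }

  def main(args: Array[String]): Unit = {
    val rows = Seq("dt=2", "dt=1", "dt=2", "dt=1")
    val unsorted = run(rows)
    val sorted = run(rows.sorted)
    println(s"unsorted: opened=${unsorted.filesOpened}, reopened=${unsorted.reopened.size}")
    println(s"sorted:   opened=${sorted.filesOpened}, reopened=${sorted.reopened.size}")
  }
}
```

With the unsorted input the model opens four files and reopens both partitions; with the sorted input it opens two and reopens none. The toy only counts reopenings and does not model what Spark does to the underlying files in that case.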

How was this patch tested?

Unit tests (UT).

Was this patch authored or co-authored using generative AI tooling?

NO

@pan3793
Member

pan3793 commented Apr 21, 2026

what about the bucket table?

@maomaodev
Contributor Author

maomaodev commented Apr 22, 2026

> what about the bucket table?

KSHC specifies new WriteJobDescription(..., bucketSpec = None, ...) in HiveWrite, so it currently doesn't recognize bucket tables or bucket columns. This also looks like a bug. Do we need to add support for bucket tables in this issue?


override def description(): String = "Kyuubi-Hive-Connector"

override def requiredDistribution(): Distribution = Distributions.unspecified()
Member

In the future, we can extend this to automatically insert a rebalance before writing, to eliminate small files.
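
For readers unfamiliar with the hook being discussed: Spark's DataSource V2 API lets a write declare distribution and ordering requirements through `RequiresDistributionAndOrdering`. The following is a minimal sketch and not the code from this patch. The class name `PartitionSortedWrite`, its constructor parameter, and the column names are hypothetical, and it assumes Spark 3.2 or later on the classpath.

```scala
import org.apache.spark.sql.connector.distributions.{Distribution, Distributions}
import org.apache.spark.sql.connector.expressions.{Expressions, SortDirection, SortOrder}
import org.apache.spark.sql.connector.write.RequiresDistributionAndOrdering

// Hypothetical write that asks Spark to sort each task's rows by the
// partition columns, without requesting any particular distribution.
class PartitionSortedWrite(partitionColumns: Seq[String])
  extends RequiresDistributionAndOrdering {

  override def description(): String = "partition-sorted-write-sketch"

  // No shuffle is requested; only a local sort within each task.
  override def requiredDistribution(): Distribution = Distributions.unspecified()

  // One ascending sort order per partition column, in declaration order.
  override def requiredOrdering(): Array[SortOrder] =
    partitionColumns
      .map(c => Expressions.sort(Expressions.column(c), SortDirection.ASCENDING))
      .toArray
}
```

Under these assumptions, `new PartitionSortedWrite(Seq("dt", "hour")).requiredOrdering()` yields two sort orders. A future rebalance could presumably be expressed by returning something other than `Distributions.unspecified()` from `requiredDistribution()`, for example `Distributions.clustered(...)`; that is speculation here, not something the thread confirms.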

pan3793 added this to the v1.12.0 milestone Apr 22, 2026
@pan3793
Member

pan3793 commented Apr 22, 2026

Let's handle bucket tables separately. Thanks, merging to master.

pan3793 closed this in 100446c Apr 22, 2026

Development

Successfully merging this pull request may close these issues.

hive-connector two issues

2 participants