@dontknow9179 dontknow9179 commented Dec 26, 2025

Why I'm doing:

When StarRocks writes to a Hive TextFile table through a Hive catalog and the table uses ^[ (the ESC character, \u001b) as the field delimiter, a later query in StarRocks returns all fields merged into a single field, which causes data parsing errors.
The cause is that StarRocks does not set the serdeInfo parameters when it creates a new partition, so the delimiters in TextFileFormatDesc are null at read time.

// use hive
CREATE TABLE IF NOT EXISTS insert_text_file_test(
data_dt string, 
filed1 string,
filed2 string
)
PARTITIONED BY (
dt string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\^['
STORED AS textfile;

//use starrocks
insert overwrite hive.db.insert_text_file_test partition(dt = '20251222')
select '20251222', 'field1', 'field2' 
union all
select '20251222', 'field1', null;

//result is wrong
select * from hive.db.insert_text_file_test;
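
The read-side effect can be illustrated with a minimal, self-contained sketch. It is not StarRocks code: `SerdeDelimiterSketch` and `parseRow` are hypothetical names, and the sketch assumes the reader falls back to Hive's default field delimiter (\u0001) when the partition carries no `field.delim` SerDe parameter.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class SerdeDelimiterSketch {
    // Hive's default field delimiter for delimited text (Ctrl-A).
    static final String DEFAULT_FIELD_DELIM = "\u0001";

    // Hypothetical reader: takes the delimiter from the partition's SerDe
    // parameters and falls back to the default when the key is absent.
    static List<String> parseRow(String row, Map<String, String> serdeParams) {
        String delim = serdeParams.getOrDefault("field.delim", DEFAULT_FIELD_DELIM);
        // Pattern.quote keeps the delimiter literal; -1 keeps trailing empty fields.
        return Arrays.asList(row.split(Pattern.quote(delim), -1));
    }

    public static void main(String[] args) {
        // A row as written to the file, with ESC between the fields.
        String row = String.join("\u001b", "20251222", "field1", "field2");

        // Partition created without SerDe parameters (the bug described above).
        Map<String, String> missing = new HashMap<>();

        // Partition that carries the table's delimiter.
        Map<String, String> propagated = new HashMap<>();
        propagated.put("field.delim", "\u001b");

        System.out.println(parseRow(row, missing).size());    // prints 1
        System.out.println(parseRow(row, propagated).size()); // prints 3
    }
}
```

With the parameter missing, the whole line comes back as one field, which matches the symptom reported above.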

What I'm doing:

Fixes #issue
Call setSerDeParameters() in buildHivePartition() so that StarRocks can get SerDeParameters during reading.

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 4.0
    • 3.5
    • 3.4
    • 3.3

Note

Fixes missing SerDe parameters on newly created Hive partitions, preserving delimiters for TextFile reads.

  • Pass table SerDe props via .setSerDeParameters(table.getSerdeProperties()) in HiveCommitter.buildHivePartition and HiveMetadata.addPartition
  • Extend HivePartition to carry serDeParameters (constructor, getter, builder)
  • Populate SerDeInfo parameters in HiveMetastoreApiConverter when converting partitions
  • Add/adjust tests: validate SerDe propagation in partition build and add-partition flows; update metastore tests to include setSerDeParameters
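
The propagation described in the list above can be sketched roughly as follows. This is a simplified stand-in, not the actual StarRocks classes: `TableSketch`, `PartitionSketch`, and `buildPartition` are hypothetical, and only the method names `setSerDeParameters` and `getSerdeProperties` are taken from this PR.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class PartitionSerdeSketch {
    // Hypothetical stand-in for a Hive table that exposes its SerDe properties.
    static final class TableSketch {
        private final Map<String, String> serdeProperties;

        TableSketch(Map<String, String> serdeProperties) {
            this.serdeProperties = new HashMap<>(serdeProperties);
        }

        Map<String, String> getSerdeProperties() {
            return Collections.unmodifiableMap(serdeProperties);
        }
    }

    // Hypothetical stand-in for a partition that carries SerDe parameters.
    static final class PartitionSketch {
        private final String name;
        private final Map<String, String> serDeParameters;

        private PartitionSketch(Builder builder) {
            this.name = builder.name;
            this.serDeParameters = Collections.unmodifiableMap(builder.serDeParameters);
        }

        String getName() {
            return name;
        }

        Map<String, String> getSerDeParameters() {
            return serDeParameters;
        }

        static final class Builder {
            private String name;
            private Map<String, String> serDeParameters = new HashMap<>();

            Builder setName(String name) {
                this.name = name;
                return this;
            }

            Builder setSerDeParameters(Map<String, String> params) {
                // Defensive copy so later table changes do not leak into the partition.
                this.serDeParameters = new HashMap<>(params);
                return this;
            }

            PartitionSketch build() {
                return new PartitionSketch(this);
            }
        }
    }

    // Copies the table-level SerDe properties onto the newly built partition.
    static PartitionSketch buildPartition(TableSketch table, String partitionName) {
        return new PartitionSketch.Builder()
                .setName(partitionName)
                .setSerDeParameters(table.getSerdeProperties())
                .build();
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("field.delim", "\u001b");
        PartitionSketch partition = buildPartition(new TableSketch(props), "dt=20251222");
        System.out.println("\u001b".equals(partition.getSerDeParameters().get("field.delim"))); // prints true
    }
}
```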

Written by Cursor Bugbot for commit 1f1885e.

dontknow9179 and others added 2 commits December 26, 2025 14:41
…tarRocks#67199)

## Why I'm doing:
When StarRocks writes to a Hive TextFile table through a Hive catalog and the table uses \u001b as the field delimiter, a later query in StarRocks returns all fields merged into a single field, which causes data parsing errors.
The cause is that StarRocks does not set the serdeInfo parameters when it creates a new partition, so the delimiters in TextFileFormatDesc are null at read time.
```
// use hive
CREATE TABLE IF NOT EXISTS insert_text_file_test(
data_dt string,
filed1 string,
filed2 string
)
PARTITIONED BY (
dt string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u001B'
STORED AS textfile;

//use starrocks
insert overwrite hive.db.insert_text_file_test partition(dt = '20251222')
select '20251222', 'field1', 'field2'
union all
select '20251222', 'field1', null;

//result is wrong
select * from hive.db.insert_text_file_test;
```
## What I'm doing:

Fixes #issue
Call setSerDeParameters() in buildHivePartition() so that StarRocks can get SerDeParameters during reading.

Signed-off-by: dontknow9179 <[email protected]>
(cherry picked from commit 7e78e7f)

# Conflicts:
#	fe/fe-core/src/test/java/com/starrocks/connector/hive/HiveMetadataTest.java
Signed-off-by: dontknow9179 <[email protected]>
mergify bot commented Dec 26, 2025

🧪 CI Insights

Here's what we observed from your CI run for d1d3d53.

🟢 All jobs passed!

But CI Insights is watching 👀

Signed-off-by: dontknow9179 <[email protected]>
@dirtysalt dirtysalt merged commit ae7b46e into StarRocks:branch-3.4 Dec 26, 2025
29 checks passed