[Improve][Connector-V2][HBase] Support DATE/TIME/TIMESTAMP/DECIMAL in sink and fix DECIMAL deserialization #10291
+400
−11
Purpose of this pull request
During hive2hbase synchronization, if the upstream table contains DECIMAL, DATE, TIME, or TIMESTAMP fields, the HBase sink fails at write time with HbaseConnectorException (COMMON-07 UNSUPPORTED_DATA_TYPE). A short-term workaround is to add a transform/cast step (e.g., cast DECIMAL/DATE/TIMESTAMP to STRING or BIGINT).
That workaround does not address the underlying inconsistency. The read side (HBaseDeserializationFormat) already attempts to support DATE/TIME/TIMESTAMP/DECIMAL (and its previous DECIMAL-as-float behavior was itself unreasonable), while the write side does not support these types at all, which leaves read and write asymmetric. This PR therefore completes sink-side support and unifies the read/write encoding rules, so that values written by the sink can be read back by the source with the same semantics and common hive2hbase scenarios work out of the box.
Does this PR introduce any user-facing change?
Adds DECIMAL/DATE/TIME/TIMESTAMP/BYTES support to HBase sink serialization with consistent encoding rules (string bytes for precision/safety).
Fixes DECIMAL deserialization to parse BigDecimal from string first, with a backward-compatible float fallback for legacy data.
Adds unit and e2e coverage for these types.
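The encoding rules listed above can be illustrated with a minimal sketch. It is not the connector's code: the class `HbaseTypeCodecSketch` and its methods are hypothetical names, it uses only the JDK, and it assumes that "string bytes" means the UTF-8 bytes of the value's plain or ISO-8601 text form and that legacy DECIMAL cells hold a 4-byte big-endian float. The PR's actual formats and fallback conditions may differ.

```java
import java.math.BigDecimal;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

// Hypothetical illustration only; not the SeaTunnel HBase connector's classes.
final class HbaseTypeCodecSketch {

    // Write side: encode supported values as UTF-8 string bytes (assumed format).
    static byte[] encode(Object value) {
        if (value instanceof BigDecimal) {
            // toPlainString keeps the scale and avoids scientific notation.
            return utf8(((BigDecimal) value).toPlainString());
        }
        if (value instanceof LocalDate
                || value instanceof LocalTime
                || value instanceof LocalDateTime) {
            // java.time toString() produces ISO-8601 text.
            return utf8(value.toString());
        }
        if (value instanceof byte[]) {
            // BYTES pass through unchanged.
            return (byte[]) value;
        }
        throw new IllegalArgumentException("Unsupported type: " + value.getClass());
    }

    // Read side: parse the cell as a decimal string first, then fall back
    // to a 4-byte float for cells written in the assumed legacy layout.
    static BigDecimal decodeDecimal(byte[] cell) {
        try {
            return new BigDecimal(new String(cell, StandardCharsets.UTF_8).trim());
        } catch (NumberFormatException e) {
            if (cell.length == Float.BYTES) {
                return BigDecimal.valueOf(ByteBuffer.wrap(cell).getFloat());
            }
            throw e;
        }
    }

    private static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

Trying the string parse first means new data round-trips without precision loss, while the fallback only triggers when the bytes are not a valid decimal string. One caveat of any such scheme is that a 4-byte legacy cell whose bytes happen to form a numeric string would be read as that string.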
How was this patch tested?
Unit: HbaseSinkWriterTypeConvertTest.java
E2E: fake-to-hbase-with-date-time-decimal.conf + hbase-to-assert-with-date-time-decimal.conf and IT method testHbaseSinkWithDateTimeDecimal
Check list
New License Guide
incompatible-changes.md to describe the incompatibility caused by this PR.