
[Bug] [connector-jdbc] [connector-iceberg] All NTZ timestamp columns are stored as LTZ in iceberg table #10685

@doyong365

Description

Search before asking

  • I had searched in the issues and found no similar issues.

What happened

NTZ timestamp columns from source databases are stored as LTZ in the target without any timezone conversion. SeaTunnel's JDBC connectors do not distinguish NTZ from LTZ: both are mapped to the same internal type (SqlType.TIMESTAMP / LOCAL_DATE_TIME_TYPE) and unconditionally written as LTZ to the target (e.g. Iceberg TimestampType.withZone()). On read, rs.getTimestamp().toLocalDateTime() applies the JVM default timezone; on write, Timestamp.valueOf(LocalDateTime).toInstant() in the Spark layer applies it in reverse. The two offsets cancel out, so the wall-clock value ends up stored directly as UTC regardless of the source timezone.

Affected source types per database:

| Source DB | Type | NTZ/LTZ | Result |
|-----------|------|---------|--------|
| MySQL | DATETIME | NTZ | ⛔ Stored as LTZ |
| MySQL | TIMESTAMP | LTZ | ✅ Correct |
| MariaDB | DATETIME | NTZ | ⛔ Stored as LTZ |
| PostgreSQL | timestamp | NTZ | ⛔ Stored as LTZ |
| PostgreSQL | timestamptz | LTZ | ✅ Correct |
| Oracle | DATE | NTZ | ⛔ Stored as LTZ |
| Oracle | TIMESTAMP | NTZ | ⛔ Stored as LTZ |
| MSSQL | DATETIME | NTZ | ⛔ Stored as LTZ |
| DB2 | TIMESTAMP | NTZ | ⛔ Stored as LTZ |

Verified with the following test cases (Spark engine, JVM=KST+9):

| Source TZ | Source type | Source value | Result |
|-----------|-------------|--------------|--------|
| KST (+9) | TIMESTAMP_NTZ | 2026-01-01 00:00:00 | 2026-01-01 00:00:00 UTC ❌ |
| KST (+9) | TIMESTAMP_LTZ | 2026-01-01 00:00:00 KST | 2025-12-31 15:00:00 UTC ✅ |
| UTC | TIMESTAMP_NTZ | 2026-01-01 00:00:00 | 2026-01-01 00:00:00 UTC ➖ |
| CST (-6) | TIMESTAMP_NTZ | 2026-01-01 00:00:00 | 2026-01-01 00:00:00 UTC ❌ |
| CST (-6) | TIMESTAMP_LTZ | 2026-01-01 00:00:00 CST | 2026-01-01 06:00:00 UTC ✅ |

The NTZ case at UTC timezone appears correct only by coincidence (zero offset), not because of correct handling.
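For comparison, the correct timezone-aware conversions shown in the table above can be reproduced with plain java.time. This is a standalone sketch, independent of SeaTunnel, yielding the same UTC instants as the TIMESTAMP_LTZ rows:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class ExpectedConversion {
    public static void main(String[] args) {
        LocalDateTime wallClock = LocalDateTime.parse("2026-01-01T00:00:00");

        // LTZ semantics: interpret the wall clock in the source zone,
        // then store the resulting UTC instant.
        Instant fromKst = wallClock.atZone(ZoneId.of("Asia/Seoul")).toInstant();
        Instant fromCst = wallClock.atZone(ZoneId.of("America/Chicago")).toInstant();

        System.out.println(fromKst); // 2025-12-31T15:00:00Z
        System.out.println(fromCst); // 2026-01-01T06:00:00Z
    }
}
```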

Root cause

The core issue is that SeaTunnel's internal type system has no way to distinguish NTZ from LTZ timestamp types. All DB-specific TypeConverters collapse both into the same internal type (SqlType.TIMESTAMP / LOCAL_DATE_TIME_TYPE), and the entire pipeline unconditionally writes every timestamp as LTZ to the target.

1. TypeConverters — NTZ and LTZ both mapped to LOCAL_DATE_TIME_TYPE
All DB-specific TypeConverters map NTZ and LTZ timestamp types to the identical SeaTunnel internal type, making it structurally impossible to distinguish them downstream.

  • MySqlTypeConverter.java convert(BasicTypeDefine): DATETIME (NTZ) and TIMESTAMP (LTZ) both map to LOCAL_DATE_TIME_TYPE
  • PostgresTypeConverter.java convert(BasicTypeDefine): timestamp (NTZ) and timestamptz (LTZ) both map to LOCAL_DATE_TIME_TYPE
  • OracleTypeConverter.java convert(BasicTypeDefine): DATE, TIMESTAMP (NTZ) and TIMESTAMP WITH TIME ZONE, TIMESTAMP WITH LOCAL TIME ZONE (LTZ) all map to LOCAL_DATE_TIME_TYPE
  • SqlServerTypeConverter.java convert(BasicTypeDefine): DATETIME, DATETIME2 (NTZ) and DATETIMEOFFSET (LTZ) all map to LOCAL_DATE_TIME_TYPE
  • GenericTypeConverter.java convert(BasicTypeDefine): Types.TIMESTAMP (covers both NTZ and LTZ as reported by the JDBC driver) maps to LOCAL_DATE_TIME_TYPE
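A minimal sketch of the direction proposed in this issue, assuming a new SqlType.TIMESTAMP_TZ constant (which does not exist in 2.3.10); the converter logic below is illustrative only, showing how the MySQL mapping could keep the distinction:

```java
public class NtzAwareMysqlMapping {
    // TIMESTAMP_TZ is the proposed addition; SeaTunnel 2.3.10 only has TIMESTAMP.
    enum SqlType { TIMESTAMP, TIMESTAMP_TZ }

    // Hypothetical MySQL TypeConverter that preserves NTZ vs LTZ instead of
    // collapsing both column types into LOCAL_DATE_TIME_TYPE.
    static SqlType convert(String mysqlType) {
        switch (mysqlType) {
            case "DATETIME":  return SqlType.TIMESTAMP;    // NTZ: plain wall clock
            case "TIMESTAMP": return SqlType.TIMESTAMP_TZ; // LTZ: UTC-normalized
            default: throw new IllegalArgumentException("unhandled: " + mysqlType);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("DATETIME"));  // TIMESTAMP
        System.out.println(convert("TIMESTAMP")); // TIMESTAMP_TZ
    }
}
```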

2. AbstractJdbcRowConverter.java toInternal()
NTZ and LTZ columns are read through the same code path. rs.getTimestamp() applies the JVM default timezone, which is correct for LTZ but corrupts NTZ wall-clock values. There is no separate TIMESTAMP_TZ case.
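The zone dependence of this read path can be reproduced without a database. Assuming the driver has already produced some epoch instant for the column (the hard-coded Instant below is a stand-in), toLocalDateTime() yields a different wall clock for each JVM default timezone:

```java
import java.sql.Timestamp;
import java.time.Instant;
import java.time.LocalDateTime;
import java.util.TimeZone;

public class ReadPathZoneDependence {
    public static void main(String[] args) {
        // Stand-in for the Timestamp a JDBC driver returns for an NTZ column
        Instant epoch = Instant.parse("2026-01-01T00:00:00Z");

        TimeZone.setDefault(TimeZone.getTimeZone("Asia/Seoul"));
        LocalDateTime inKst = Timestamp.from(epoch).toLocalDateTime();

        TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
        LocalDateTime inUtc = Timestamp.from(epoch).toLocalDateTime();

        // Same column value, two different wall clocks depending on the JVM zone
        System.out.println(inKst); // 2026-01-01T09:00
        System.out.println(inUtc); // 2026-01-01T00:00
    }
}
```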

3. AbstractJdbcRowConverter.java setValueToStatementByDataType()
On write, Timestamp.valueOf(LocalDateTime) re-applies the JVM default timezone, shifting the epoch value regardless of whether the original source type was NTZ or LTZ.
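The write-path shift can also be shown in isolation: the same LocalDateTime wall clock maps to different epoch instants depending on the JVM default timezone, matching the KST and CST rows of the verification table:

```java
import java.sql.Timestamp;
import java.time.Instant;
import java.time.LocalDateTime;
import java.util.TimeZone;

public class WritePathShift {
    public static void main(String[] args) {
        LocalDateTime wallClock = LocalDateTime.parse("2026-01-01T00:00");

        TimeZone.setDefault(TimeZone.getTimeZone("Asia/Seoul"));   // KST, +9
        Instant viaKst = Timestamp.valueOf(wallClock).toInstant();

        TimeZone.setDefault(TimeZone.getTimeZone("America/Chicago")); // CST, -6
        Instant viaCst = Timestamp.valueOf(wallClock).toInstant();

        // Timestamp.valueOf(LocalDateTime) re-applies the JVM zone on write
        System.out.println(viaKst); // 2025-12-31T15:00:00Z
        System.out.println(viaCst); // 2026-01-01T06:00:00Z
    }
}
```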

4. InternalRowConverter.java convert() / reconvert()
In the Spark engine translation layer, toLocalDateTime() in AbstractJdbcRowConverter first maps the NTZ value to a LocalDateTime while applying the JVM timezone offset (e.g. epoch 2026-01-01T00:00:00Z → KST +9 → 2026-01-01T09:00:00), and Timestamp.valueOf() in InternalRowConverter then reverses it (KST -9 → epoch 2026-01-01T00:00:00Z). The two JVM timezone offsets cancel each other out, so the NTZ wall-clock numeric value is preserved as-is in UTC: a value that represents 2026-01-01 00:00:00 KST is incorrectly stored as 2026-01-01 00:00:00 UTC instead of the correct 2025-12-31 15:00:00 UTC.
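The cancellation itself can be reproduced standalone (a sketch with a hard-coded Instant standing in for whatever epoch the driver produced for the NTZ column):

```java
import java.sql.Timestamp;
import java.time.Instant;
import java.time.LocalDateTime;
import java.util.TimeZone;

public class OffsetCancellation {
    public static void main(String[] args) {
        TimeZone.setDefault(TimeZone.getTimeZone("Asia/Seoul")); // JVM = KST (+9)

        // Stand-in for the epoch the driver produced for the NTZ column
        Instant fromDriver = Instant.parse("2026-01-01T00:00:00Z");

        // AbstractJdbcRowConverter side: +9h shift into a LocalDateTime
        LocalDateTime ldt = Timestamp.from(fromDriver).toLocalDateTime();

        // InternalRowConverter (Spark layer) side: -9h shift back to an Instant
        Instant stored = Timestamp.valueOf(ldt).toInstant();

        System.out.println(ldt);                       // 2026-01-01T09:00
        System.out.println(stored.equals(fromDriver)); // true: offsets cancel
    }
}
```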

5. TypeConverterUtils.java convert(SeaTunnelDataType)
SeaTunnel's TIMESTAMP (NTZ) is mapped to Spark's TimestampType which is always UTC-based (LTZ). There is no NTZ-aware handling anywhere in the codebase.

6. IcebergTypeMapper.java toIcebergType()
SeaTunnel's TIMESTAMP is unconditionally mapped to Types.TimestampType.withZone() (LTZ). withoutZone() (NTZ) is never called.

Expected behavior

  • NTZ source columns should be stored as NTZ in the target (e.g. Iceberg TimestampType.withoutZone())
  • LTZ source columns should be stored as LTZ in the target (e.g. Iceberg TimestampType.withZone())
  • The distinction should be preserved end-to-end through the SeaTunnel type system using SqlType.TIMESTAMP (NTZ) vs SqlType.TIMESTAMP_TZ (LTZ)

Actual behavior

All NTZ timestamp columns are stored as LTZ in the target. The wall-clock value is written directly as UTC without timezone conversion, because SeaTunnel has no mechanism to carry NTZ/LTZ distinction through its internal type system.

SeaTunnel Version

2.3.10

SeaTunnel Config

env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  Jdbc {
    url = "jdbc:mysql://localhost:3306/testdb"
    driver = "com.mysql.cj.jdbc.Driver"
    user = "root"
    password = "password"
    query = "SELECT datetime_col, timestamp_col FROM test_table"
  }
}

sink {
  Iceberg {
    catalog_name = "local"
    namespace = "default"
    table = "test_iceberg"
  }
}

Running Command

./bin/start-seatunnel-spark-3-connector-v2.sh --config config/test.conf

Error Exception

No explicit exception is thrown. The job completes successfully but NTZ timestamp values are stored with incorrect timezone semantics in the target.

Zeta or Flink or Spark Version

zeta 2.3.10
flink 1.17.2
spark 3.3.3

Java or Scala Version

Java 11

Screenshots

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
