Bug: Serialize.writeInt64() truncates Long to Int causing data corruption #98

@jeremysong

Description

The Serialize.writeInt64() method in com.clickhouse.utils.Serialize truncates Long values to 32-bit integers before writing them as 64-bit values. As a result, any Int64 value outside the 32-bit signed range, for example anything larger than Integer.MAX_VALUE (2,147,483,647), is corrupted when inserted using the RowBinary format.

Steps to reproduce

  1. Create a ClickHouse table with a Nullable(Int64) column
  2. Use Serialize.writeInt64() to write a value larger than 2,147,483,647 (e.g., the millisecond timestamp 1704067200000L)
  3. Insert the data using RowBinary format
  4. Query the data back
  5. The value is corrupted (e.g., 1704067200000 becomes -1034816512)

Error Log or Exception StackTrace

No exception thrown - data is silently corrupted.
Expected: 1704067200000
Actual: -1034816512

Expected Behaviour

Serialize.writeInt64() should correctly write the full 64-bit Long value without truncation. Values like 1704067200000L should be stored and retrieved correctly.

Code Example

// Bug in Serialize.java, lines 208-212
public static void writeInt64(OutputStream out, Long value, boolean defaultsSupport, 
                             boolean isNullable, ClickHouseDataType dataType, 
                             boolean hasDefault, String column) throws IOException {
   if (writeValuePreamble(out, defaultsSupport, isNullable, dataType, hasDefault, column, value)) {
       BinaryStreamUtils.writeInt64(out, convertToInteger(value));  // BUG: Should be convertToLong(value)
   }
}

// convertToInteger() truncates Long to Int
public static Integer convertToInteger(Object value) {
   if (value instanceof Number) {
       return ((Number) value).intValue();  // Loses upper 32 bits!
   }
   ...
}

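Root Cause: Line 210 calls convertToInteger(value), which narrows the 64-bit Long to a 32-bit Integer using .intValue() and discards the upper 32 bits. The truncated value is then sign-extended when it is written as 64 bits. Values that already fit in 32 bits pass through unchanged, which is why the bug only shows up with large values.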

Example:

  • Input: 1704067200000L (0x0000018C_C251F400)
  • After convertToInteger(): -1034816512 (0xC251F400 as signed 32-bit)
  • After sign-extension to 64-bit: 0xFFFFFFFF_C251F400 (-1034816512 as signed 64-bit)
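The arithmetic above can be checked without the connector at all. The following is a minimal sketch that only mimics the narrowing described in this issue; the class and method names (TruncationDemo, narrowThenWiden) are made up for illustration and are not part of the library.

```java
public class TruncationDemo {
    // Hypothetical helper: reproduces the Long -> int -> long path described above.
    public static long narrowThenWiden(long value) {
        // Same narrowing as Number.intValue(): keeps only the low 32 bits.
        int truncated = Long.valueOf(value).intValue();
        // Implicit int -> long widening sign-extends the 32-bit result.
        return truncated;
    }

    public static void main(String[] args) {
        long input = 1704067200000L;
        long written = narrowThenWiden(input);
        // Prints the decimal value and the full 64-bit pattern for both.
        System.out.printf("input:   %d (0x%016X)%n", input, input);
        System.out.printf("written: %d (0x%016X)%n", written, written);
    }
}
```

Running it should show the input as 0x0000018CC251F400 and the written value as -1034816512 (0xFFFFFFFFC251F400), matching the Expected/Actual pair reported above.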

Fix: Change line 210 to use convertToLong(value) instead:

BinaryStreamUtils.writeInt64(out, convertToLong(value));
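A regression check for the fix could look roughly like the sketch below. It is an assumption-laden stand-in, not the connector's code: convertToLong() here is a hypothetical mirror of the convertToInteger() shown earlier (its real body is not included in this issue), and writeInt64LE() is a hypothetical replacement for BinaryStreamUtils.writeInt64 that assumes RowBinary's 8-byte little-endian Int64 encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class Int64RoundTrip {
    // Hypothetical stand-in for convertToLong(): keeps all 64 bits via longValue().
    public static Long convertToLong(Object value) {
        if (value instanceof Number) {
            return ((Number) value).longValue();
        }
        throw new IllegalArgumentException("Not a number: " + value);
    }

    // Hypothetical stand-in for the binary writer: 8 bytes, little-endian.
    public static void writeInt64LE(OutputStream out, long value) throws IOException {
        byte[] bytes = ByteBuffer.allocate(Long.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(value)
                .array();
        out.write(bytes, 0, bytes.length);
    }

    // Encodes the value, then decodes the same 8 bytes back into a long.
    public static long roundTrip(Object value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            writeInt64LE(out, convertToLong(value));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ByteBuffer.wrap(out.toByteArray()).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    public static void main(String[] args) {
        long[] samples = {1704067200000L, Integer.MAX_VALUE + 1L, Long.MAX_VALUE, Long.MIN_VALUE};
        for (long sample : samples) {
            long decoded = roundTrip(sample);
            System.out.println(sample + " -> " + decoded + (sample == decoded ? " ok" : " CORRUPTED"));
        }
    }
}
```

With the long-based conversion every sample should round-trip unchanged; swapping convertToLong() for an int-based conversion would make the same loop flag each of these samples, which makes it a cheap guard against the bug coming back.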

Configuration

Client Configuration

ClickHouseClientConfig config = new ClickHouseClientConfig(
   "http://127.0.0.1:8123",
   "default",
   "",
   "default",
   "test_table"
);
Client client = config.createClient();

Environment

  • Cloud
  • Connector version: flink-connector-clickhouse-base (latest)
  • Language version: Java 11
  • OS: macOS / Linux

ClickHouse Server

  • ClickHouse Server version: 24.x
  • ClickHouse Server non-default settings, if any: None
  • CREATE TABLE statements for tables involved:
CREATE TABLE test_int64 (
   id String,
   timestamp_value Nullable(Int64)
) ENGINE = MergeTree() ORDER BY id
  • Sample data: Any Int64 value outside the 32-bit signed range reproduces the issue, most commonly values > 2,147,483,647 (e.g., millisecond timestamps like 1704067200000)

Metadata

Labels

bug (Something isn't working)
