Skip to content

fix(column): return error instead of panic on Decimal/BigInt overflow#1857

Open
wucm667 wants to merge 5 commits into
ClickHouse:mainfrom
wucm667:fix/decimal-panic
Open

fix(column): return error instead of panic on Decimal/BigInt overflow#1857
wucm667 wants to merge 5 commits into
ClickHouse:mainfrom
wucm667:fix/decimal-panic

Conversation

@wucm667
Copy link
Copy Markdown
Contributor

@wucm667 wucm667 commented May 7, 2026

fix(column): return error instead of panic on Decimal/BigInt overflow

When inserting values that exceed the range of Decimal128/256 or Int128/UInt128/Int256/UInt256 columns, the driver previously panicked with math/big: buffer too small in bigIntToRaw(), killing the caller goroutine with no way to recover.

Decimal32/64 had a similar issue where overflow was silently truncated via IntPart() casts.

Changes

  • bigIntToRaw now uses deferred recover to catch FillBytes panics and returns a proper error
  • Added explicit signed range check: positive values exceeding 2^(n*8-1)-1 are caught before FillBytes (which treats values as unsigned and won't panic for these)
  • Decimal32/64 now check for overflow before silent IntPart() truncation
  • append() in both BigInt and Decimal now propagate errors through Append() and AppendRow()
  • Added unit tests for all four Decimal sizes and BigInt overflow / valid boundary cases

Before

// panic: math/big: buffer too small
col.AppendRow(bigIntOverflow)

After

err := col.AppendRow(bigIntOverflow)
// err: "value overflows 16-byte signed buffer"

Fixes #1849

When inserting values that exceed the range of Decimal128/256 or
Int128/UInt128/Int256/UInt256 columns, the driver previously panicked
with 'math/big: buffer too small' in bigIntToRaw(), killing the caller
goroutine with no way to recover.

Changes:
- bigIntToRaw now uses deferred recover to catch FillBytes panics and
  returns a proper error instead
- Added explicit signed range check: positive values exceeding
  2^(n*8-1)-1 are caught before FillBytes (which treats values as
  unsigned and won't panic for these)
- Decimal32/64 now check for overflow before silent IntPart() truncation
- append() in both BigInt and Decimal now propagate errors through
  Append() and AppendRow()
- Added unit tests for all four Decimal sizes and BigInt overflow/valid
  boundary cases

Fixes ClickHouse#1849

Signed-off-by: wucm667 <stevenwucongmin@gmail.com>
@github-actions
Copy link
Copy Markdown

Summary

The PR replaces a math/big panic with explicit error returns when inserting big-int / decimal values that overflow Int128/256, UInt128/256, and Decimal32/64/128/256 columns (fixes #1849). The approach is mostly sound: errors are threaded through Append / AppendRow, positive signed overflow is pre-checked via BitLen, Decimal32/64 no longer silently truncate via IntPart(), and a deferred recover catches any remaining FillBytes panic. However, the negative signed bounds check is missing, which leaves a wide range of values that silently produce wrong encodings rather than errors. Regression coverage also doesn't follow the project layout (tests/issues/).

Must fix

  • Negative signed overflow is silently corrupted, not caught. In lib/column/bigint.go the explicit overflow check only fires when v.Sign() >= 0. For negative values in (-2^n, -2^(n-1) - 1] (e.g. Int128 values from -2^128 + 1 down to -2^127 - 1), new(big.Int).Not(v).FillBytes(dest) succeeds — BitLen(Not(v)) == n exactly fits n bytes — so neither the explicit check nor the recover catches it. Concretely: appending -2^127 - 1 to an Int128 silently encodes as 2^127 - 1. Add a symmetric check on Not(v):

    if signed && v.Sign() < 0 {
        if new(big.Int).Not(v).BitLen() > len(dest)*8-1 {
            return fmt.Errorf(\"value overflows %d-byte signed buffer\", len(dest))
        }
    }

    This catches everything below -2^(n-1) cleanly (including -2^(n-1) - 1), while -2^(n-1) (the valid minimum) has Not(v).BitLen() == n-1 and passes.

  • Missing regression test in tests/issues/. Per .claude/CLAUDE.md, bug-fix regression tests live in tests/issues/issue_<N>_test.go and exercise a real ClickHouse instance via testcontainers. The PR adds only lib/column/decimal_overflow_test.go, which tests the column primitive in isolation and doesn't verify that Batch.Append / Batch.AppendStruct actually returns the error (rather than panicking) at the public API boundary. Add tests/issues/1849_test.go covering: native (clickhouse.Open) + database/sql (OpenDB), TCP + HTTP protocols, and at least one negative-overflow case (which currently silently corrupts — see above).

Should fix

  • Decimal.append and BigInt.append wrap errors inconsistently. BigInt.append returns a ColumnConverterError{Op: \"Append\", To: chType, From: \"big.Int\", Hint: err.Error()}, but Decimal.append returns a bare fmt.Errorf(\"value overflow: %s exceeds Decimal128 range\", v.String()). The driver elsewhere relies on ColumnConverterError for structured failure info. Also, ColumnConverterError's message template is \"converting X to Y is unsupported\" — accurate for type mismatches, but slightly misleading for an overflow on a supported conversion. Consider either using ColumnConverterError consistently in both, or introducing a dedicated overflow error and using it in both places.

  • recover() used for normal control flow. bigIntToRaw defers a bare recover() that converts any panic to \"value overflows %d-byte buffer\". If a future change panics for an unrelated reason inside the function, the user will get a confusing "overflows" error instead. Once the negative bounds check above is in place, every overflow case is caught by the explicit BitLen guards and the recover becomes unnecessary — prefer removing it. If kept defensively, at least type-assert the recovered value and only translate the expected \"math/big: buffer too small\" message.

  • Tests miss the silent-corruption window. TestBigIntValidValuesNoError covers ±2^127 (valid boundary) but no test asserts an error for the just-past-min case (e.g. v = -2^127 - 1). Add the negative-overflow case alongside the positive one — without it, the must-fix issue above will silently regress in the future.

  • Protocol / API coverage. The checklist requires test coverage on both native TCP and HTTP, and on both the clickhouse_native and std surfaces. The new tests are column-level unit tests only.

Nits

  • nit: lib/column/decimal_overflow_test.go uses decimal.NewFromFloat(21474836.48) / decimal.NewFromFloat(92233720368547758.08)float64 cannot represent those values exactly, so the test depends on rounding direction. Prefer decimal.NewFromString(\"21474836.48\") for boundary cases.
  • nit: lib/column/decimal.go Decimal32 branch — i64 := bi.Int64(); ... uint32(i64) is a redundant intermediate; uint32(bi.Int64()) is enough.
  • nit: error messages "value overflow: %s exceeds Decimal128 range" mix a colon with prose and use the ClickHouse type name unquoted. Per project style (CLAUDE.md), ClickHouse SQL type names in prose are wrapped in backticks; for error strings consider \"value %s overflows decimal128 range\" (lowercase, no colon, consistent with the rest of the package).
  • nit: lib/column/bigint.go:182 still wraps the value as new(big.Int).Set(v) before passing to bigIntToRaw. With the new implementation already using new(big.Int).Not(v) (no mutation), the defensive copy is unnecessary on the positive path.

Verdict

Request changes — the negative-overflow silent-corruption case (Must fix #1) needs to land before merge; the rest are smaller cleanups.

wucm667 and others added 3 commits May 11, 2026 21:05
- Add negative signed overflow check in bigIntToRaw: values in
  (-2^n, -2^(n-1)-1] no longer silently encode incorrectly
- Remove defer/recover since explicit BitLen guards now catch
  all overflow cases
- Remove unnecessary new(big.Int).Set(v) defensive copy
- Fix error message style: lowercase, no colon, consistent
  with project conventions
- Use NewFromString instead of NewFromFloat for exact boundary
  test values
- Add TestBigIntNegativeOverflowReturnsError
- Add regression test tests/issues/issue_1849_test.go

Signed-off-by: wucm667 <stevenwucongmin@gmail.com>
The regression test called batch.AppendRow which does not exist on the
Batch interface (it belongs to BatchColumn), causing a build failure
across all CI matrix jobs.

Rewrite the test to:
- Use batch.Append (the correct Batch method)
- Cover both native TCP and HTTP protocols
- Cover the database/sql surface via Begin/Prepare/Exec
- Test positive and negative overflow for both Decimal128 and Int128
@github-actions
Copy link
Copy Markdown

Summary

Replaces a panic (math/big: buffer too small from FillBytes) and silent truncation in the Decimal/BigInt column writers with explicit range checks. bigIntToRaw now validates the value fits the destination width before calling FillBytes, and Decimal32/Decimal64 paths verify the scaled coefficient before the lossy int64→uint32/uint64 cast. Errors propagate up through appendAppend/AppendRow. The approach is sound — preventing the panic with upfront bit-length checks is cleaner than defer recover(). There is also a latent-bug fix: the previous bigIntToRaw called v.Not(v).FillBytes(dest), mutating the caller's *big.Int; the workaround in BigInt.append (new(big.Int).Set(v)) is no longer needed because bigIntToRaw now uses new(big.Int).Not(v) internally.

Should fix

  • PR description is out of sync with the code. The description says bigIntToRaw "now uses deferred recover to catch FillBytes panics," but the implementation uses range checks instead (which is the better choice). Update the description so reviewers and the merged commit history don't claim behavior that isn't there.
  • Inconsistent error wrapping between BigInt and Decimal paths. BigInt.append wraps overflow errors in &ColumnConverterError{Op, To, From, Hint}; Decimal.append returns a bare fmt.Errorf("value %s overflows decimal128 range", ...). Callers that type-assert to *ColumnConverterError (a public, documented type) will get structured info from one path and a flat string from the other. Either wrap the Decimal errors in ColumnConverterError too, or %w-wrap the inner bigIntToRaw error so errors.Is/As works. Including the offending value (as the Decimal path does via v.String()) is a nice touch — consider doing the same for BigInt.
  • No test coverage for the unsigned (UInt128/UInt256) branches you added. bigIntToRaw grew two new code paths — "negative value not allowed for unsigned type" and the unsigned overflow at BitLen() > bits. Neither is exercised by decimal_overflow_test.go or issue_1849_test.go (only Int128 is tested). Please add cases for UInt128 negative input and 2^128 to lock in the behavior.
  • "negative value not allowed for unsigned type" is missing context. Compared to the sibling messages, this one doesn't say which type or width. Suggest fmt.Errorf("negative value %s not allowed for unsigned %d-byte type", v.String(), len(dest)) so the message is actionable on its own. Per CLAUDE.md, user-facing errors should be as actionable as possible.

Nits

  • nit: error messages reference decimal128/decimal256/decimal32/decimal64 in lowercase. The ClickHouse type names are `Decimal32`, `Decimal128`, etc. — consider matching the canonical casing in the error string.
  • nit: lib/column/decimal.go:255-265 — for Decimal32/Decimal64 you could route through bigIntToRaw with len(dest)==4/8 instead of an ad-hoc bi.IsInt64() && bi.Int64() ∈ [MinInt32, MaxInt32] check, removing the only reason to import math. Not required, just less code path divergence.
  • nit: tests/issues/issue_1849_test.go:33-40overflow1849Cases is a function that returns an anonymous-struct slice and takes *testing.T only to call t.Helper(). A package-level var with a named struct type would be lighter; the helper isn't doing helper-style work.
  • nit: lib/column/decimal_overflow_test.go constructs &BigInt{size: 16, chType: \"Int128\", signed: true, col: &proto.ColInt128{}} by hand. The constructor in lib/column/column.go (Type.Column) already builds these — using it would catch field-name drift if BigInt's struct ever changes.

Verdict

Request changes — small set of fixes (out-of-sync PR description, inconsistent error wrapping across Decimal/BigInt, missing unsigned test coverage). Core approach is correct and worth merging once the items above are addressed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Panic on overflow in Decimal128/256 and BigInt append (+ silent truncation for Decimal32/64)

3 participants