Commit 5d37ef3
schema: adopt Decimal and BigDecimal common types across sources and converters (#4358)
* schema: adopt Decimal and BigDecimal common types across CDC sources and converters
Threads benthos's new Decimal and BigDecimal common-schema types end-to-end:
- Five CDC sources (postgres, mysql, mssqlserver, oracledb, mongodb) now emit
Decimal(p, s) when precision and scale are declared and BigDecimal when
they are not, with values normalised to canonical decimal strings via a
new internal/sqlutil canonicaliser.
- Four format converters (iceberg, parquet, avro, json schema) honour
Decimal natively. BigDecimal is rejected by the bounded-format encoders
with an actionable error and accepted by JSON Schema as a permissive
string-with-pattern.
- ecs_avro detects logicalType: decimal in Avro specs and the
schema_registry_decode store_schema_metadata path normalises decoded
big.Rat values to canonical strings.
- Shared Parquet decimal-byte helpers extracted into
internal/impl/parquet/parquetdecimal so the parquet encoder and the
iceberg shredder no longer carry duplicate implementations.
The adoption is wired through a temporary go.mod replace directive
pointing at the local benthos checkout while a tagged release is
prepared; that directive is the one remaining follow-up before merge.
* oracledb, mssqlserver: fix decimal value-shape regressions surfaced by integration tests
Fixes two integration-test failures pinned down during PR-readiness verification:
1. Oracle bare NUMBER columns (no declared precision and scale) were
routed through the Decimal canonicaliser because go-ora's
*sql.ColumnType.DecimalSize() reports (precision=38, scale=255, ok=true)
for them — 255 is the driver's "any-scale" sentinel. The snapshot
mapper treated that as a real (p, s) and called Decimal(38, 255),
producing "decimal value has 255 significant digits" errors. The
oracleNumberToCommon schema mapping had the same hole. Both now treat
scale > precision as undeclared and fall back to BigDecimal so the
schema cache and the value mapper agree, leaving the source lossless.
2. MSSQL CDC streaming scanned DECIMAL/NUMERIC columns into *any, which
go-mssqldb coerced to a lossy float64. The streaming iterator now
pre-allocates *sql.NullString scan targets for DECIMAL/NUMERIC and
MONEY/SMALLMONEY so the driver hands back the lossless text
representation. The stream-snapshot code path in replication/snapshot.go
was also still wrapping DECIMAL/NUMERIC values in json.Number from the
pre-Decimal era; it now routes through sqlutil.CanonicaliseDecimal /
CanonicaliseBigDecimal in line with the regular snapshot and streaming
paths.
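The scan-target change in item 2 can be pictured with a minimal sketch. It assumes the column type names come from something like `(*sql.ColumnType).DatabaseTypeName()`. The helpers `needsTextScan` and `scanTargets` are hypothetical names for this example and are not taken from the repository.

```go
package main

import (
	"database/sql"
	"fmt"
	"strings"
)

// needsTextScan reports whether a column should be scanned as text so the
// driver returns the exact decimal representation instead of a float64.
// Hypothetical helper; the type list mirrors the commit message.
func needsTextScan(dbTypeName string) bool {
	switch strings.ToUpper(dbTypeName) {
	case "DECIMAL", "NUMERIC", "MONEY", "SMALLMONEY":
		return true
	}
	return false
}

// scanTargets pre-allocates one destination per column for rows.Scan:
// *sql.NullString for exact-numeric columns, *any for everything else.
func scanTargets(dbTypeNames []string) []any {
	targets := make([]any, len(dbTypeNames))
	for i, name := range dbTypeNames {
		if needsTextScan(name) {
			targets[i] = new(sql.NullString)
		} else {
			targets[i] = new(any)
		}
	}
	return targets
}

func main() {
	for _, t := range scanTargets([]string{"INT", "DECIMAL", "MONEY"}) {
		fmt.Printf("%T\n", t)
	}
}
```

With a live connection, the returned slice would be passed as `rows.Scan(targets...)`, and the `sql.NullString` values then forwarded to the canonicaliser.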
Improves the snapshot mapper's error message in oracledb to include the
column name and input text so future driver quirks are easier to spot,
and updates the streaming-block fixture for NOLEADINGZERO_COL in the
oracledb all-types integration test (previously asserted as a float64,
now a canonical BigDecimal string).
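The Oracle sentinel handling described in item 1 above comes down to one decision on the values reported by `DecimalSize()`. A minimal sketch follows, assuming the rule is exactly "scale greater than precision means undeclared" as the message states. The function `commonTypeFor` is a hypothetical name, and its string return value stands in for whatever schema type the real mapper produces.

```go
package main

import "fmt"

// commonTypeFor is a hypothetical stand-in for the schema mapping decision.
// It takes the values reported by (*sql.ColumnType).DecimalSize() and
// returns a description of the common type to use.
func commonTypeFor(precision, scale int64, ok bool) string {
	// A scale larger than the precision (for example go-ora's 255 sentinel
	// for a bare NUMBER) is treated as "no declared precision and scale".
	if !ok || scale > precision {
		return "BigDecimal"
	}
	return fmt.Sprintf("Decimal(%d, %d)", precision, scale)
}

func main() {
	fmt.Println(commonTypeFor(38, 255, true)) // bare NUMBER as reported by the driver
	fmt.Println(commonTypeFor(10, 2, true))   // NUMBER(10, 2)
}
```

Using the same predicate in both the schema mapping and the value mapper is what keeps the two in agreement.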
* chore: clear task lint and task test gates
Three small fixes to bring the repo to a clean lint/test state:
- internal/impl/confluent/ecs_avro.go: drop the unused ecsAvroFromBytes
wrapper (callers were migrated to ecsAvroParseFromBytes during the
decoder normalisation work) and replace a perfsprint-flagged
fmt.Errorf("missing") with errors.New.
- internal/impl/postgresql/pglogicalstream/schema_test.go: simplify the
redundant "((1 << 16) | 0) + 4" atttypmod fixture to "(1 << 16) + 4"
per staticcheck's SA4016.
- internal/impl/tigerbeetle/integration_test.go: switch the docker
container types import from "github.com/docker/docker/api/types/container"
to "github.com/moby/moby/api/types/container" to match the signature
testcontainers-go now expects (used elsewhere in this repo). The
docker module relocated these types upstream, so the old import was a
typecheck failure that predates this change and that golangci-lint surfaces.
* Pin main for benthos (temporary)
* review: address PR feedback on ecs_avro scale default and decimal canonicaliser rounding
Three review fixes:
- internal/impl/confluent/ecs_avro.go: per the Avro spec scale is
optional in the decimal logical type and defaults to 0 when absent.
The reverse-direction reader was treating a missing scale as an error;
it now returns Decimal(precision, 0) for those fields. (joseph.woodward)
- internal/sqlutil/decimal.go: replace the big.Float fallback with a
big.Rat parse and an exact "fits at the declared scale" check.
Previously an input like "1.56789" against a NUMBER(10, 2) column
silently rounded to "1.57" because the big.Float path added 0.5 and
truncated; rationals represent decimals exactly, so the check is now
a real precision-loss test. Inputs that lose precision at the
declared scale return an error. Scientific notation, leading +, and
fewer-than-scale fractional digits continue to canonicalise as
before. (joseph.woodward)
- License URLs in internal/sqlutil/decimal.go and decimal_test.go:
drop the erroneous "/v4" segment to match the rest of the RCL
headers in the repo. (claude[bot])
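The Avro scale default from the first review fix can be sketched as follows. This is an illustration only: `avroDecimal` and `decimalParams` are hypothetical names, and the real reader in ecs_avro.go works on parsed Avro schemas rather than raw JSON.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// avroDecimal captures the attributes of an Avro decimal logical type.
// Pointers distinguish "absent" from an explicit zero.
type avroDecimal struct {
	LogicalType string `json:"logicalType"`
	Precision   *int   `json:"precision"`
	Scale       *int   `json:"scale"`
}

// decimalParams returns (precision, scale) for a decimal field spec.
// Precision is required; a missing scale defaults to 0 per the Avro spec.
func decimalParams(spec []byte) (int, int, error) {
	var d avroDecimal
	if err := json.Unmarshal(spec, &d); err != nil {
		return 0, 0, err
	}
	if d.LogicalType != "decimal" {
		return 0, 0, errors.New("not a decimal logical type")
	}
	if d.Precision == nil {
		return 0, 0, errors.New("decimal logical type is missing precision")
	}
	scale := 0 // default when the attribute is absent
	if d.Scale != nil {
		scale = *d.Scale
	}
	return *d.Precision, scale, nil
}

func main() {
	p, s, err := decimalParams([]byte(`{"type":"bytes","logicalType":"decimal","precision":10}`))
	fmt.Println(p, s, err)
}
```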
Adds tests for both the scale-default Avro spec and the
precision-loss rejection.
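The exact "fits at the declared scale" check from the second review fix can be approximated with `math/big`. The sketch below is not the code in internal/sqlutil/decimal.go. `canonicaliseAtScale` is a hypothetical name, and it leaves out the precision (total digit count) check that a real canonicaliser would also need.

```go
package main

import (
	"fmt"
	"math/big"
)

// canonicaliseAtScale is a hypothetical helper. It parses s exactly as a
// rational and renders it with exactly `scale` fractional digits, returning
// an error if that would discard non-zero digits.
func canonicaliseAtScale(s string, scale int) (string, error) {
	if scale < 0 {
		return "", fmt.Errorf("negative scale %d is not handled in this sketch", scale)
	}
	r, ok := new(big.Rat).SetString(s)
	if !ok {
		return "", fmt.Errorf("invalid decimal %q", s)
	}
	// The value fits at the scale exactly when r * 10^scale is an integer.
	pow := new(big.Int).Exp(big.NewInt(10), big.NewInt(int64(scale)), nil)
	scaled := new(big.Rat).Mul(r, new(big.Rat).SetInt(pow))
	if !scaled.IsInt() {
		return "", fmt.Errorf("value %q loses precision at scale %d", s, scale)
	}
	// No rounding happens here because the value is exact at this scale.
	return r.FloatString(scale), nil
}

func main() {
	for _, in := range []string{"1.5", "+1e2", "1.56789"} {
		out, err := canonicaliseAtScale(in, 2)
		fmt.Printf("%q -> %q, err=%v\n", in, out, err)
	}
}
```

Note that `big.Rat.SetString` also accepts fraction syntax such as "1/2", so a production version would likely validate the input shape before parsing.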
* address unsigned decimal in mysql
* address special NUMERIC values in PostgreSQL
* oracledb: update streaming integration fixtures for BigDecimal value shape
TestIntegrationOracleDBCDCStreaming uses tables with bare NUMBER columns
(id NUMBER GENERATED ALWAYS AS IDENTITY and val NUMBER), both of which
now fall through to BigDecimal under the new schema mapping. Their
values are emitted as canonical decimal strings rather than json.Number
integers, so the per-subtest content assertions move from
{"ID":1,"VAL":1} / {"ID":1,"VAL":2} to {"ID":"1","VAL":"1"} /
{"ID":"1","VAL":"2"}.
* update failing test
---------
Co-authored-by: Joseph Woodward <joseph.woodward@xeuse.com>
1 parent 6872dd7 commit 5d37ef3
41 files changed
Lines changed: 2031 additions & 252 deletions
File tree
- internal
- impl
- confluent
- iceberg
- icebergx
- shredder
- mongodb/cdc
- mssqlserver
- replication
- mysql
- oracledb
- replication
- parquet
- parquetdecimal
- postgresql
- pglogicalstream
- tigerbeetle
- sqlutil