Commit 92b4a36

Reject invalid with_array_type registrations

with_array_type is now partial. It errors if the registration would cause recursive decode (element OID is itself an array OID), collide with an existing scalar or built-in array OID, use an OID already registered as a custom element OID, or self-reference. Previously these misconfigurations were silently accepted and would cause incorrect behavior or stack overflow at decode time.

Closes #168
1 parent 6a6dcd3 commit 92b4a36
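
For callers, the practical effect is that every `with_array_type` call now needs a `?` and an enclosing `try` (or a partial caller). A minimal sketch of call-site handling follows, assuming the package is importable as `postgres`. The `Main` actor and the printed messages are illustrative only, and the OIDs mirror the ones used in this commit's tests.

```pony
use "postgres"

actor Main
  new create(env: Env) =>
    // Valid registration: 1017 becomes the array OID for element OID 600.
    try
      let registry = CodecRegistry.with_array_type(1017, 600)?
      env.out.print(
        "array OID for 600: " + registry.array_oid_for(600).string())
    else
      env.out.print("valid registration was unexpectedly rejected")
    end

    // Invalid registration: array_oid == element_oid is rejected, so
    // control should reach the else branch.
    try
      CodecRegistry.with_array_type(600, 600)?
      env.out.print("self-referential mapping was unexpectedly accepted")
    else
      env.out.print("self-referential mapping rejected")
    end
```

Because the error carries no payload, the call site cannot tell which of the four rules was violated. Registering array types one at a time, as above, keeps the failing OID pair easy to identify.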

7 files changed, +109 -9 lines changed

.release-notes/next-release.md

Lines changed: 1 addition & 1 deletion
@@ -794,7 +794,7 @@ session.execute(PreparedQuery("SELECT $1::int4[]",
 ```pony
 let registry = CodecRegistry
   .with_codec(600, PointBinaryCodec)
-  .with_array_type(1017, 600)
+  .with_array_type(1017, 600)?
 ```

 Multi-dimensional arrays are not supported and will fall back to `String` (text format) or `RawBytes` (binary format).

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+## Reject invalid with_array_type registrations
+
+`CodecRegistry.with_array_type()` is now partial. It errors if the registration would cause problems at decode time:
+
+- `element_oid` is itself an array OID (built-in or custom), which would cause unbounded recursion during decode
+- `array_oid` collides with a registered scalar or built-in array OID
+- `array_oid` is already registered as a custom element OID
+- `array_oid == element_oid`
+
+Previously, these misconfigurations were silently accepted and would cause incorrect behavior or stack overflow at decode time. Now the error surfaces at registry construction, where the OID values are visible in the source.
+
+Before:
+
+```pony
+let registry = CodecRegistry
+  .with_codec(600, PointBinaryCodec)
+  .with_array_type(1017, 600)
+```
+
+After:
+
+```pony
+let registry = CodecRegistry
+  .with_codec(600, PointBinaryCodec)
+  .with_array_type(1017, 600)?
+```

CLAUDE.md

Lines changed: 2 additions & 2 deletions
@@ -131,7 +131,7 @@ Only one operation is in-flight at a time. The queue serializes execution. `quer
 - `StreamingResultReceiver` interface (tag) — `pg_stream_batch(Session, Rows)`, `pg_stream_complete(Session)`, `pg_stream_failed(Session, (PreparedQuery | NamedPreparedQuery), (ErrorResponseMessage | ClientQueryError))`. Pull-based: session delivers batches via `pg_stream_batch`; client calls `fetch_more()` for the next batch or `close_stream()` to end early
 - `PipelineReceiver` interface (tag) — `pg_pipeline_result(Session, USize, Result)`, `pg_pipeline_failed(Session, USize, (PreparedQuery | NamedPreparedQuery), (ErrorResponseMessage | ClientQueryError))`, `pg_pipeline_complete(Session)`. Each query result/failure is delivered with its pipeline index. `pg_pipeline_complete` always fires last
 - `Codec` interface (val) — `format(): U16`, `encode(FieldDataTypes): Array[U8] val ?`, `decode(Array[U8] val): FieldData ?`. Wire format codec for a PostgreSQL type. Encode stays closed (`FieldDataTypes`), decode is open (`FieldData`). Built-in codecs are primitives (zero-allocation singletons)
-- `CodecRegistry` class (val) — maps OIDs to text and binary `Codec` instances. Immutable — `with_codec(oid, codec)` returns a new registry with the codec added or replacing an existing one. `with_array_type(array_oid, element_oid)` registers a custom array type mapping. `array_oid_for(element_oid)` returns the array OID (built-in + custom, 0 if unknown). Supports chaining: `CodecRegistry.with_codec(600, A).with_array_type(1017, 600)`. Default constructor populates all built-in codecs. `decode(oid, format, data)` is partial — returns `FieldData` for known OIDs, fallbacks for unknown OIDs (unknown text→`String`, unknown binary→`RawBytes`), and errors when a registered codec's `decode()` fails. Intercepts array OIDs before normal codec dispatch. `has_binary_codec(oid)` for format selection (includes array OIDs)
+- `CodecRegistry` class (val) — maps OIDs to text and binary `Codec` instances. Immutable — `with_codec(oid, codec)` returns a new registry with the codec added or replacing an existing one. `with_array_type(array_oid, element_oid)?` registers a custom array type mapping (partial — errors if element_oid is an array OID, array_oid collides with a registered OID, array_oid is already a custom element OID, or array_oid == element_oid). `array_oid_for(element_oid)` returns the array OID (built-in + custom, 0 if unknown). Supports chaining: `CodecRegistry.with_codec(600, A).with_array_type(1017, 600)?`. Default constructor populates all built-in codecs. `decode(oid, format, data)` is partial — returns `FieldData` for known OIDs, fallbacks for unknown OIDs (unknown text→`String`, unknown binary→`RawBytes`), and errors when a registered codec's `decode()` fails. Intercepts array OIDs before normal codec dispatch. `has_binary_codec(oid)` for format selection (includes array OIDs)
 - `ClientQueryError` — union type `(SessionNeverOpened | SessionClosed | SessionNotAuthenticated | DataError)`
 - `DatabaseConnectInfo` — val class grouping database authentication parameters (user, password, database). Passed to `Session.create()` alongside `ServerConnectInfo`.
 - `ServerConnectInfo` — val class grouping connection parameters (auth, host, service, ssl_mode). Passed to `Session.create()` as the first parameter. Also used by `_CancelSender`.
@@ -170,7 +170,7 @@ Codec-based decoding via `CodecRegistry.decode(oid, format_code, data)`. Extende
 - Text codecs (`_text_codecs.pony`): `_BoolTextCodec`, `_ByteaTextCodec`, `_Int2TextCodec`, `_Int4TextCodec`, `_Int8TextCodec`, `_Float4TextCodec`, `_Float8TextCodec`, `_DateTextCodec`, `_TimeTextCodec`, `_TimestampTextCodec`, `_TimestamptzTextCodec`, `_IntervalTextCodec` (supports all four `intervalstyle` formats: `postgres`, `postgres_verbose`, `iso_8601`, `sql_standard` — detected via heuristic in `decode()`), `_TextPassthroughTextCodec`, `_OidTextCodec`, `_NumericTextCodec`, `_UuidTextCodec`, `_JsonbTextCodec`
 - `_ArrayOidMap` (`_array_oid_map.pony`): static bidirectional mapping between element OIDs and array OIDs (23 entries). Methods: `element_oid_for(array_oid)`, `array_oid_for(element_oid)`, `is_array_oid(oid)`
 - `_ArrayEncoder` (`_array_encoder.pony`): encodes `PgArray` to binary array wire format. Dispatches element encoding on Pony runtime types. String elements routed by `element_oid`: uuid → `_UuidBinaryCodec`, jsonb → `_JsonbBinaryCodec`, oid → `_OidBinaryCodec`, numeric → `_NumericBinaryCodec`; all others → raw UTF-8 bytes. Coupling: element encoding must stay in sync with `_FrontendMessage.bind()` and `_binary_codecs.pony`
-- `CodecRegistry` (`codec_registry.pony`): maps OIDs to codecs. Adds `_custom_array_element_oids: Map[U32, U32] val` field for custom array type registrations. Default constructor populates all built-ins. `with_codec(oid, codec)` returns a new registry with the codec added/replaced. `with_array_type(array_oid, element_oid)` returns a new registry with the custom array mapping added. `array_oid_for(element_oid): U32` returns the array OID for the given element OID. `decode()` intercepts array OIDs before normal codec dispatch via two-phase array decoding: `_parse_binary_array`/`_parse_text_array` extract structure and raw element bytes (errors fall back), then `_decode_array_elements` decodes each element via the element codec (errors propagate). `has_binary_codec()` checks array OIDs too. `_with_codec` constructor (type-private) used internally by `with_codec`
+- `CodecRegistry` (`codec_registry.pony`): maps OIDs to codecs. Adds `_custom_array_element_oids: Map[U32, U32] val` field for custom array type registrations. Default constructor populates all built-ins. `with_codec(oid, codec)` returns a new registry with the codec added/replaced. `with_array_type(array_oid, element_oid)?` returns a new registry with the custom array mapping added (errors if element_oid is an array OID, array_oid collides with a registered OID, array_oid is already a custom element OID, or array_oid == element_oid). `array_oid_for(element_oid): U32` returns the array OID for the given element OID. `decode()` intercepts array OIDs before normal codec dispatch via two-phase array decoding: `_parse_binary_array`/`_parse_text_array` extract structure and raw element bytes (errors fall back), then `_decode_array_elements` decodes each element via the element codec (errors propagate). `has_binary_codec()` checks array OIDs too. `_with_codec` constructor (type-private) used internally by `with_codec`
 - `_ParamEncoder` (`_param_encoder.pony`): derives PostgreSQL OIDs from `FieldDataTypes` parameter values for Parse messages. Takes `registry: CodecRegistry` parameter; `PgArray` arm uses `registry.array_oid_for(a.element_oid)`

 **Encode error handling:** `_FrontendMessage.bind()` is partial — it errors if parameter encoding fails. Takes `registry: CodecRegistry` parameter (with default `CodecRegistry`); pre-encodes `PgArray` parameters via `_ArrayEncoder` before the `recover val` block. `_QueryReady.try_run_query()` uses a build-before-transition pattern: wire messages are constructed before transitioning to an in-flight state, so encode errors deliver `DataError` to the receiver without leaving the state machine inconsistent. Pipeline queries build message parts into an `iso` array in `ref` scope (where error handling has full access to the session and receiver), then consume the array into a `recover val` block for concatenation.

postgres/_test.pony

Lines changed: 1 addition & 0 deletions
@@ -446,6 +446,7 @@ actor \nodoc\ Main is TestList
     test(_TestNumericBinaryCodecEncodeRoundtrip)
     test(_TestCodecRegistryHasBinaryCodecArray)
     test(_TestCodecRegistryWithArrayType)
+    test(_TestCodecRegistryWithArrayTypeRejectsInvalid)
     test(_TestCodecRegistryArrayOidFor)
     test(_TestIntegrationArraySelectBinary)
     test(_TestIntegrationArraySelectText)

postgres/_test_array.pony

Lines changed: 47 additions & 2 deletions
@@ -946,11 +946,56 @@ class \nodoc\ iso _TestCodecRegistryWithArrayType is UnitTest
   fun name(): String =>
     "CodecRegistry/WithArrayType"

-  fun apply(h: TestHelper) =>
-    let registry = CodecRegistry.with_array_type(1017, 600)
+  fun apply(h: TestHelper) ? =>
+    let registry = CodecRegistry.with_array_type(1017, 600)?
     h.assert_true(registry.has_binary_codec(1017))
     h.assert_eq[U32](1017, registry.array_oid_for(600))

+class \nodoc\ iso _TestCodecRegistryWithArrayTypeRejectsInvalid
+  is UnitTest
+  fun name(): String =>
+    "CodecRegistry/WithArrayType/RejectsInvalid"
+
+  fun apply(h: TestHelper) =>
+    // Built-in array OID as element_oid (would cause recursion)
+    h.assert_error({()? => CodecRegistry.with_array_type(9999, 1007)? })
+
+    // Custom array OID as element_oid (would cause recursion)
+    h.assert_error({()? =>
+      CodecRegistry.with_array_type(9998, 600)?
+        .with_array_type(9999, 9998)?
+    })
+
+    // Self-referential mapping
+    h.assert_error({()? => CodecRegistry.with_array_type(600, 600)? })
+
+    // array_oid collides with built-in scalar codec (int4 = OID 23)
+    h.assert_error({()? => CodecRegistry.with_array_type(23, 600)? })
+
+    // array_oid collides with built-in array OID (int4[] = OID 1007)
+    h.assert_error({()? => CodecRegistry.with_array_type(1007, 600)? })
+
+    // array_oid collides with custom binary-only codec
+    h.assert_error({()? =>
+      CodecRegistry
+        .with_codec(9000, _TestPointCodec)
+        .with_array_type(9000, 600)?
+    })
+
+    // array_oid collides with custom text-only codec
+    h.assert_error({()? =>
+      CodecRegistry
+        .with_codec(9001, _TestUppercaseTextCodec)
+        .with_array_type(9001, 600)?
+    })
+
+    // array_oid is already a custom element OID (would cause decode
+    // to misinterpret element data as array structure)
+    h.assert_error({()? =>
+      CodecRegistry.with_array_type(9998, 600)?
+        .with_array_type(600, 500)?
+    })
+
 class \nodoc\ iso _TestCodecRegistryArrayOidFor is UnitTest
   fun name(): String =>
     "CodecRegistry/ArrayOidFor"

postgres/codec_registry.pony

Lines changed: 31 additions & 3 deletions
@@ -12,7 +12,7 @@ class val CodecRegistry
   ```pony
   let registry = CodecRegistry
     .with_codec(600, PointBinaryCodec)
-    .with_array_type(1017, 600)
+    .with_array_type(1017, 600)?
   let session = Session(server_info, db_info, notify where registry = registry)
   ```
   """
@@ -124,13 +124,41 @@ class val CodecRegistry
       end
     end

-  fun val with_array_type(array_oid: U32, element_oid: U32): CodecRegistry =>
+  fun val with_array_type(array_oid: U32, element_oid: U32)
+    : CodecRegistry ?
+  =>
     """
     Returns a new registry with a custom array type mapping. This enables
     decode of arrays whose element type is a custom codec-registered OID.
     Supports chaining with `with_codec`:
-    `CodecRegistry.with_codec(600, PointCodec).with_array_type(1017, 600)`.
+    `CodecRegistry.with_codec(600, PointCodec).with_array_type(1017, 600)?`.
+
+    Errors if:
+    - `element_oid` is itself an array OID (built-in or custom), which would
+      cause unbounded recursion during decode
+    - `array_oid` collides with a registered scalar or built-in array OID
+    - `array_oid` is already registered as a custom element OID
+    - `array_oid == element_oid`
     """
+    // Reject self-referential mapping
+    if array_oid == element_oid then error end
+
+    // Reject element OIDs that are themselves array OIDs (recursion)
+    if _ArrayOidMap.is_array_oid(element_oid) then error end
+    if _custom_array_element_oids.contains(element_oid) then error end
+
+    // Reject array OIDs that collide with registered scalar codecs
+    if _text_codecs.contains(array_oid) then error end
+    if _binary_codecs.contains(array_oid) then error end
+
+    // Reject array OIDs that collide with built-in array OIDs
+    if _ArrayOidMap.is_array_oid(array_oid) then error end
+
+    // Reject array OIDs that are already registered as custom element OIDs
+    for elem_oid in _custom_array_element_oids.values() do
+      if elem_oid == array_oid then error end
+    end
+
     CodecRegistry._with_array_type(this, array_oid, element_oid)

   new val _with_array_type(base: CodecRegistry, array_oid: U32,
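
The checks added above can be exercised without a database or the library itself. The following is a simplified stand-in, not the library's code: `ArrayTypeChecks`, its parameters, and the sample OID sets are hypothetical names introduced here. The custom map is assumed to be keyed by array OID with the element OID as the value, which is how the diff's `contains(element_oid)` and `values()` checks read.

```pony
use "collections"

primitive ArrayTypeChecks
  """
  Hypothetical model of the validation order in with_array_type.
  """
  fun validate(
    array_oid: U32,
    element_oid: U32,
    codec_oids: Set[U32] box,
    builtin_array_oids: Set[U32] box,
    custom_arrays: Map[U32, U32] box) ?
  =>
    // Self-referential mapping
    if array_oid == element_oid then error end

    // Element is itself an array OID (built-in or custom)
    if builtin_array_oids.contains(element_oid) then error end
    if custom_arrays.contains(element_oid) then error end

    // Array OID collides with a registered codec or a built-in array OID
    if codec_oids.contains(array_oid) then error end
    if builtin_array_oids.contains(array_oid) then error end

    // Array OID is already used as a custom element OID
    for elem in custom_arrays.values() do
      if elem == array_oid then error end
    end

  fun accepts(
    array_oid: U32,
    element_oid: U32,
    codec_oids: Set[U32] box,
    builtin_array_oids: Set[U32] box,
    custom_arrays: Map[U32, U32] box): Bool
  =>
    // Convert the partial check into a Bool for easy printing.
    try
      validate(array_oid, element_oid, codec_oids, builtin_array_oids,
        custom_arrays)?
      true
    else
      false
    end

actor Main
  new create(env: Env) =>
    // Sample state: int4 (23) has a codec, int4[] (1007) is a built-in
    // array OID, and 9998 is already registered as an array of 600.
    let codecs = Set[U32]
    codecs.set(23)
    let builtin_arrays = Set[U32]
    builtin_arrays.set(1007)
    let custom = Map[U32, U32]
    custom(9998) = 600

    env.out.print("1017 -> 600 accepted: " +
      ArrayTypeChecks.accepts(1017, 600, codecs, builtin_arrays, custom)
        .string())
    env.out.print("9999 -> 9998 accepted: " +
      ArrayTypeChecks.accepts(9999, 9998, codecs, builtin_arrays, custom)
        .string())
    env.out.print("600 -> 500 accepted: " +
      ArrayTypeChecks.accepts(600, 500, codecs, builtin_arrays, custom)
        .string())
```

The last check is a linear scan over the map's values because the map is keyed by array OID. That seems a reasonable trade-off for a registry that is built once and typically holds only a handful of custom array types.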

postgres/postgres.pony

Lines changed: 1 addition & 1 deletion
@@ -344,7 +344,7 @@ For custom array types (arrays of custom codec-registered OIDs), use
 ```pony
 let registry = CodecRegistry
   .with_codec(600, PointBinaryCodec)
-  .with_array_type(1017, 600)
+  .with_array_type(1017, 600)?
 ```

 ## Custom Codecs
