Merged
4 changes: 2 additions & 2 deletions CLAUDE.md
@@ -158,7 +158,7 @@ Codec-based decoding via `CodecRegistry.decode(oid, format_code, data)`. Extende

**Text format codecs** (SimpleQuery path): Same type mappings as binary codecs. Unknown OIDs → `String`.

**Fallbacks and errors**: `CodecRegistry.decode()` is partial. For unknown OIDs (no registered codec), it returns fallbacks: `String` for text format, `RawBytes` for binary format. For known OIDs where the registered codec's `decode()` errors, the error propagates to the caller.
**Fallbacks and errors**: `CodecRegistry.decode()` is partial. For unknown OIDs (no registered codec), it returns fallbacks: `String` for text format, `RawBytes` for binary format. For known OIDs where the registered codec's `decode()` errors, the error propagates to the caller. For arrays, structural parsing errors (malformed wire format) fall back, but element codec errors propagate — this distinguishes unrecognized array formats from corrupt element data.

**Array OIDs** (element OID → array OID, 23 built-in mappings): 16→1000 (bool[]), 17→1001 (bytea[]), 18→1002 (char[]), 19→1003 (name[]), 20→1016 (int8[]), 21→1005 (int2[]), 23→1007 (int4[]), 25→1009 (text[]), 26→1028 (oid[]), 114→199 (json[]), 142→143 (xml[]), 700→1021 (float4[]), 701→1022 (float8[]), 1042→1014 (bpchar[]), 1043→1015 (varchar[]), 1082→1182 (date[]), 1083→1183 (time[]), 1114→1115 (timestamp[]), 1184→1185 (timestamptz[]), 1186→1187 (interval[]), 1700→1231 (numeric[]), 2950→2951 (uuid[]), 3802→3807 (jsonb[]). Arrays decoded via `CodecRegistry` interception (before per-OID codec dispatch). Binary format produces `PgArray`; text format produces `PgArray`.

@@ -170,7 +170,7 @@ Codec-based decoding via `CodecRegistry.decode(oid, format_code, data)`. Extende
- Text codecs (`_text_codecs.pony`): `_BoolTextCodec`, `_ByteaTextCodec`, `_Int2TextCodec`, `_Int4TextCodec`, `_Int8TextCodec`, `_Float4TextCodec`, `_Float8TextCodec`, `_DateTextCodec`, `_TimeTextCodec`, `_TimestampTextCodec`, `_TimestamptzTextCodec`, `_IntervalTextCodec` (supports all four `intervalstyle` formats: `postgres`, `postgres_verbose`, `iso_8601`, `sql_standard` — detected via heuristic in `decode()`), `_TextPassthroughTextCodec`, `_OidTextCodec`, `_NumericTextCodec`, `_UuidTextCodec`, `_JsonbTextCodec`
- `_ArrayOidMap` (`_array_oid_map.pony`): static bidirectional mapping between element OIDs and array OIDs (23 entries). Methods: `element_oid_for(array_oid)`, `array_oid_for(element_oid)`, `is_array_oid(oid)`
- `_ArrayEncoder` (`_array_encoder.pony`): encodes `PgArray` to binary array wire format. Dispatches element encoding on Pony runtime types. String elements routed by `element_oid`: uuid → `_UuidBinaryCodec`, jsonb → `_JsonbBinaryCodec`, oid → `_OidBinaryCodec`, numeric → `_NumericBinaryCodec`; all others → raw UTF-8 bytes. Coupling: element encoding must stay in sync with `_FrontendMessage.bind()` and `_binary_codecs.pony`
- `CodecRegistry` (`codec_registry.pony`): maps OIDs to codecs. Adds `_custom_array_element_oids: Map[U32, U32] val` field for custom array type registrations. Default constructor populates all built-ins. `with_codec(oid, codec)` returns a new registry with the codec added/replaced. `with_array_type(array_oid, element_oid)` returns a new registry with the custom array mapping added. `array_oid_for(element_oid): U32` returns the array OID for the given element OID. `decode()` intercepts array OIDs before normal codec dispatch. `has_binary_codec()` checks array OIDs too. `_with_codec` constructor (type-private) used internally by `with_codec`
- `CodecRegistry` (`codec_registry.pony`): maps OIDs to codecs. Adds `_custom_array_element_oids: Map[U32, U32] val` field for custom array type registrations. Default constructor populates all built-ins. `with_codec(oid, codec)` returns a new registry with the codec added/replaced. `with_array_type(array_oid, element_oid)` returns a new registry with the custom array mapping added. `array_oid_for(element_oid): U32` returns the array OID for the given element OID. `decode()` intercepts array OIDs before normal codec dispatch via two-phase array decoding: `_parse_binary_array`/`_parse_text_array` extract structure and raw element bytes (errors fall back), then `_decode_array_elements` decodes each element via the element codec (errors propagate). `has_binary_codec()` checks array OIDs too. `_with_codec` constructor (type-private) used internally by `with_codec`
- `_ParamEncoder` (`_param_encoder.pony`): derives PostgreSQL OIDs from `FieldDataTypes` parameter values for Parse messages. Takes `registry: CodecRegistry` parameter; `PgArray` arm uses `registry.array_oid_for(a.element_oid)`

**Encode error handling:** `_FrontendMessage.bind()` is partial — it errors if parameter encoding fails. Takes `registry: CodecRegistry` parameter (with default `CodecRegistry`); pre-encodes `PgArray` parameters via `_ArrayEncoder` before the `recover val` block. `_QueryReady.try_run_query()` uses a build-before-transition pattern: wire messages are constructed before transitioning to an in-flight state, so encode errors deliver `DataError` to the receiver without leaving the state machine inconsistent. Pipeline queries build message parts into an `iso` array in `ref` scope (where error handling has full access to the session and receiver), then consume the array into a `recover val` block for concatenation.
2 changes: 2 additions & 0 deletions postgres/_test.pony
@@ -418,6 +418,8 @@ actor \nodoc\ Main is TestList
test(_TestBinaryDecodeWithNulls)
test(_TestBinaryDecodeEmptyArray)
test(_TestBinaryDecodeValidationErrors)
test(_TestBinaryArrayElementCodecErrorPropagates)
test(_TestTextArrayElementCodecErrorPropagates)
test(_TestTextDecodeSimpleArray)
test(_TestTextDecodeNullArray)
test(_TestTextDecodeQuotedArray)
34 changes: 34 additions & 0 deletions postgres/_test_array.pony
@@ -371,10 +371,44 @@ class \nodoc\ iso _TestBinaryDecodeValidationErrors is UnitTest
h.fail("Expected fallback for truncated data")
end

class \nodoc\ iso _TestBinaryArrayElementCodecErrorPropagates is UnitTest
"""
A structurally valid binary int4 array where one element has only 2 bytes
(int4 expects 4). The structural parse succeeds but the element codec
errors — that error must propagate rather than falling back to RawBytes.
"""
fun name(): String =>
"Codec/Binary/Array/ElementCodecErrorPropagates"

fun apply(h: TestHelper) =>
let registry = CodecRegistry
// Build a binary int4[] with a 2-byte element (int4 requires 4 bytes)
let elems: Array[(Array[U8] val | None)] val = recover val
[as (Array[U8] val | None):
_TestArrayBinaryBuilder.int4_bytes(1)
recover val [as U8: 0xAB; 0xCD] end] // only 2 bytes — int4 codec error
end
let data = _TestArrayBinaryBuilder(23, elems)
h.assert_error({()? => registry.decode(1007, 1, data)? })

// ============================================================
// Text decode tests
// ============================================================

class \nodoc\ iso _TestTextArrayElementCodecErrorPropagates is UnitTest
"""
A text int4 array containing "abc" — structurally valid but the int4 text
codec cannot parse "abc". The error must propagate rather than falling back
to String.
"""
fun name(): String =>
"Codec/Text/Array/ElementCodecErrorPropagates"

fun apply(h: TestHelper) =>
let registry = CodecRegistry
let data: Array[U8] val = recover val "{abc}".array() end
h.assert_error({()? => registry.decode(1007, 0, data)? })

class \nodoc\ iso _TestTextDecodeSimpleArray is UnitTest
fun name(): String =>
"Codec/Text/Array/Simple"
109 changes: 64 additions & 45 deletions postgres/codec_registry.pony
@@ -179,40 +179,34 @@ class val CodecRegistry
to the caller. This surfaces malformed data from the server (built-in
codecs) and broken custom codecs instead of silently returning fallback
values.

For arrays, structural parsing errors (malformed wire format) fall back,
but element codec errors propagate. This distinguishes between an
unrecognized array format (which might be a new PostgreSQL feature) and
corrupt element data (which should be surfaced).
"""
// Check built-in array OIDs
if _ArrayOidMap.is_array_oid(oid) then
try
if format == 0 then
return _decode_text_array(oid, data)?
else
return _decode_binary_array(data)?
end
let parsed = try
if format == 0 then _parse_text_array(oid, data)?
else _parse_binary_array(data)? end
else
// Malformed array data falls through to fallback
if format == 0 then
return String.from_array(data)
else
return RawBytes(data)
end
if format == 0 then return String.from_array(data)
else return RawBytes(data) end
end
return _decode_array_elements(parsed._1, format, parsed._2)?
end

// Check custom array OIDs
if _custom_array_element_oids.contains(oid) then
try
if format == 0 then
return _decode_text_array(oid, data)?
else
return _decode_binary_array(data)?
end
let parsed = try
if format == 0 then _parse_text_array(oid, data)?
else _parse_binary_array(data)? end
else
if format == 0 then
return String.from_array(data)
else
return RawBytes(data)
end
if format == 0 then return String.from_array(data)
else return RawBytes(data) end
end
return _decode_array_elements(parsed._1, format, parsed._2)?
end

if format == 0 then
@@ -240,9 +234,13 @@ class val CodecRegistry
or _ArrayOidMap.is_array_oid(oid)
or _custom_array_element_oids.contains(oid)

fun _decode_binary_array(data: Array[U8] val): PgArray ? =>
fun _parse_binary_array(data: Array[U8] val)
: (U32, Array[(Array[U8] val | None)] val) ?
=>
"""
Decode binary array wire format into PgArray.
Parse binary array wire format, extracting the element OID and raw element
byte slices without decoding them. Structural validation errors (truncated
data, multi-dimensional, bad offsets) raise `error`.
"""
if data.size() < 12 then error end
let ndim = ifdef bigendian then
@@ -267,8 +265,8 @@ class val CodecRegistry
end

if ndim == 0 then
return PgArray(element_oid,
recover val Array[(FieldData | None)] end)
return (element_oid,
recover val Array[(Array[U8] val | None)] end)
end

if data.size() < 20 then error end
@@ -285,8 +283,8 @@ class val CodecRegistry
dim_size.usize().mul_partial(4)?
if (20 + min_element_bytes) > data.size() then error end

let elements = recover iso
let elems = Array[(FieldData | None)](dim_size.usize())
let raw_elements = recover val
let elems = Array[(Array[U8] val | None)](dim_size.usize())
var offset: USize = 20
var i: I32 = 0
while i < dim_size do
@@ -304,22 +302,24 @@ class val CodecRegistry
else
let len = elem_len.usize()
if (offset + len) > data.size() then error end
let elem_data: Array[U8] val = recover val data.trim(offset, offset + len) end
elems.push(decode(element_oid, 1, elem_data)?)
elems.push(recover val data.trim(offset, offset + len) end)
offset = offset + len
end
i = i + 1
end
if offset != data.size() then error end
elems
end
PgArray(element_oid, consume elements)
(element_oid, raw_elements)

fun _decode_text_array(array_oid: U32, data: Array[U8] val): PgArray ? =>
fun _parse_text_array(array_oid: U32, data: Array[U8] val)
: (U32, Array[(Array[U8] val | None)] val) ?
=>
"""
Decode text array format into PgArray. Handles simple elements, quoted
elements with backslash escaping, NULL, and empty arrays. Rejects
multi-dimensional arrays.
Parse text array format, extracting the element OID and raw element byte
arrays without decoding them. Handles simple elements, quoted elements with
backslash escaping, NULL, and empty arrays. Rejects multi-dimensional
arrays.
"""
let s: String val = String.from_array(data)
if s.size() < 2 then error end
@@ -338,15 +338,15 @@ class val CodecRegistry

// Empty array
if s.size() == 2 then
return PgArray(element_oid,
recover val Array[(FieldData | None)] end)
return (element_oid,
recover val Array[(Array[U8] val | None)] end)
end

// Check for multi-dimensional array
if s(1)? == '{' then error end

let elements: Array[(FieldData | None)] val = recover val
let elems = Array[(FieldData | None)]
let raw_elements: Array[(Array[U8] val | None)] val = recover val
let elems = Array[(Array[U8] val | None)]
var pos: USize = 1 // skip opening '{'
let end_pos = s.size() - 1 // before closing '}'

@@ -370,8 +370,7 @@ class val CodecRegistry
end
if pos >= end_pos then error end
pos = pos + 1 // skip closing '"'
let raw: Array[U8] val = consume buf
elems.push(decode(element_oid, 0, raw)?)
elems.push(consume buf)
else
// Unquoted element — read until ',' or end
let start = pos
@@ -388,8 +387,7 @@ class val CodecRegistry
then
elems.push(None)
else
let raw: Array[U8] val = token.array()
elems.push(decode(element_oid, 0, raw)?)
elems.push(token.array())
end
end

@@ -400,4 +398,25 @@ class val CodecRegistry
end
elems
end
PgArray(element_oid, elements)
(element_oid, raw_elements)

fun _decode_array_elements(element_oid: U32, format: U16,
raw_elements: Array[(Array[U8] val | None)] val): PgArray ?
=>
"""
Decode raw element byte arrays into typed `FieldData` values using the
registered codec for the element OID. Errors from element codec `decode()`
propagate to the caller.
"""
let elements = recover iso
let elems = Array[(FieldData | None)](raw_elements.size())
for raw in raw_elements.values() do
match raw
| None => elems.push(None)
| let bytes: Array[U8] val =>
elems.push(decode(element_oid, format, bytes)?)
end
end
elems
end
PgArray(element_oid, consume elements)
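
The two-phase split introduced in this diff (structural parse errors fall back, element codec errors propagate) can be illustrated with a much smaller standalone program. This is only a sketch under stated assumptions: `TwoPhase`, `parse`, `decode_elements`, and `report` are hypothetical names invented for illustration, the element type is fixed to `I32`, and the parser handles only unquoted, comma-separated text elements. It is not the library's implementation.

```pony
// Hypothetical, simplified sketch of two-phase array decoding.
// None of these names exist in the library; the element type is fixed to I32.
primitive TwoPhase
  fun parse(s: String): Array[String] val ? =>
    """
    Phase 1: structural parse. Splits "{a,b}" into raw tokens without
    decoding them. Errors if the surrounding braces are missing.
    """
    if s.size() < 2 then error end
    if (s(0)? != '{') or (s(s.size() - 1)? != '}') then error end
    let inner: String val = s.substring(1, (s.size() - 1).isize())
    if inner.size() == 0 then
      return recover val Array[String] end
    end
    inner.split(",")

  fun decode_elements(raw: Array[String] val): Array[I32] val ? =>
    """
    Phase 2: decode each raw token. A token that is not a valid integer
    raises an error, which is deliberately not caught here.
    """
    recover val
      let out = Array[I32](raw.size())
      for tok in raw.values() do
        out.push(tok.i32()?)
      end
      out
    end

  fun decode(s: String): (Array[I32] val | String) ? =>
    let raw =
      try
        parse(s)?
      else
        // Structural error: fall back to the raw text.
        return s
      end
    // Element errors propagate to the caller.
    decode_elements(raw)?

actor Main
  new create(env: Env) =>
    report(env, "{1,2,3}")      // well-formed array with valid elements
    report(env, "not an array") // structural error, expected to fall back
    report(env, "{1,abc}")      // valid structure, invalid element

  fun report(env: Env, input: String) =>
    try
      match TwoPhase.decode(input)?
      | let values: Array[I32] val =>
        env.out.print(input + " -> decoded " + values.size().string()
          + " element(s)")
      | let text: String =>
        env.out.print(input + " -> fell back to raw text")
      end
    else
      env.out.print(input + " -> element codec error propagated")
    end
```

Keeping the `try` around phase 1 only is the point of the design: when both phases sat inside one `try`, an element codec failure was indistinguishable from an unrecognized array format and was silently turned into a fallback value.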