Commit 6719a84

Add blog post: Pony's Postgres Driver Grows Up (#1303)
Closes #1302
1 parent fa9ec31 commit 6719a84

1 file changed: +233 -0 lines changed
@@ -0,0 +1,233 @@
---
date: 2026-04-09T20:00:00-04:00
title: "Pony's Postgres Driver Grows Up"
authors:
  - seantallen
categories:
  - Libraries
draft: false
---

Pony has had a Postgres driver since 2023. It didn't do much for most of those three years. That's been changing over the last couple of months.

<!-- more -->

## Where it was

[`ponylang/postgres`](https://github.com/ponylang/postgres) shipped its 0.1.0 in February 2023. It implemented just enough of the Postgres wire protocol to authenticate with MD5 and run a few simple queries. Then it sat. I didn't touch it for a couple years. I eventually landed 0.1.1 last year, and the 0.2.x line that followed was three dependency bumps with no functional changes.

This week, I landed 0.3.0. The rest of this post is what came with it.

## Why now

[Last month I wrote about `ponylang/lori`](pony-networking-take-two.md), the networking foundation Pony's web stack is being built on. [`ponylang/stallion`](stallion-http-server.md) gave us an HTTP server. [`ponylang/hobby`](https://github.com/ponylang/hobby) put a web framework on top of Stallion. [`ponylang/json`](pony-gets-a-new-json-library.md) handles JSON parsing and serialization. Each one of those is a piece of the same project: making Pony a language you can write web services in without having to build half the stack yourself first.

A web stack without a database driver isn't a web stack. It's a one-legged dog trying to compete in a three-legged race. `ponylang/postgres` 0.3.0 is part of fixing that.

Most of 0.3.0 sits on [`ponylang/lori`](https://github.com/ponylang/lori). Let me walk you through the driver.

## Connecting to a real Postgres install

Auth is automatic. The driver handles SCRAM-SHA-256, MD5, and cleartext password, and picks whichever one the server asks for.

Encryption is the part you pick. `SSLDisabled` is the default. `SSLRequired` demands encryption and fails if the server refuses. `SSLPreferred` tries SSL and falls back to plaintext if the server refuses it. Both encrypted modes use lori's `start_tls()` for the mid-stream handshake the Postgres protocol requires.

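To make the three modes concrete, here's a minimal sketch of the decision each one makes once the server answers the protocol's `SSLRequest` with a single byte (`S` for "yes, TLS", `N` for "no"). Every name in it (`ModeRequired`, `SSLNegotiation`, `UseTLS`, and so on) is a hypothetical stand-in made up for illustration. It models the behavior described above; it is not the driver's code or its API.

```pony
// Hypothetical model types. The driver's real modes are SSLDisabled,
// SSLRequired, and SSLPreferred; these local primitives only mirror them.
primitive ModeDisabled
primitive ModeRequired
primitive ModePreferred
type Mode is (ModeDisabled | ModeRequired | ModePreferred)

primitive UseTLS
primitive UsePlaintext
primitive GiveUp
type Outcome is (UseTLS | UsePlaintext | GiveUp)

primitive SSLNegotiation
  fun outcome(mode: Mode, server_reply: U8): Outcome =>
    """
    `server_reply` is the one-byte answer to an SSLRequest:
    'S' means the server will do TLS, 'N' means it won't.
    """
    match mode
    | ModeDisabled =>
      // No SSLRequest is sent in this mode, so the reply is ignored.
      UsePlaintext
    | ModeRequired =>
      if server_reply == 'S' then UseTLS else GiveUp end
    | ModePreferred =>
      if server_reply == 'S' then UseTLS else UsePlaintext end
    end

actor Main
  new create(env: Env) =>
    // A server that refuses TLS, talking to a client that merely prefers it.
    let result = SSLNegotiation.outcome(ModePreferred, 'N')
    env.out.print(
      match result
      | UseTLS => "upgrade to TLS"
      | UsePlaintext => "continue in plaintext"
      | GiveUp => "fail the connection"
      end)
```

The only difference between the two encrypted modes is the `N` branch: required gives up, preferred carries on unencrypted. That's worth keeping in mind before picking "preferred" for anything that crosses a network you don't trust.
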
## Real queries

The simplest way to run a query is `SimpleQuery`: hand it a SQL string, get results back. That's fine for ad-hoc queries and control statements like `BEGIN`. It's not fine for anything that includes user input, because you'd be building the SQL with string concatenation and earning SQL injection as a reward.

`PreparedQuery` is how you bind values separately from the query text. You write the query with placeholders (`$1`, `$2`, and so on), hand the values in as a typed array, and the driver sends them across the wire in binary format. The server never sees a glued-together string:

```pony
let query = PreparedQuery("SELECT * FROM users WHERE id = $1",
  recover val [as FieldDataTypes: I32(42)] end)
session.execute(query, receiver)
```

If you're running the same query a lot with different values, you don't want the server parsing it from scratch every time. You want it parsed and planned once, then executed with different values. That's a named prepared statement, and `NamedPreparedQuery` is how you use it. Hand the server a name and the query text via `session.prepare()`, then fire off executions by name for as long as you want. When you're done, `session.close_statement()` cleans up the server-side state.

```pony
session.prepare("find_user", "SELECT * FROM users WHERE id = $1", receiver)

// In the PrepareReceiver callback:
be pg_statement_prepared(session: Session, name: String) =>
  session.execute(
    NamedPreparedQuery("find_user",
      recover val [as FieldDataTypes: I32(42)] end),
    result_receiver)
```

And if you've got a batch of queries to send that don't depend on each other's results, you can ship them together. `session.pipeline()` writes all of them to the socket in one go. The server works through them in order, and each result comes back indexed by its position in the batch. N round trips collapse into one. Each query has its own error isolation, so a failure in the middle doesn't poison the rest.

62+
## Transactions
63+
64+
Transactions are first-class. `BEGIN`, do work, `COMMIT` or `ROLLBACK`, all through the normal `execute()` interface:
65+
66+
```pony
67+
be pg_session_authenticated(session: Session) =>
68+
session.execute(SimpleQuery("BEGIN"), receiver)
69+
session.execute(SimpleQuery("INSERT INTO t (col) VALUES ('x')"), receiver)
70+
session.execute(SimpleQuery("COMMIT"), receiver)
71+
```
72+
73+
The catch with transactions is that once a query inside one errors, Postgres marks the whole transaction as failed and rejects every subsequent statement until you `ROLLBACK`. You need to know you're in that state so you can bail out cleanly instead of sending a dozen more queries the server is just going to reject. The `pg_transaction_status` callback on `SessionStatusNotify` fires after every query cycle and tells you whether the session is idle, inside a transaction, or inside a failed one.
74+
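Those three states come straight off the wire. Postgres ends every query cycle with a `ReadyForQuery` message carrying a one-byte indicator: `I` for idle, `T` for inside a transaction block, `E` for inside a failed one. The sketch below models that mapping and the "should I bail out?" check. The names (`TxIdle`, `TxInBlock`, `TxFailed`, `ReadyForQueryStatus`) are hypothetical and chosen for illustration; the driver exposes its own `TransactionStatus` union, and its variants may be named differently.

```pony
// Hypothetical local names, not the driver's TransactionStatus variants.
primitive TxIdle
primitive TxInBlock
primitive TxFailed
type TxState is (TxIdle | TxInBlock | TxFailed)

primitive ReadyForQueryStatus
  fun decode(indicator: U8): (TxState | None) =>
    """
    Map the ReadyForQuery status byte to a state. Returns None for a
    byte the protocol doesn't define.
    """
    if indicator == 'I' then
      TxIdle
    elseif indicator == 'T' then
      TxInBlock
    elseif indicator == 'E' then
      TxFailed
    else
      None
    end

  fun must_rollback(state: TxState): Bool =>
    // Only a failed transaction forces a ROLLBACK before further work.
    match state
    | TxFailed => true
    | TxInBlock => false
    | TxIdle => false
    end

actor Main
  new create(env: Env) =>
    match ReadyForQueryStatus.decode('E')
    | let state: TxState =>
      if ReadyForQueryStatus.must_rollback(state) then
        env.out.print("send ROLLBACK before anything else")
      else
        env.out.print("safe to keep going")
      end
    | None =>
      env.out.print("unknown status byte")
    end
```

In a real receiver, the equivalent of `must_rollback` is the check you'd make inside `pg_transaction_status` before queuing the next statement.
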
## Cancellations and timeouts

Sometimes a query is going to run longer than you want, and you'd like to stop it. `session.cancel()` sends a Postgres `CancelRequest` on a separate connection, as the protocol requires, and asks the server to abort whatever's currently running. It's best-effort; the server may or may not honor it. If it does, the cancelled query's `ResultReceiver` gets `pg_query_failed` with SQLSTATE `57014` (`query_canceled`).

Manually cancelling every slow query is a lot of bookkeeping. Every query operation takes an optional `statement_timeout`, and the driver handles the bookkeeping for you. It starts a one-shot timer when the query goes out, and if the query isn't done when the timer expires, the driver fires off the `CancelRequest` itself. Same SQLSTATE, same code path.

And if the server isn't reachable at all, you want to give up before TCP retry behavior hangs you for minutes. `ServerConnectInfo` takes an optional connection timeout. With one set, you get `ConnectionFailedTimeout` after the duration you pick. The difference between "hangs forever" and "fails in five seconds" is pretty much the entire game when something is broken in production.

## Bulk data and streaming

If you've got a million rows to load into Postgres, you don't want to issue a million `INSERT` statements. You want `COPY ... FROM STDIN`, which lets you stream the data straight into a table at full pipe. The driver's design for it is pull-based: it tells you when it's ready for the next chunk, you send exactly one chunk in response, and the cycle repeats until you call `finish_copy()` (or `abort_copy()` if something goes wrong on your side and you want the server to roll back). Memory usage stays bounded because there's only ever one chunk in flight.

```pony
actor BulkLoader is (SessionStatusNotify & ResultReceiver & CopyInReceiver)
  var _rows_sent: USize = 0

  be pg_session_authenticated(session: Session) =>
    session.copy_in(
      "COPY my_table (name, value) FROM STDIN", this)

  be pg_copy_ready(session: Session) =>
    _rows_sent = _rows_sent + 1
    if _rows_sent <= 1_000_000 then
      let row: Array[U8] val = recover val
        ("row" + _rows_sent.string() + "\t"
          + (_rows_sent * 10).string() + "\n").array()
      end
      session.send_copy_data(row)
    else
      session.finish_copy()
    end

  be pg_copy_complete(session: Session, count: USize) =>
    // count = number of rows copied
    None

  be pg_copy_failed(session: Session,
    failure: (ErrorResponseMessage | ClientQueryError))
  =>
    // handle the error
    None
```

`COPY ... TO STDOUT` goes the other direction: the server pushes data to you, your `pg_copy_data` callback fires for each chunk, and `pg_copy_complete` fires when the export is done. Chunks don't necessarily align with row boundaries, so if you want row-level processing you buffer until you hit a newline.

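Buffering until a newline is simple enough to sketch with nothing but the standard library. `RowBuffer` below is a hypothetical helper, not part of the driver: it accumulates raw chunks and hands back only the rows that are complete, holding the unfinished tail for the next chunk. It assumes the default text `COPY` format, where each row ends in `\n`.

```pony
class RowBuffer
  """
  Hypothetical helper: collects COPY chunks and returns complete rows.
  """
  let _pending: String ref = String

  fun ref push(chunk: Array[U8] val): Array[String] =>
    let rows = Array[String]
    _pending.append(chunk)
    try
      while true do
        // find() raises an error once no newline is left, ending the loop.
        let nl = _pending.find("\n")?
        // Everything before the newline is one complete row.
        rows.push(_pending.substring(0, nl))
        // Drop the row and its newline; keep the unfinished tail.
        _pending.delete(0, (nl + 1).usize())
      end
    end
    rows

actor Main
  new create(env: Env) =>
    let buf = RowBuffer
    // The first chunk ends mid-row; the second chunk completes that row.
    for row in buf.push("a\t1\nb\t".array()).values() do
      env.out.print(row)
    end
    for row in buf.push("2\n".array()).values() do
      env.out.print(row)
    end
```

In a receiver, you'd call something like `push` from the `pg_copy_data` callback and process whatever rows come back. The exact type of the chunk that callback delivers is an assumption here; adjust the parameter to match.
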
Bulk export is great when you know how much data is coming. Row streaming is for when you don't. The result set might be a thousand rows. It might be a hundred million. `session.stream()` runs a query against a Postgres portal cursor and pulls results back in fixed-size batches. You hand it a window size, your `StreamingResultReceiver` gets a batch of rows at a time, and you call `fetch_more()` when you're ready for the next one. Bounded memory, on a Postgres feature that's been there forever and that the driver finally knows how to use.

```pony
session.stream(
  PreparedQuery("SELECT * FROM big_table",
    recover val Array[FieldDataTypes] end),
  100, my_receiver)

// In the receiver:
be pg_stream_batch(session: Session, rows: Rows) =>
  // process this batch
  session.fetch_more() // pull the next one

be pg_stream_complete(session: Session) =>
  // all rows delivered
  None
```

`StreamingResultReceiver` also has a `pg_stream_failed` callback for errors, and `session.close_stream()` lets you stop pulling early without draining the cursor.

## Talking back

Postgres has things to say to clients beyond just answering queries. The driver listens to all of them.

The biggest one is `LISTEN`/`NOTIFY`, Postgres's built-in pub/sub mechanism. You subscribe to a channel by running `LISTEN my_channel` as a normal query, and from that point on, anything that runs `NOTIFY my_channel, 'payload'` (from another connection, a trigger, whatever) generates a notification that gets pushed to you asynchronously. The driver delivers it through `pg_notification` on `SessionStatusNotify`:

```pony
actor MyClient is (SessionStatusNotify & ResultReceiver)
  let _env: Env

  new create(env: Env) =>
    _env = env

  be pg_session_authenticated(session: Session) =>
    session.execute(SimpleQuery("LISTEN my_channel"), this)

  be pg_notification(session: Session, notification: Notification) =>
    _env.out.print("Got: " + notification.channel
      + " -> " + notification.payload)
```

That's a database-backed message bus you can use without standing up a separate piece of infrastructure. People build a lot of useful things on top of `LISTEN`/`NOTIFY` once they realize their database can do it.

The other two callbacks are smaller and fire whenever the server wants to tell you something about a query or about itself. `pg_notice` delivers non-fatal messages like `RAISE NOTICE` output from a stored procedure or a `DROP TABLE IF EXISTS` telling you the table wasn't there. `pg_parameter_status` fires when a server runtime parameter changes, at startup or in response to a `SET` command. Neither is load-bearing for most applications, but if you're logging server-side events or adapting to server settings, they're how you get at them. All three callbacks have default no-op implementations, so you don't have to implement any of them unless you want to.

## Real types

The driver hands you Postgres values as Pony types. Numerics and booleans decode to their Pony equivalents. Temporal types decode to typed wrappers like `PgTimestamp` and `PgDate`, each with a `string()` method that formats the value the way Postgres would. `infinity` and `-infinity` come through as type-max and type-min values, so you can tell them apart from real timestamps.

Two protocol paths, two decoding behaviors. `PreparedQuery`, `NamedPreparedQuery`, streaming, and pipelining all use Postgres's binary wire format and you get the typed values described above. `SimpleQuery` uses the text format and you get strings. Fire `SELECT NOW()` through a `SimpleQuery` and you get a `String`. Fire it through a `PreparedQuery` and you get a `PgTimestamp`. The driver picks based on the query type, not the column type.

```pony
be pg_query_result(session: Session, result: Result) =>
  match result
  | let rs: ResultSet =>
    for row in rs.rows().values() do
      for field in row.fields.values() do
        match field.value
        | let s: String => // text or text-like column
          None
        | let i: I32 => // int4
          None
        | let t: PgTimestamp => // timestamp / timestamptz
          None
        | let d: PgDate => // date
          None
        | let b: Bytea => // bytea, raw bytes in b.data
          None
        | None => // SQL NULL
          None
        end
      end
    end
  end
```

One-dimensional arrays of any built-in element type decode into `PgArray` and can be sent as parameters. `int4[]`, `text[]`, `timestamp[]`: they all just work. Multi-dimensional arrays don't, yet.

For types the driver doesn't ship with, you register your own codecs. Implement the `Codec` interface, register it against the type's OID (you can find one by querying `pg_type` from `psql`), and pass the registry into the session constructor. The release notes walk through a custom codec for Postgres `point`.

Enum and composite types get shortcuts so you don't have to write the codec yourself. `CodecRegistry.with_enum_type(oid)` registers an enum so it decodes as a `String` in both binary and text formats. `CodecRegistry.with_composite_type()` decodes a composite (the kind you create with `CREATE TYPE foo AS (...)`) into a `PgComposite` you can index by position or by field name.

`Field.value` is an open `FieldData` interface so custom codecs can plug in. Any `val` class with a `string()` method qualifies as a field value. There's also a second wrapper type, `RawBytes`, for binary-format columns whose OIDs the driver doesn't recognize. (`Bytea` is for known `bytea` columns; `RawBytes` is for unknown binary OIDs that show up before you've registered a codec.) The cost of an open interface is that `match` on `field.value` isn't exhaustive in the type system's eyes, even when you've handled every type the driver itself ships. That tradeoff is deliberate: closed-union exhaustiveness on field values is only useful if you can actually enumerate every type, and the moment user codecs are in play, you can't. The codecs you control are the ones you handle.

## Errors as data

Every time the driver hands you something that could be one of several things (what kind of result you got, why the connection failed, what kind of query it was), it's a closed union. Match exhaustively on any of them and the compiler will tell you if you missed a case. That covers `Result`, `ClientQueryError`, `AuthenticationFailureReason`, `ConnectionFailureReason`, `TransactionStatus`, `SSLMode`, `Query`, and `FieldDataTypes`.

This is the same pattern the rest of Pony pushes you toward: errors are data, not exceptions, and the type system should make sure you've thought about each one.

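If you haven't leaned on exhaustive matching in Pony before, here's a small self-contained sketch of what "the compiler will tell you" looks like. The union and its members (`Refused`, `TimedOut`, `TLSFailed`) are hypothetical stand-ins for something like `ConnectionFailureReason`; the driver's actual variants are different and are listed in its documentation.

```pony
// Hypothetical closed union, standing in for one of the driver's unions.
primitive Refused
primitive TimedOut
primitive TLSFailed
type FailureReason is (Refused | TimedOut | TLSFailed)

primitive FailureText
  fun describe(reason: FailureReason): String =>
    // There is no `else` branch. Because every member of the union has a
    // case, the match is exhaustive and its type is String. Delete a case
    // and the match could also produce None, which doesn't fit the
    // declared String return type, so compilation fails.
    match reason
    | Refused => "server refused the connection"
    | TimedOut => "connection attempt timed out"
    | TLSFailed => "TLS handshake failed"
    end

actor Main
  new create(env: Env) =>
    env.out.print(FailureText.describe(TimedOut))
```

The useful part shows up later: if a new member joins the union, every `else`-free match over it stops compiling until you handle the new case.
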
## What's still missing

The driver is beta. Here's what you won't find in it yet:

- Authentication methods beyond SCRAM-SHA-256, MD5, and cleartext password (Kerberos, GSSAPI, SCM, SSPI, SCRAM-SHA-256-PLUS with channel binding, and certificate-based auth)
- Function calls via the legacy `Fastpath` protocol
- Multi-dimensional arrays (one-dimensional arrays work)
- Most startup configuration parameters beyond the basics

The README has the canonical list of what's in and what isn't. If any of these gaps are blocking your work, file an issue and we'll talk about it. PRs are welcome too. The driver goes where the Pony community pushes it.

## Try it

`ponylang/postgres` 0.3.1 is the current release. (0.3.0 had a backpressure stall in lori that 0.3.1 picked up a fix for. The 0.3.0 feature set is the one this post walks through.) Install with corral:

```bash
corral add github.com/ponylang/postgres.git --version 0.3.1
corral fetch
```

You'll also need a C SSL library installed. The driver pulls in [`ponylang/ssl`](https://github.com/ponylang/ssl) as a transitive dependency, and that needs OpenSSL or LibreSSL on your system. The [ssl installation instructions](https://github.com/ponylang/ssl?tab=readme-ov-file#installation) cover what to install where.

The [README](https://github.com/ponylang/postgres) has the full API surface. The [release notes](https://github.com/ponylang/postgres/releases/tag/0.3.0) have the details on every entry I covered here, plus the ones I didn't (equality on `Field`, `Row`, and `Rows`, sending a `Terminate` message before close, follow-up queries from receiver callbacks). If you find something broken or missing, the [issue tracker](https://github.com/ponylang/postgres/issues) is where to put it.

Pony has a Postgres driver now. Not the proof-of-concept version we'd had since 2023. The actual one. I know. Three years late, but hey, now it's yours to break.
