DuckDB v1.5.4
First off, apologies if this has been answered or documented elsewhere. After days of looking and not finding anything I feel justified opening an issue to ask.
Workflow
I've got a workflow I'm exploring where I have client that is doing an expensive diff between current vs past observation of local state. The client is connected to a server on both a HTTP control plane port and a DuckDB Quack port. The full diff produces rows for insert, update and delete that feed into a table that tracks the last successful observation. And a CDC style changelog table that is a superset of the current observation table goes into a DuckLake catalog. So you get point-in-time tables and history tables.
Unfortunately Quack doesn't support update or delete. Details in the example below. Looking at the documentation and announcements this isn't documented anywhere as a limitation. There was one mention in the keynote that there were v2.0 features that would improve the ergonomics around remote queries e.g. FROM remote_db.query('SELECT 42'); maybe that was in reference to this because this was the only way I could issue an update/delete to the remote.
Workaround
While the remote_db.query macro can work, I'm replacing the whole row rather than crafting a bespoke query per change detected.
My workaround was to backchannel the updates and deletes over my HTTP control plane via a jsonl payload. For updates, I write the jsonl to a tempfile and load it into a temp table to do a joined update in a single operation. Delete is similar just with only the identifier column.
As workarounds go, I'm pretty happy with it. It does suck that I can't perform the entire sync in a single transaction.
Preferred Solution
The ideal interface would to perform the joined update local table to remote table. I saw the recent #161 and I got excited that this might enable something like this.
Repro
import duckdb
import tempfile
import os
# --- server ---
tmp = tempfile.mkdtemp()
server_path = os.path.join(tmp, "server.duckdb")
server = duckdb.connect(server_path)
server.execute("CREATE TABLE foo (id INTEGER, name VARCHAR)")
server.execute("INSERT INTO foo VALUES (1, 'alpha'), (2, 'beta')")
server.execute("CALL quack_serve('quack:localhost', token = 'super_secret')")
# --- client (separate in-memory connection) ---
client = duckdb.connect()
client.execute("CREATE SECRET (TYPE quack, TOKEN 'super_secret')")
client.execute("ATTACH 'quack:localhost' AS srv")
for label, sql in [
("INSERT", "INSERT INTO srv.foo VALUES (3, 'gamma')"),
("UPDATE", "UPDATE srv.foo SET name = 'ALPHA' WHERE id = 1"),
("DELETE", "DELETE FROM srv.foo WHERE id = 2"),
]:
try:
client.execute(sql)
print(f"{label}: OK")
except Exception as e:
print(f"{label}: FAILED — {e}")
print("\nfinal server state:")
print(server.execute("SELECT * FROM foo ORDER BY id").fetchall())
server.close()
client.close()
# Expected output on current DuckDB v1.5.4:
# INSERT: OK
# UPDATE: FAILED — Binder Error: Can only update base table
# DELETE: FAILED — Binder Error: Can only delete from base table
# final server state:
# [(1, 'alpha'), (2, 'beta'), (3, 'gamma')]
DuckDB v1.5.4
First off, apologies if this has been answered or documented elsewhere. After days of looking and not finding anything I feel justified opening an issue to ask.
Workflow
I've got a workflow I'm exploring where I have client that is doing an expensive diff between current vs past observation of local state. The client is connected to a server on both a HTTP control plane port and a DuckDB Quack port. The full diff produces rows for insert, update and delete that feed into a table that tracks the last successful observation. And a CDC style changelog table that is a superset of the current observation table goes into a DuckLake catalog. So you get point-in-time tables and history tables.
Unfortunately Quack doesn't support update or delete. Details in the example below. Looking at the documentation and announcements this isn't documented anywhere as a limitation. There was one mention in the keynote that there were v2.0 features that would improve the ergonomics around remote queries e.g.
FROM remote_db.query('SELECT 42');maybe that was in reference to this because this was the only way I could issue an update/delete to the remote.Workaround
While the
remote_db.querymacro can work, I'm replacing the whole row rather than crafting a bespoke query per change detected.My workaround was to backchannel the updates and deletes over my HTTP control plane via a
jsonlpayload. For updates, I write thejsonlto a tempfile and load it into a temp table to do a joined update in a single operation. Delete is similar just with only the identifier column.As workarounds go, I'm pretty happy with it. It does suck that I can't perform the entire sync in a single transaction.
Preferred Solution
The ideal interface would to perform the joined update local table to remote table. I saw the recent #161 and I got excited that this might enable something like this.
Repro