Commit d06cab3

fuziontech and claude authored
Add control plane / data plane architecture (#146)
* Add control plane / data plane architecture for zero-downtime deployments

  Implement a multi-process architecture that splits duckgres into a control plane (connection management, routing) and data plane (pool of long-lived DuckDB worker processes). This enables zero-downtime deployments, cross-session DuckDB cache reuse, and rolling worker updates.

  Key components:
  - gRPC-based worker management (Configure, Health, Drain, Shutdown)
  - Unix socket FD passing via SCM_RIGHTS for TCP connection handoff
  - Least-connections load balancing across worker pool
  - Graceful control plane handover via listener FD transfer
  - Rolling worker updates triggered by SIGUSR2
  - Health check loop with automatic worker restart

  New CLI modes: --mode control-plane | worker | standalone (default)

  Standalone mode (existing behavior) is completely unchanged.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix done channel leak and per-query cancellation in control plane

  ConnectExistingWorker never closed the done channel, causing ShutdownAll and RollingUpdate to always hit their timeout for handed-over workers. Add a health-check monitoring goroutine.

  CancelQuery killed the entire session instead of just the running query. Use the per-session minServer.CancelQuery() to cancel only the in-flight query, matching standalone mode behavior.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix lint errors: unchecked Close() returns and unused code

  - Add _ = prefix to all unchecked .Close() return values (errcheck)
  - Remove unused nextWorker field from WorkerPool (unused)
  - Remove unused activeQueriesMu field from Worker (unused)
  - Remove unused loadExtensions/attachDuckLake method wrappers (unused)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
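
The commit message names SCM_RIGHTS FD passing as the mechanism for handing a TCP connection to a worker. The following is a minimal sketch of that technique using only Go's standard `syscall` package. It is not the code in `fdpass/fdpass.go`: the names `sendFD`, `recvFD`, and `roundTrip` are hypothetical, and a pipe stands in for the client socket so the example runs without a network (Unix-like systems only).

```go
package main

import (
	"fmt"
	"io"
	"os"
	"syscall"
)

// sendFD passes one open file descriptor to the peer of a Unix socket.
// The descriptor travels in an SCM_RIGHTS control message. The single
// payload byte is there because Linux stream sockets need at least one
// byte of real data to carry ancillary data.
func sendFD(sock, fd int) error {
	return syscall.Sendmsg(sock, []byte{0}, syscall.UnixRights(fd), nil, 0)
}

// recvFD receives one descriptor sent by sendFD and returns the new,
// process-local descriptor number.
func recvFD(sock int) (int, error) {
	payload := make([]byte, 1)
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one 32-bit descriptor
	_, oobn, _, _, err := syscall.Recvmsg(sock, payload, oob, 0)
	if err != nil {
		return -1, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil {
		return -1, err
	}
	if len(msgs) != 1 {
		return -1, fmt.Errorf("expected 1 control message, got %d", len(msgs))
	}
	fds, err := syscall.ParseUnixRights(&msgs[0])
	if err != nil {
		return -1, err
	}
	if len(fds) != 1 {
		return -1, fmt.Errorf("expected 1 descriptor, got %d", len(fds))
	}
	return fds[0], nil
}

// roundTrip stands in for the control plane to worker handoff (hypothetical
// helper): it sends the read end of a pipe across a socketpair and reads
// msg back through the received copy of the descriptor.
func roundTrip(msg string) (string, error) {
	pair, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return "", err
	}
	defer syscall.Close(pair[0])
	defer syscall.Close(pair[1])

	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	if _, err := w.WriteString(msg); err != nil {
		return "", err
	}
	_ = w.Close()

	if err := sendFD(pair[0], int(r.Fd())); err != nil {
		return "", err
	}
	_ = r.Close() // the sender's copy is no longer needed after the send

	fd, err := recvFD(pair[1])
	if err != nil {
		return "", err
	}
	received := os.NewFile(uintptr(fd), "received-pipe")
	defer received.Close()
	data, err := io.ReadAll(received)
	return string(data), err
}

func main() {
	got, err := roundTrip("hello from the control plane")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

The sender can close its copy once `Sendmsg` returns because the kernel holds its own reference to the open file while the message is in flight. In this sketch, that is what would let a control plane hand off a connection and stop tracking the socket itself.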
1 parent 9a25bf9 commit d06cab3

19 files changed

Lines changed: 3718 additions & 83 deletions

CLAUDE.md

Lines changed: 33 additions & 4 deletions
@@ -8,24 +8,39 @@ Duckgres is a PostgreSQL wire protocol server backed by DuckDB. It allows any Po

 ## Architecture

+Duckgres supports three run modes: `standalone` (default), `control-plane`, and `worker`.
+
 ```
-PostgreSQL Client → TLS → Duckgres Server → DuckDB (per-user database)
+Standalone: PostgreSQL Client → TLS → Duckgres Server → DuckDB (per-user database)
+Control Plane: PostgreSQL Client → TLS → Control Plane → (FD pass) → Worker → DuckDB
 ```

 ### Key Components

-- **main.go**: Entry point, configuration loading (CLI flags, env vars, YAML)
-- **server/server.go**: Server struct, connection handling, graceful shutdown
+- **main.go**: Entry point, configuration loading (CLI flags, env vars, YAML), mode routing
+- **server/server.go**: Server struct, connection handling, graceful shutdown, `CreateDBConnection()` (standalone function)
 - **server/conn.go**: Client connection handling, query execution, COPY protocol
 - **server/protocol.go**: PostgreSQL wire protocol message encoding/decoding
+- **server/exports.go**: Exported wrappers for protocol functions (used by control plane workers)
 - **server/catalog.go**: pg_catalog compatibility views and macros initialization
 - **server/types.go**: Type OID mapping between DuckDB and PostgreSQL
 - **server/ratelimit.go**: Rate limiting for brute-force protection
 - **server/certs.go**: Auto-generation of self-signed TLS certificates
+- **server/parent.go**: Child process spawning for ProcessIsolation mode
+- **server/worker.go**: Per-connection child worker (ProcessIsolation mode)
 - **transpiler/**: AST-based SQL transpiler (PostgreSQL → DuckDB)
   - `transpiler.go`: Main API, transform pipeline orchestration
   - `config.go`: Configuration types (DuckLakeMode, ConvertPlaceholders)
   - `transform/`: Individual transform implementations
+- **controlplane/**: Multi-process control plane architecture
+  - `proto/worker.proto`: gRPC service definition (Configure, AcceptConnection, CancelQuery, Drain, Health, Shutdown)
+  - `proto/*.pb.go`: Generated gRPC/protobuf code
+  - `fdpass/fdpass.go`: Unix socket FD passing via SCM_RIGHTS
+  - `worker.go`: Long-lived worker process (gRPC server, FD receiver, session handler)
+  - `dbpool.go`: Per-session DuckDB database pool management
+  - `control.go`: Control plane main loop (TCP listener, rate limiting, connection routing)
+  - `pool.go`: Worker pool management (spawn, health check, least-connections routing, rolling update)
+  - `handover.go`: Graceful deployment (listener FD transfer between control planes)

 ## PostgreSQL Wire Protocol

@@ -74,10 +89,24 @@ Supports bulk data transfer:
 - **COPY FROM STDIN**: Receives data from client, inserts row by row
 - Supports CSV format with HEADER, DELIMITER, and NULL options

+## Run Modes
+
+- **standalone** (default): Single process, handles everything. Current behavior unchanged.
+- **control-plane**: Multi-process. Accepts TCP connections, passes FDs to worker pool via Unix sockets.
+- **worker**: Long-lived child process spawned by control plane. Handles TLS, auth, query execution via gRPC + FD passing.
+
+Key CLI flags for control plane mode:
+- `--mode control-plane|worker|standalone`
+- `--worker-count N` (default 4)
+- `--socket-dir /path` (Unix sockets for gRPC + FD passing)
+- `--handover-socket /path` (graceful deployment between control planes)
+- `--grpc-socket /path` (worker, set by control plane at spawn)
+- `--fd-socket /path` (worker, set by control plane at spawn)
+
 ## Configuration

 Three-tier configuration (highest to lowest priority):
-1. CLI flags (`--port`, `--config`, etc.)
+1. CLI flags (`--port`, `--config`, `--mode`, etc.)
 2. Environment variables (`DUCKGRES_PORT`, etc.)
 3. YAML config file
 4. Built-in defaults
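
The CLAUDE.md changes above attribute least-connections routing to `pool.go`. As an illustration of that selection rule only, here is a small sketch. The `worker` type, its fields, and `pickLeastConnections` are assumptions made for this example and are not taken from the commit; skipping draining workers is likewise an assumption.

```go
package main

import "fmt"

// worker is a hypothetical stand-in for a pool entry; the real struct in
// pool.go is not shown in this diff.
type worker struct {
	ID       int
	Active   int  // number of sessions currently routed to this worker
	Draining bool // assumed: a draining worker takes no new sessions
}

// pickLeastConnections returns the non-draining worker with the fewest
// active sessions, or nil when no worker can accept a connection.
// Ties go to the worker that appears first in the slice.
func pickLeastConnections(pool []*worker) *worker {
	var best *worker
	for _, w := range pool {
		if w.Draining {
			continue
		}
		if best == nil || w.Active < best.Active {
			best = w
		}
	}
	return best
}

func main() {
	pool := []*worker{
		{ID: 1, Active: 3},
		{ID: 2, Active: 1},
		{ID: 3, Active: 0, Draining: true},
	}
	if w := pickLeastConnections(pool); w != nil {
		w.Active++ // the caller records the new session on the chosen worker
		fmt.Printf("routed to worker %d (now %d active)\n", w.ID, w.Active)
	}
}
```

A linear scan is enough at the default `--worker-count` of 4; a heap would only matter for much larger pools.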

README.md

Lines changed: 80 additions & 9 deletions
@@ -26,6 +26,8 @@ A PostgreSQL wire protocol compatible server backed by DuckDB. Connect with any
 - [Rate Limiting](#rate-limiting)
 - [Usage Examples](#usage-examples)
 - [Architecture](#architecture)
+  - [Standalone Mode](#standalone-mode)
+  - [Control Plane Mode](#control-plane-mode)
 - [Two-Tier Query Processing](#two-tier-query-processing)
 - [Supported Features](#supported-features)
 - [Limitations](#limitations)
@@ -45,6 +47,7 @@ A PostgreSQL wire protocol compatible server backed by DuckDB. Connect with any
 - **DuckLake Integration**: Auto-attach DuckLake catalogs for lakehouse workflows
 - **Rate Limiting**: Built-in protection against brute-force attacks
 - **Graceful Shutdown**: Waits for in-flight queries before exiting
+- **Control Plane Mode**: Multi-process architecture with long-lived workers, zero-downtime deployments, and rolling updates
 - **Flexible Configuration**: YAML config files, environment variables, and CLI flags
 - **Prometheus Metrics**: Built-in metrics endpoint for monitoring

@@ -177,12 +180,16 @@ export POSTHOG_HOST=eu.i.posthog.com
 ./duckgres --help

 Options:
-  -config string      Path to YAML config file
-  -host string        Host to bind to
-  -port int           Port to listen on
-  -data-dir string    Directory for DuckDB files
-  -cert string        TLS certificate file
-  -key string         TLS private key file
+  -config string           Path to YAML config file
+  -host string             Host to bind to
+  -port int                Port to listen on
+  -data-dir string         Directory for DuckDB files
+  -cert string             TLS certificate file
+  -key string              TLS private key file
+  -mode string             Run mode: standalone (default), control-plane, or worker
+  -worker-count int        Number of worker processes (control-plane mode, default 4)
+  -socket-dir string       Unix socket directory (control-plane mode)
+  -handover-socket string  Handover socket for graceful deployment (control-plane mode)
 ```

 ## DuckDB Extensions
@@ -428,6 +435,12 @@ GROUP BY name;

 ## Architecture

+Duckgres supports two run modes: **standalone** (single process, default) and **control-plane** (multi-process with worker pool).
+
+### Standalone Mode
+
+The default mode runs everything in a single process:
+
 ```
 ┌─────────────────┐
 │ PostgreSQL │
@@ -449,6 +462,64 @@ GROUP BY name;
 └─────────────────┘
 ```

+### Control Plane Mode
+
+For production deployments, control-plane mode splits the server into a **control plane** (connection management, routing) and a pool of long-lived **worker processes** (query execution). This enables zero-downtime deployments and cross-session DuckDB cache reuse.
+
+```
+                  CONTROL PLANE (duckgres --mode control-plane)
+                  ┌──────────────────────────────────────────┐
+PG Client ──TLS──>│  TCP Listener                            │
+                  │  Rate Limiting                           │
+                  │  Connection Router (least-connections)   │
+                  │  │ FD pass via Unix socket (SCM_RIGHTS)  │
+                  │  ▼                                       │
+                  │  gRPC Client ─────────────────────────+  │
+                  └──────────────────────────────────────────┘
+
+                                               gRPC (UDS)
+
+                  WORKER POOL                             ▼
+                  ┌──────────────────────────────────────────┐
+                  │  Worker 1 (duckgres --mode worker)       │
+                  │    gRPC Server (Configure, Health, Drain)│
+                  │    FD Receiver (Unix socket)             │
+                  │    Shared DuckDB instance (long-lived)   │
+                  │      ├── Session 1 (goroutine)           │
+                  │      ├── Session 2 (goroutine)           │
+                  │      └── Session N ...                   │
+                  ├──────────────────────────────────────────┤
+                  │  Worker 2 ...                            │
+                  └──────────────────────────────────────────┘
+```
+
+Start in control-plane mode:
+
+```bash
+# Start with 4 workers (default)
+./duckgres --mode control-plane --port 5432 --worker-count 4
+
+# Connect with psql (identical to standalone mode)
+PGPASSWORD=postgres psql "host=localhost port=5432 user=postgres sslmode=require"
+```
+
+**Zero-downtime deployment** using the handover protocol:
+
+```bash
+# Start the first control plane with a handover socket
+./duckgres --mode control-plane --port 5432 --handover-socket /var/run/duckgres/handover.sock
+
+# Deploy a new version - it takes over the listener and workers without dropping connections
+./duckgres-v2 --mode control-plane --port 5432 --handover-socket /var/run/duckgres/handover.sock
+```
+
+**Rolling worker updates** via signal:
+
+```bash
+# Replace workers one at a time (drains sessions before replacing each worker)
+kill -USR2 <control-plane-pid>
+```
+
 ## Two-Tier Query Processing

 Duckgres uses a two-tier approach to handle both PostgreSQL and DuckDB-specific SQL syntax transparently:
@@ -509,9 +580,9 @@ The following DuckDB features work transparently through the fallback mechanism:

 ## Limitations

-- **Single Process**: Each user's database is opened in the same process
-- **No Replication**: Single-node only
-- **Limited System Catalog**: Some `pg_*` system tables are not available
+- **Single Node**: No built-in replication or clustering
+- **Limited System Catalog**: Some `pg_*` system tables are stubs (return empty)
+- **Type OID Mapping**: Incomplete (some types show as "unknown")

 ## Dependencies

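
The README changes describe rolling worker updates triggered by `kill -USR2`, replacing workers one at a time and draining each before it is replaced. The sketch below shows one way such a trigger could be wired up with `os/signal`. It is an illustration under assumptions, not the commit's `RollingUpdate`: the `worker` type, the `Generation` counter, and the `drain` callback are all hypothetical, and the program signals itself so it can run on its own.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// worker is a hypothetical pool entry; Generation counts replacements.
type worker struct {
	ID         int
	Generation int
}

// rollingUpdate replaces workers strictly one at a time. drain is expected
// to block until the old worker's sessions have finished, so at most one
// worker is out of rotation at any moment.
func rollingUpdate(pool []worker, drain func(worker)) {
	for i, old := range pool {
		drain(old)
		pool[i] = worker{ID: old.ID, Generation: old.Generation + 1}
	}
}

func main() {
	pool := []worker{{ID: 1}, {ID: 2}, {ID: 3}}

	// Subscribe before the signal can arrive; the buffer of 1 keeps a
	// signal that lands while the receiver is busy.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGUSR2)
	defer signal.Stop(sigs)

	// For the demo, signal ourselves instead of running kill -USR2 <pid>.
	if err := syscall.Kill(os.Getpid(), syscall.SIGUSR2); err != nil {
		fmt.Println("kill failed:", err)
		return
	}

	select {
	case <-sigs:
		rollingUpdate(pool, func(w worker) {
			fmt.Printf("draining worker %d\n", w.ID)
		})
	case <-time.After(2 * time.Second):
		fmt.Println("no signal received")
		return
	}
	fmt.Println("pool after update:", pool)
}
```

In a real control plane the `drain` callback would presumably call the worker's gRPC `Drain` method and spawn the replacement process. Both steps are reduced to a print statement and a counter here.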