@@ -142,6 +142,136 @@ ATTACH 'ducklake:postgres:host=ducklake.example.com user=ducklake password=secre
142142
143143See [DuckLake documentation](https://ducklake.select/docs/stable/duckdb/usage/connecting) for more details.
144144
145+ ### Quick Start with Docker
146+
147+ The easiest way to get started with DuckLake is using the included Docker Compose setup:
148+
149+ ```bash
150+ # Start PostgreSQL (metadata) and MinIO (object storage)
151+ docker compose up -d
152+
153+ # Wait for services to be ready
154+ docker compose logs -f # Look for "Bucket ducklake created successfully"
155+
156+ # Start Duckgres with DuckLake configured
157+ ./duckgres --config duckgres.yaml
158+
159+ # Connect and start using DuckLake
160+ PGPASSWORD=postgres psql "host=localhost port=5432 user=postgres sslmode=require"
161+ ```
162+
163+ The `docker-compose.yaml` creates:
164+
165+ **PostgreSQL** (metadata catalog):
166+ - Host: `localhost`
167+ - Port: `5433` (mapped to avoid conflicts)
168+ - Database: `ducklake`
169+ - User/Password: `ducklake` / `ducklake`
170+
171+ **MinIO** (S3-compatible object storage):
172+ - S3 API: `localhost:9000`
173+ - Web Console: `http://localhost:9001`
174+ - Access Key: `minioadmin`
175+ - Secret Key: `minioadmin`
176+ - Bucket: `ducklake` (auto-created on startup)
177+
178+ The included `duckgres.yaml` is pre-configured to use both services.
179+
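As an alternative to watching the logs, readiness can be approximated by polling the published ports. The following is a minimal sketch, assuming the default ports listed above (`5433` for PostgreSQL, `9000` for the MinIO S3 API). `wait_for_port` is a hypothetical helper, not part of Duckgres, and an open port does not by itself confirm that the `ducklake` bucket has been created.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example usage (ports assumed from the Docker Compose defaults above):
#   wait_for_port("localhost", 5433)   # PostgreSQL metadata catalog
#   wait_for_port("localhost", 9000)   # MinIO S3 API
```

The log message is still the more reliable signal for bucket creation; the port check only tells you that the containers are accepting connections.
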
180+ ### Object Storage Configuration
181+
182+ DuckLake can store data files in S3-compatible object storage (AWS S3, MinIO, etc.). Two credential providers are supported:
183+
184+ #### Option 1: Explicit Credentials (MinIO / Access Keys)
185+
186+ ```yaml
187+ ducklake:
188+   metadata_store: "postgres:host=localhost port=5433 user=ducklake password=ducklake dbname=ducklake"
189+   object_store: "s3://ducklake/data/"
190+   s3_provider: "config"            # Explicit credentials (default if s3_access_key is set)
191+   s3_endpoint: "localhost:9000"    # MinIO or custom S3 endpoint
192+   s3_access_key: "minioadmin"
193+   s3_secret_key: "minioadmin"
194+   s3_region: "us-east-1"
195+   s3_use_ssl: false
196+   s3_url_style: "path"             # "path" for MinIO, "vhost" for AWS S3
197+ ```
198+
199+ #### Option 2: AWS Credential Chain (IAM Roles / Environment)
200+
201+ For AWS S3 with IAM roles, environment variables, or config files:
202+
203+ ```yaml
204+ ducklake:
205+   metadata_store: "postgres:host=localhost user=ducklake password=ducklake dbname=ducklake"
206+   object_store: "s3://my-bucket/ducklake/"
207+   s3_provider: "credential_chain"  # AWS SDK credential chain
208+   s3_chain: "env;config"           # Which sources to check (optional)
209+   s3_profile: "my-profile"         # AWS profile name (optional)
210+   s3_region: "us-west-2"           # Override auto-detected region (optional)
211+ ```
212+
213+ The credential chain checks these sources in order:
214+ - `env` - Environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`)
215+ - `config` - AWS config files (`~/.aws/credentials`, `~/.aws/config`)
216+ - `sts` - AWS STS assume role
217+ - `sso` - AWS Single Sign-On
218+ - `instance` - EC2 instance metadata (IAM roles)
219+ - `process` - External process credentials
220+
221+ See [DuckDB S3 API docs](https://duckdb.org/docs/stable/core_extensions/httpfs/s3api#credential_chain-provider) for details.
222+
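To make the two providers more concrete, the sketch below builds the DuckDB `CREATE SECRET` statement that roughly corresponds to each YAML configuration above. This is an assumption about the mapping, not a description of what Duckgres does internally: the helper names `sql_literal` and `build_s3_secret_sql` are hypothetical, and only the DuckDB secret parameters (`KEY_ID`, `SECRET`, `REGION`, `ENDPOINT`, `URL_STYLE`, `USE_SSL`, `CHAIN`, `PROFILE`) come from the DuckDB documentation linked above.

```python
# Assumed mapping from ducklake YAML keys to DuckDB S3 secret parameters.
PARAMS = {
    "config": [
        ("s3_access_key", "KEY_ID"),
        ("s3_secret_key", "SECRET"),
        ("s3_region", "REGION"),
        ("s3_endpoint", "ENDPOINT"),
        ("s3_url_style", "URL_STYLE"),
        ("s3_use_ssl", "USE_SSL"),
    ],
    "credential_chain": [
        ("s3_chain", "CHAIN"),
        ("s3_profile", "PROFILE"),
        ("s3_region", "REGION"),
    ],
}


def sql_literal(value) -> str:
    """Render a Python value as a SQL literal (booleans bare, strings quoted)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return "'" + str(value).replace("'", "''") + "'"


def build_s3_secret_sql(cfg: dict) -> str:
    """Build a CREATE SECRET statement from a dict of ducklake settings."""
    provider = cfg.get("s3_provider")
    # Per the YAML comment above, "config" is the default when an access key is set.
    if provider is None and cfg.get("s3_access_key"):
        provider = "config"
    if provider not in PARAMS:
        raise ValueError(f"unsupported or missing s3_provider: {provider!r}")
    parts = ["TYPE s3", f"PROVIDER {provider}"]
    for key, param in PARAMS[provider]:
        if cfg.get(key) is not None:  # keep explicit False (e.g. s3_use_ssl)
            parts.append(f"{param} {sql_literal(cfg[key])}")
    return "CREATE SECRET (" + ", ".join(parts) + ")"
```

Printing the result for each of the two YAML examples shows which settings apply to which provider: endpoint, URL style, and SSL only matter for explicit credentials, while chain and profile only matter for the credential chain.
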
223+ #### Environment Variables
224+
225+ All S3 settings can be configured via environment variables:
226+ - `DUCKGRES_DUCKLAKE_OBJECT_STORE` - S3 path (e.g., `s3://bucket/path/`)
227+ - `DUCKGRES_DUCKLAKE_S3_PROVIDER` - `config` or `credential_chain`
228+ - `DUCKGRES_DUCKLAKE_S3_ENDPOINT` - S3 endpoint (for MinIO)
229+ - `DUCKGRES_DUCKLAKE_S3_ACCESS_KEY` - Access key ID
230+ - `DUCKGRES_DUCKLAKE_S3_SECRET_KEY` - Secret access key
231+ - `DUCKGRES_DUCKLAKE_S3_REGION` - AWS region
232+ - `DUCKGRES_DUCKLAKE_S3_USE_SSL` - Use HTTPS (true/false)
233+ - `DUCKGRES_DUCKLAKE_S3_URL_STYLE` - `path` or `vhost`
234+ - `DUCKGRES_DUCKLAKE_S3_CHAIN` - Credential chain sources
235+ - `DUCKGRES_DUCKLAKE_S3_PROFILE` - AWS profile name
236+
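The variable names above follow a regular pattern: the `DUCKGRES_DUCKLAKE_` prefix followed by the upper-cased YAML key. The sketch below illustrates that naming convention by layering environment values over a settings dict. Two points are assumptions rather than documented behavior: that environment variables take precedence over the YAML file, and that only the literal string `true` enables SSL. `apply_env_overrides` is a hypothetical helper.

```python
import os

PREFIX = "DUCKGRES_DUCKLAKE_"

# YAML keys that have a matching environment variable in the list above.
KNOWN_KEYS = {
    "object_store", "s3_provider", "s3_endpoint", "s3_access_key",
    "s3_secret_key", "s3_region", "s3_use_ssl", "s3_url_style",
    "s3_chain", "s3_profile",
}


def apply_env_overrides(cfg: dict, environ=None) -> dict:
    """Return a copy of cfg with DUCKGRES_DUCKLAKE_* values layered on top."""
    environ = os.environ if environ is None else environ
    merged = dict(cfg)
    for name, value in environ.items():
        if not name.startswith(PREFIX):
            continue
        key = name[len(PREFIX):].lower()
        if key not in KNOWN_KEYS:
            continue  # ignore variables this sketch does not know about
        if key == "s3_use_ssl":
            merged[key] = value.strip().lower() == "true"
        else:
            merged[key] = value
    return merged
```

For example, with `DUCKGRES_DUCKLAKE_S3_REGION=eu-west-1` set, a YAML value of `s3_region: "us-east-1"` would be replaced under this sketch's precedence assumption.
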
237+ ### Seeding Sample Data
238+
239+ A seed script is provided to populate DuckLake with sample e-commerce and analytics data:
240+
241+ ```bash
242+ # Seed with default connection (localhost:5432, postgres/postgres)
243+ ./scripts/seed_ducklake.sh
244+
245+ # Seed with custom connection
246+ ./scripts/seed_ducklake.sh --host 127.0.0.1 --port 5432 --user postgres --password postgres
247+
248+ # Clean existing tables and reseed
249+ ./scripts/seed_ducklake.sh --clean
250+ ```
251+
252+ The script creates the following tables:
253+ - `categories` - Product categories (5 rows)
254+ - `products` - E-commerce products (15 rows)
255+ - `customers` - Customer records (10 rows)
256+ - `orders` - Order headers (12 rows)
257+ - `order_items` - Order line items (20 rows)
258+ - `events` - Analytics events with JSON properties (15 rows)
259+ - `page_views` - Web analytics data (15 rows)
260+
261+ Example queries after seeding:
262+
263+ ```sql
264+ -- Top products by price
265+ SELECT name, price FROM products ORDER BY price DESC LIMIT 5;
266+
267+ -- Orders with customer info
268+ SELECT o.id, c.first_name, c.last_name, o.total_amount, o.status
269+ FROM orders o JOIN customers c ON o.customer_id = c.id;
270+
271+ -- Event funnel analysis
272+ SELECT event_name, COUNT(*) FROM events GROUP BY event_name ORDER BY COUNT(*) DESC;
273+ ```
274+
145275## COPY Protocol
146276
147277Duckgres supports PostgreSQL's COPY protocol for efficient bulk data import and export: