Commit 10db8da

Merge pull request #1 from SSheppDev/feat/multi-org-schemas
prepare 1.1.0 multi-org per-schema release
2 parents 00160fc + 402d966 commit 10db8da

36 files changed

Lines changed: 1975 additions & 852 deletions

CLAUDE.md

Lines changed: 12 additions & 7 deletions
@@ -105,18 +105,22 @@ sf-db/
 - All database queries go through the pg pool — never create ad-hoc connections
 
 ### Postgres
-- Synced Salesforce data → `salesforce` schema
 - Internal app tables → `sfdb` schema
+- Synced Salesforce data → one schema per registered org named `org_<lowercased orgid>` (e.g. `org_00d5g000001abcdeaa`)
+- All `sfdb.*` per-object/per-field tables (`sync_config`, `field_config`, `field_metadata`, `sync_log`, `sync_lock`) are keyed by `(org_id, ...)` with `ON DELETE CASCADE` from `sfdb.orgs`
+- The active UI/sync context is stored in `sfdb.active_org` (single row); the API resolves it from `X-Org-Id` request header first, falling back to that pointer
 - Every synced table must have: `id`, `sf_created_at`, `sf_updated_at`, `sf_deleted_at`, `synced_at`
 - Field names are lowercase snake_case versions of SF API names
-- DDL is always idempotent (`IF NOT EXISTS` / `IF EXISTS`)
+- DDL is always idempotent (`IF NOT EXISTS` / `IF EXISTS`); identifiers are always quoted (objects like `Order` / `User` collide with PG reserved words)
 
 ### Sync engine
-- Always acquire `sfdb.sync_lock` before running any sync
-- Always release the lock in a `finally` block — never leave it held on error
+- Every sync entry point takes `orgId` as its primary key; alias is only used to look up an `~/.sfdx` token via `sfdb.orgs`
+- `sfdb.sync_lock` is per-org (one row per registered org). Acquire before any sync; always release in a `finally` block
+- Different orgs sync in parallel; one sync per org is serialized via that org's lock
 - If `last_delta_sync` is NULL → initial full load (no SystemModstamp WHERE clause)
 - Stale lock threshold: 30 minutes
 - Log purge runs at the start of every sync (delete rows older than `LOG_RETENTION_DAYS`)
+- The cron scheduler runs as one process with two ticks (delta per minute, full daily 02:00) that iterate every registered org
 
 ### API
 - All routes under `/api/` prefix
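
The org-resolution rule in the Postgres list above (request header first, `sfdb.active_org` as fallback) could look roughly like this. It is a sketch, not the project's code: `HeaderMap` and `resolveOrgId` are hypothetical names, and the caller is assumed to have already read the single `sfdb.active_org` row.

```typescript
// Hypothetical shape for incoming request headers. Node's HTTP server exposes
// header names in lowercase, so the lookup below uses "x-org-id".
type HeaderMap = Record<string, string | string[] | undefined>;

// Returns the org id from the X-Org-Id header when it is present and
// non-blank; otherwise falls back to the stored active-org pointer
// (null if no org is active).
function resolveOrgId(
  headers: HeaderMap,
  activeOrgId: string | null,
): string | null {
  const raw = headers["x-org-id"];
  const fromHeader = Array.isArray(raw) ? raw[0] : raw;
  const trimmed = fromHeader?.trim();
  return trimmed ? trimmed : activeOrgId;
}
```

Keeping the fallback as an explicit argument keeps the function pure, so the header-versus-pointer precedence can be checked without a database.
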
@@ -151,9 +155,10 @@ All runtime config (active org alias, sync intervals, enabled objects/fields) li
 
 ## Key Design Decisions (do not revisit without good reason)
 
-- **sf CLI binary is NOT in the Docker image.** Auth tokens are read directly from the `~/.sf/` JSON files mounted into the container. No `sf org display` command.
+- **sf CLI binary is NOT in the Docker image.** Auth tokens are read directly from the `~/.sfdx/` JSON files mounted into the container. No `sf org display` command.
 - **The API is not a data API.** It serves the UI and orchestrates syncs only. External tools connect directly to Postgres.
 - **Deletions are soft.** `sf_deleted_at` is set — records are never hard-deleted from the local DB.
 - **Bulk API 2.0 by default.** REST query fallback only for objects under 2,000 records.
-- **Config in DB, not `.env`.** `.env` is infrastructure only. Org alias, object selection, field selection, and schedule config all live in `sfdb.app_config` / `sfdb.sync_config` / `sfdb.field_config`.
-- **One active org at a time.** Multi-org simultaneous sync is out of scope for v1.
+- **Config in DB, not `.env`.** `.env` is infrastructure only. Org registry, object selection, field selection, and schedule config all live in `sfdb.orgs` / `sfdb.sync_config` / `sfdb.field_config` / `sfdb.app_config`.
+- **Multi-org by schema.** Every registered org gets its own `org_<orgid>` schema. Removing an org drops the schema and cascades through `sfdb.*` via the FKs on `sfdb.orgs(org_id)`.
+- **Schema name is derived from the immutable Salesforce org id**, not the user-editable alias — aliases can be renamed without affecting where the data lives.
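
To make the schema-naming and quoting rules concrete, here is a minimal sketch. The helper names (`schemaNameForOrg`, `quoteIdent`, `qualifiedTable`, `createSchemaSql`) are hypothetical and do not come from the commit. The 18-character check follows the README's description of the org id, and mapping an object name to a table name by lowercasing is an assumption that only holds for simple names such as `Account` or `Order`.

```typescript
// Derive the per-org schema name from the immutable Salesforce org id.
function schemaNameForOrg(orgId: string): string {
  if (!/^[A-Za-z0-9]{18}$/.test(orgId)) {
    throw new Error(`expected an 18-character org id, got: ${orgId}`);
  }
  return `org_${orgId.toLowerCase()}`;
}

// Quote a Postgres identifier, doubling any embedded double quotes, so that
// names like "order" or "user" do not collide with reserved words.
function quoteIdent(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Fully qualified, quoted table reference for one synced object.
// Assumption: the table name is the lowercased object API name.
function qualifiedTable(orgId: string, objectApiName: string): string {
  const schema = quoteIdent(schemaNameForOrg(orgId));
  const table = quoteIdent(objectApiName.toLowerCase());
  return `${schema}.${table}`;
}

// Idempotent DDL for the org's schema.
function createSchemaSql(orgId: string): string {
  return `CREATE SCHEMA IF NOT EXISTS ${quoteIdent(schemaNameForOrg(orgId))};`;
}
```

Because the name depends only on the org id, renaming an alias never changes the output of `schemaNameForOrg`.
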

README.md

Lines changed: 16 additions & 11 deletions
@@ -31,21 +31,26 @@ A self-hosted Salesforce-to-PostgreSQL sync pipeline. Run it with `docker compos
 # 1. Authenticate a Salesforce org (skip if already done)
 sf org login web --alias my-org
 
-# 2. Configure environment
+# 2. Export decrypted Salesforce tokens for Docker to use
+npm run export-tokens
+
+# 3. Configure environment
 cp .env.example .env
 # Edit .env — set POSTGRES_PASSWORD at minimum
 
-# 3. Start
+# 4. Start
 docker compose up -d
 
-# 4. Open the UI
+# 5. Open the UI
 open http://localhost:7743
 ```
 
 First start takes ~30 seconds while Postgres initializes and the API container builds.
 
 The onboarding screen will detect your authenticated orgs and ask you to pick one. After that, go to the Objects page and enable the Salesforce objects you want to sync.
 
+`npm run export-tokens` writes plaintext access tokens to `data/tokens.json` so the Docker container can authenticate to Salesforce. This file is local-only secret material, is git-ignored, and should never be committed or shared.
+
 ## Connect a BI tool or SQL client
 
 Once data is syncing, connect any Postgres-compatible tool directly:
@@ -54,12 +59,12 @@ Once data is syncing, connect any Postgres-compatible tool directly:
 |----------|-----------------------|
 | Host | `localhost` |
 | Port | `7745` |
-| Database | `sfdb` |
-| Schema | `salesforce` |
-| User | `sfdb` |
+| Database | `sfdb` |
+| Schema | `org_<orgid>` |
+| User | `sfdb` |
 | Password | *(your `.env` value)* |
 
-The Settings page in the UI shows a copyable connection string.
+Each registered Salesforce org gets its own schema named `org_<lowercased 18-char Salesforce org id>`. The Settings page in the UI shows the schema name for every registered org and a copyable connection string.
 
 A read-only role is also available — set `READONLY_PASSWORD` in `.env` and connect as user `sfdb_readonly`.
 
@@ -103,11 +108,11 @@ Queries `SELECT Id FROM <Object>` for the full live ID set, diffs against local
 
 ### Concurrency
 
-Only one sync runs at a time. A single-row lock table (`sfdb.sync_lock`) prevents overlap. Stale locks (> 30 min) are automatically reclaimed on startup.
+Sync is serialized per org via `sfdb.sync_lock`, with one lock row per registered org. Different orgs can sync in parallel; overlapping syncs for the same org are blocked. Stale locks (> 30 min) are automatically reclaimed on startup.
 
 ## Database schema
 
-**`salesforce` schema** — one table per enabled Salesforce object, e.g. `salesforce.account`
+**One schema per registered org** named `org_<lowercased orgid>`, one table per enabled Salesforce object (e.g. `org_00d5g000001abcdeaa.account`)
 
 | Column | Type | Notes |
 |---|---|---|
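
The per-org serialization described under Concurrency can be illustrated with a small in-memory sketch. This is not how the project stores its lock, since the real lock is a row in `sfdb.sync_lock`. `OrgLocks`, `withOrgLock`, and `STALE_MS` are hypothetical names, and reclaiming a stale lock at acquire time (rather than only on startup) is a simplification made for the example.

```typescript
// Stale lock threshold: 30 minutes, expressed in milliseconds.
const STALE_MS = 30 * 60 * 1000;

// Hypothetical in-memory stand-in for sfdb.sync_lock (one entry per org).
class OrgLocks {
  private heldSince = new Map<string, number>();

  // Returns true if the lock was acquired. A lock held for longer than
  // STALE_MS is treated as stale and reclaimed.
  tryAcquire(orgId: string, now: number = Date.now()): boolean {
    const since = this.heldSince.get(orgId);
    if (since !== undefined && now - since <= STALE_MS) return false;
    this.heldSince.set(orgId, now);
    return true;
  }

  release(orgId: string): void {
    this.heldSince.delete(orgId);
  }
}

// Runs `sync` under the org's lock and always releases the lock, even when
// `sync` throws. Resolves to undefined if the same org is already syncing.
async function withOrgLock<T>(
  locks: OrgLocks,
  orgId: string,
  sync: () => Promise<T>,
): Promise<T | undefined> {
  if (!locks.tryAcquire(orgId)) return undefined;
  try {
    return await sync();
  } finally {
    locks.release(orgId);
  }
}
```

Keying the lock by org id is what lets two different orgs sync at the same time while a second sync for the same org is turned away.
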
@@ -118,7 +123,7 @@ Only one sync runs at a time. A single-row lock table (`sfdb.sync_lock`) prevent
 | `sf_deleted_at` | `timestamptz NULL` | NULL = live; set when deletion detected |
 | `synced_at` | `timestamptz` | Last written by this tool |
 
-**`sfdb` schema** — internal app tables (sync config, logs, lock, field metadata)
+**`sfdb` schema** — internal app tables (`orgs` registry, `active_org` pointer, sync config, logs, per-org lock, field metadata). Per-object tables are keyed by `(org_id, ...)`.
 
 ## Tech stack
 
@@ -127,7 +132,7 @@ Only one sync runs at a time. A single-row lock table (`sfdb.sync_lock`) prevent
 | Database | PostgreSQL 16 |
 | Backend | Node.js + TypeScript + Express |
 | Frontend | React + TypeScript + shadcn/ui + Tailwind |
-| Salesforce auth | `~/.sfdx` files read directly via Node `fs` (no `sf` binary in container) |
+| Salesforce auth | `~/.sfdx` files read directly via Node `fs` (no `sf` binary in container) — multiple orgs supported, each gets its own Postgres schema |
 | Salesforce data | jsforce + Bulk API 2.0 |
 | Scheduling | node-cron |
 | Containers | Docker + Docker Compose |

docker-compose.yml

Lines changed: 5 additions & 1 deletion
@@ -36,6 +36,10 @@ services:
 environment:
 POSTGRES_HOST: postgres
 POSTGRES_PORT_INTERNAL: 5432
+# Cap V8 old-space below the cgroup limit so the GC can recover before the
+# OS kills the process. Large bulk-API result pages (700k+ records) can
+# otherwise blow past the default heap before streaming releases memory.
+NODE_OPTIONS: --max-old-space-size=1536
 ports:
 - "127.0.0.1:${APP_PORT:-7743}:7743"
 volumes:
@@ -49,7 +53,7 @@ services:
 deploy:
 resources:
 limits:
-memory: 512m
+memory: 2g
 cpus: '1.0'
 
 networks:
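
The two settings in this file are related: the V8 old-space cap has to sit below the container memory limit for the comment's reasoning to hold. The sketch below checks that relationship for the values in the diff. The function names are hypothetical, and the limit parser only understands the `m` and `g` suffixes used here.

```typescript
// Extract the --max-old-space-size value (in MiB) from a NODE_OPTIONS string.
function maxOldSpaceMiB(nodeOptions: string): number | undefined {
  const match = /--max-old-space-size=(\d+)/.exec(nodeOptions);
  return match ? Number(match[1]) : undefined;
}

// Convert a Compose memory limit such as "512m" or "2g" to MiB.
// Assumption: only the "m" and "g" suffixes are handled.
function composeLimitMiB(limit: string): number {
  const match = /^(\d+)([mg])$/i.exec(limit.trim());
  if (!match) throw new Error(`unsupported memory limit: ${limit}`);
  const amount = Number(match[1]);
  return match[2].toLowerCase() === "g" ? amount * 1024 : amount;
}

// MiB left between the old-space cap and the container limit, or undefined
// when NODE_OPTIONS sets no cap. A negative result means the cap is too high.
function heapHeadroomMiB(
  nodeOptions: string,
  limit: string,
): number | undefined {
  const cap = maxOldSpaceMiB(nodeOptions);
  return cap === undefined ? undefined : composeLimitMiB(limit) - cap;
}
```

With the values from this commit, `heapHeadroomMiB("--max-old-space-size=1536", "2g")` leaves 512 MiB for everything outside the old-space heap, whereas the same cap against the previous `512m` limit would be negative.
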

docs/first-run.md

Lines changed: 14 additions & 6 deletions
@@ -17,7 +17,15 @@ Verify it worked:
 sf org list
 ```
 
-## 2. Configure environment
+## 2. Export decrypted Salesforce tokens for Docker
+
+```bash
+npm run export-tokens
+```
+
+This writes `data/tokens.json`, a local-only secret file consumed by the API container. It is git-ignored and should never be committed.
+
+## 3. Configure environment
 
 ```bash
 cp .env.example .env
@@ -27,7 +35,7 @@ Edit `.env` if you need to change ports or the DB password. Defaults:
 - UI + API: `http://localhost:7743`
 - PostgreSQL: `localhost:7745`
 
-## 3. Start the app
+## 4. Start the app
 
 ```bash
 docker compose up -d
@@ -41,15 +49,15 @@ docker compose ps
 docker compose logs -f api
 ```
 
-## 4. Open the UI
+## 5. Open the UI
 
 ```
 http://localhost:7743
 ```
 
 The onboarding screen will detect your authenticated orgs and ask you to pick one.
 
-## 5. Connect your BI tool / SQL client
+## 6. Connect your BI tool / SQL client
 
 Once data is syncing, connect directly to Postgres:
 
@@ -58,9 +66,9 @@ Once data is syncing, connect directly to Postgres:
 | Host | `localhost` |
 | Port | `7745` (or `$POSTGRES_PORT` from `.env`) |
 | Database | `sfdb` |
-| Schema | `salesforce` |
+| Schema | `org_<orgid>` |
 | User | `sfdb` (or `$POSTGRES_USER`) |
-| Password | `changeme` (or `$POSTGRES_PASSWORD`) |
+| Password | your `.env` `POSTGRES_PASSWORD` value |
 
 The Settings page in the UI shows a copyable connection string.
 
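
For readers who want to assemble the connection string themselves instead of copying it from the Settings page, the values in the table map onto a standard PostgreSQL URL. The sketch below is illustrative only: `PgTarget` and `connectionUrl` are hypothetical names, and the exact string the UI produces is not shown in this commit.

```typescript
// Connection settings as listed in the table above.
interface PgTarget {
  host: string;
  port: number;
  database: string;
  user: string;
  password: string;
}

// Build a postgresql:// URL, percent-encoding the parts that may contain
// reserved characters (for example "@" or "/" in a password).
function connectionUrl(target: PgTarget): string {
  const user = encodeURIComponent(target.user);
  const password = encodeURIComponent(target.password);
  const database = encodeURIComponent(target.database);
  return `postgresql://${user}:${password}@${target.host}:${target.port}/${database}`;
}
```

The schema is not part of the URL. Once connected, tables are addressed with the org schema as a prefix, for example `org_<orgid>.account`.
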