sfetch

A self-hosted Salesforce-to-PostgreSQL sync pipeline. Run it with docker compose up, point a BI tool or SQL client at the local Postgres instance, and query your Salesforce data like a normal database.

The database is the product. The web UI configures which objects and fields to sync. External tools connect directly to Postgres — no intermediate API.

Security note: The web UI and API have no authentication. This is intentional — it is a localhost-only tool. Both services bind to 127.0.0.1 and must not be exposed to a network. Do not run this on a shared or internet-accessible host without adding an auth layer.

Features

Configure sync from a web UI — no config files to edit
Select individual objects and fields to sync
Delta sync (frequent, catches creates/updates) + full ID reconciliation (nightly, catches hard deletes)
Soft deletes: records deleted in Salesforce get sf_deleted_at set, never hard-deleted locally
Salesforce Bulk API 2.0 for large volumes; REST fallback for small objects
Auth via ~/.sfdx files — no Salesforce credentials stored in the project
Sync logs with per-object record counts, errors, and duration

Prerequisites

Docker Desktop
Salesforce CLI (sf) with at least one org authenticated

Quick start

# 1. Authenticate a Salesforce org (skip if already done)
sf org login web --alias my-org

# 2. Export decrypted Salesforce tokens for Docker to use
npm run export-tokens

# 3. Configure environment
cp .env.example .env
# Edit .env — set POSTGRES_PASSWORD at minimum

# 4. Start
docker compose up -d

# 5. Open the UI
open http://localhost:7743

First start takes ~30 seconds while Postgres initializes and the API container builds.

The onboarding screen will detect your authenticated orgs and ask you to pick one. After that, go to the Objects page and enable the Salesforce objects you want to sync.

npm run export-tokens writes plaintext access tokens to data/tokens.json so the Docker container can authenticate to Salesforce. This file is local-only secret material, is git-ignored, and should never be committed or shared.

Connect a BI tool or SQL client

Once data is syncing, connect any Postgres-compatible tool directly:

Setting	Default value
Host	`localhost`
Port	`7745`
Database	`sfdb`
Schema	`org_<orgid>`
User	`sfdb`
Password	(your `.env` value)

Each registered Salesforce org gets its own schema named org_<lowercased 18-char Salesforce org id>. The Settings page in the UI shows the schema name for every registered org and a copyable connection string.

A read-only role is also available — set READONLY_PASSWORD in .env and connect as user sfdb_readonly.

Ports

Service	Default	Set via
UI + API	`7743`	`APP_PORT`
PostgreSQL	`7745`	`POSTGRES_PORT`

Both default ports are chosen to avoid conflicts with common local services.

Environment variables

.env holds only bootstrap config — values needed before the database exists. All runtime config (active org, sync intervals, enabled objects) lives in the database.

POSTGRES_USER=sfdb
POSTGRES_PASSWORD=            # required
POSTGRES_DB=sfdb
POSTGRES_PORT=7745
READONLY_PASSWORD=            # optional read-only role password

APP_PORT=7743
NODE_ENV=production

LOG_RETENTION_DAYS=14

Copy .env.example to .env and fill in at minimum POSTGRES_PASSWORD.

How sync works

Delta sync (default: every hour)

Queries WHERE SystemModstamp >= last_delta_sync via Bulk API 2.0, streams the CSV result, and batch-upserts into Postgres. On first sync, no WHERE clause — pulls all records.

Full ID reconciliation (default: nightly)

Queries SELECT Id FROM <Object> for the full live ID set, diffs against local rows, and sets sf_deleted_at on any that are gone. This is the only way to catch hard deletes, merges, and cascade deletes.

Concurrency

Sync is serialized per org via sfdb.sync_lock, with one lock row per registered org. Different orgs can sync in parallel; overlapping syncs for the same org are blocked. Stale locks (> 30 min) are automatically reclaimed on startup.

Database schema

One schema per registered org — named org_<lowercased orgid>, one table per enabled Salesforce object (e.g. org_00d5g000001abcdeaa.account)

Column	Type	Notes
`id`	`text PRIMARY KEY`	18-char Salesforce ID
(enabled fields)	(mapped type)	Lowercased snake_case API names
`sf_created_at`	`timestamptz`	CreatedDate
`sf_updated_at`	`timestamptz`	SystemModstamp
`sf_deleted_at`	`timestamptz NULL`	NULL = live; set when deletion detected
`synced_at`	`timestamptz`	Last written by this tool

sfdb schema — internal app tables (orgs registry, active_org pointer, sync config, logs, per-org lock, field metadata). Per-object tables are keyed by (org_id, ...).

Tech stack

Layer	Technology
Database	PostgreSQL 16
Backend	Node.js + TypeScript + Express
Frontend	React + TypeScript + shadcn/ui + Tailwind
Salesforce auth	`~/.sfdx` files read directly via Node `fs` (no `sf` binary in container) — multiple orgs supported, each gets its own Postgres schema
Salesforce data	jsforce + Bulk API 2.0
Scheduling	node-cron
Containers	Docker + Docker Compose

Stopping and data persistence

docker compose down       # stop containers — data persists
docker compose down -v    # stop and delete all data

Postgres data lives in data/docker/postgres/ (git-ignored). It survives stop/start cycles.

Rebuilding after code changes

docker compose down
docker compose build
docker compose up -d

License

MIT — see LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.github/workflows		.github/workflows
docker		docker
docs		docs
scripts		scripts
src		src
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
docker-compose.yml		docker-compose.yml
package-lock.json		package-lock.json
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

sfetch

Features

Prerequisites

Quick start

Connect a BI tool or SQL client

Ports

Environment variables

How sync works

Delta sync (default: every hour)

Full ID reconciliation (default: nightly)

Concurrency

Database schema

Tech stack

Stopping and data persistence

Rebuilding after code changes

License

About

Uh oh!

Releases 2

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

sfetch

Features

Prerequisites

Quick start

Connect a BI tool or SQL client

Ports

Environment variables

How sync works

Delta sync (default: every hour)

Full ID reconciliation (default: nightly)

Concurrency

Database schema

Tech stack

Stopping and data persistence

Rebuilding after code changes

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages