Welcome, agent. This file is a fast-orientation guide so you don't have to rediscover the repo the hard way. It complements README.md, INSTALL.md, README-DBs.md, CHANGELOG-v2.md, and SINGLE-TO-MULTIPLE.md — read those for depth; read this one first for shape.
open-pryv.io is the v2 codebase of Pryv.io — a personal-data REST/WebSocket platform. It is a single Node.js codebase that produces a single binary (bin/master.js) and a single Docker image (pryvio/open-pryv.io). Registration, MFA, API, high-frequency series, previews, and the email renderer all run inside that binary as cluster workers.
Some sibling projects live in separate repos and are consumed as dependencies or downloaded as artifacts at deploy time — they are not bundled here:
- `pryv/lib-js`: the `pryv` npm package (JS client lib).
- `pryv/pryv-datastore`: the datastore interface that custom datastore plugins implement.
- `pryv/app-web-auth3`: the Vue.js auth/register/password-reset web pages.
The v1 line (pre-single-binary) is preserved on the release/1.9.3 branch and is no longer updated.

    bin/                        Entry points and admin CLIs
      master.js                 Supervisor: forks cluster workers, runs rqlited, AcmeOrchestrator
      bootstrap.js              Multi-core onboarding (issue/consume sealed bundle)
      backup.js                 Engine-agnostic backup + restore (JSONL + gzip)
      migrate.js                Schema migration runner (status / up)
      dns-records.js            Persistent DNS record admin
      integrity-check.js        Per-user integrity verification
      mail.js                   Mail template admin (in-process email)
      observability.js          Optional APM admin (enable / disable / set-license-key)
    components/                 Source code split by domain
      api-server/               HTTP REST API (Express)
      hfs-server/               High-frequency series ingest
      previews-server/          Image preview renderer
      dns-server/               Embedded DNS server for multi-core DNS-based topology
      mail/                     In-process email renderer (Pug templates in PlatformDB)
      middleware/               Express middleware (auth, wrong-core, regSubdomainPathMap, ...)
      mall/                     Data access layer (events, streams; engine-agnostic)
      cache/                    Caching layer
      platform/                 PlatformDB interface (rqlite) + config-snapshot hash drift
      storage/                  Storage facade: uses the engines selected in config
      audit/                    Audit logging (uses SQLite directly)
      business/                 Cross-domain logic:
        accesses/ acme/ auth/ backup/ bootstrap/ integrity/
        mfa/ observability/ series/ system-streams/ users/ webhooks/
      webhooks/                 Outbound event delivery
      errors/ messages/ test-helpers/ tracing/ utils/ externals/
    storages/                   Plugin tree for storage engines (npm workspace)
      interfaces/
        baseStorage/ dataStore/ platformStorage/ fileStorage/
        seriesStorage/ auditStorage/ backup/ migrations/
      engines/
        mongodb/                baseStorage, dataStore, auditStorage
        postgresql/             baseStorage, dataStore, platformStorage, seriesStorage, auditStorage
        sqlite/                 per-user SQLite (baseStorage, dataStore, auditStorage)
        rqlite/                 distributed SQLite (platformStorage, always used in v2)
        filesystem/             attachments + previews on disk (fileStorage)
        influxdb/               high-frequency series (seriesStorage)
      datastores/               Custom datastore plugins (e.g. `account` for system streams)
      manifest-schema.js        Schema for engine manifests
      pluginLoader.js           Reads `storages.<type>.engine` + loads the chosen engine
    config/                     default-config.yml + plugins
    Dockerfile                  Canonical container (bundles rqlited, sharp, mongosh)
    test/                       Integration test entry, see `just test …` in justfile

Prerequisites: Node.js 22.x, MongoDB 4.2+ or PostgreSQL 14+, just.

    just setup-dev-env            # prepares var-pryv/ layout + launches PG/Mongo/rqlite binaries
    just install                  # npm install across workspaces
    just start-master             # boot the single-binary cluster

Tests (from repo root, via justfile):

    just test all                 # full suite, PostgreSQL baseStorage (default)
    just test <component>         # single component, PostgreSQL baseStorage
    just test-mongo all           # full suite, MongoDB baseStorage
    just test-mongo <component>   # single component, MongoDB baseStorage
    just test-sqlite <component>  # SQLite baseStorage where applicable
    just clean-test-data          # reset test DBs + per-user dirs

Production-ish single node:

    NODE_ENV=production node bin/master.js --config /path/to/your/override-config.yml

`NODE_ENV=test` short-circuits optional integrations (observability providers, strict startup checks). Always honour it in new code: tests must stay hermetic.
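
One way to honour `NODE_ENV=test` in new code is to put the check in a single guard that every optional integration goes through. The sketch below is illustrative only: `shouldStartOptionalIntegration` and its `providerEnabled` argument are hypothetical names, not helpers that exist in this repo.

```javascript
'use strict';

// Hypothetical helper (not part of the repo): decide whether an optional
// integration may start in the current process.
function shouldStartOptionalIntegration (env, providerEnabled) {
  // NODE_ENV=test always wins, so test runs stay hermetic.
  if (env.NODE_ENV === 'test') return false;
  // Outside tests, start only when the operator explicitly enabled it.
  return providerEnabled === true;
}
```

Passing `process.env` in as an argument, rather than reading it inside the function, keeps the guard itself trivially testable.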
1. `bin/master.js` owns the lifecycle. It:
   - Runs a minimal `@pryv/boiler` init to read config,
   - Spawns and supervises an embedded `rqlited` in both single- and multi-core mode (no separate DB process to manage),
   - Forks cluster workers: `cluster.apiWorkers` (default 2) API workers, `cluster.hfsWorkers` HFS workers, `cluster.previewsWorker` previews worker,
   - On the CA-holder core only, runs the AcmeOrchestrator that polls PlatformDB for cert state (other cores poll + materialize),
   - Handles the `--bootstrap` mode used to add new cores (decrypts a sealed bundle, writes `override-config.yml` + TLS files, acks the issuer, falls through into normal startup).

   Don't add a PM2 / systemd / Docker-compose-style process supervisor around it. `master.js` is the supervisor.
2. TLS termination is native. api-server workers call `https.createServer(buildHttpsOptions(config), requestHandler)` and `setSecureContext()` for zero-downtime hot-swap on cert renewal. The `requestHandler` is an in-process dispatcher (`components/api-server/src/hfsIngress.ts`) that routes `^/<user>/events/<id>/series` and `^/<user>/series/batch` to the HFS worker on `localhost:4000` before falling through to `app.expressApp`. This is the quick / out-of-the-box ingress; for high-throughput installs, front with nginx (`docs/nginx-ingress-sample.conf`) and let it do the routing instead. The in-process dispatcher stays present but is bypassed because external traffic doesn't reach it.

   If you must front-proxy, wire the front's reload into `letsEncrypt.onRotateScript` in config so the front picks up new certs within the same renewal cycle.
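
The dispatch rule in truth #2 can be pictured as a path test in front of two handlers. This is a minimal sketch, not the code in `hfsIngress.ts`: the concrete regular expressions (one path segment each for `<user>` and `<id>`, and the end-of-segment check) and the names `isHfsRequest` / `makeRequestHandler` are assumptions made for illustration.

```javascript
'use strict';

// Illustrative patterns only: the real ones live in hfsIngress.ts and may differ.
// One non-empty path segment stands in for <user> and for <id>.
const HFS_PATTERNS = [
  /^\/[^/]+\/events\/[^/]+\/series(?:\/|\?|$)/,
  /^\/[^/]+\/series\/batch(?:\/|\?|$)/
];

// Hypothetical helper: true when the request URL should go to the HFS worker.
function isHfsRequest (url) {
  return HFS_PATTERNS.some((re) => re.test(url));
}

// Hypothetical dispatcher factory: `toHfs` and `toApi` are plain
// (req, res) handlers supplied by the caller.
function makeRequestHandler (toHfs, toApi) {
  return (req, res) => (isHfsRequest(req.url) ? toHfs(req, res) : toApi(req, res));
}
```

Keeping the match as a pure function of the URL makes it easy to mirror the same two rules in an nginx `location` block when you front the cluster.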
3. Wildcard certs are first-class. `components/business/src/acme/deriveHostnames.js` returns `{ commonName, altNames, challenge: 'dns-01' | 'http-01' }` from existing topology config:

   | Config | Hosts | Challenge |
   | --- | --- | --- |
   | `dnsLess.isActive: true` + `dnsLess.publicUrl` | the URL's hostname | `http-01` |
   | `dns.active: true` + `dns.domain: X` | `X` + `*.X` | `dns-01` |
   | `core.url: https://Y/` (DNSless multi-core) | `Y` | `http-01` |

   The embedded DNS server answers `_acme-challenge.X` transiently during DNS-01, so you don't need to integrate certbot or a third-party DNS API.
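
To make the table concrete, here is a rough sketch of the same mapping as a pure function. It is not the repo's `deriveHostnames.js`: the name `deriveHostnamesSketch`, the order in which the three cases are checked, and the way hosts are split between `commonName` and `altNames` are all assumptions.

```javascript
'use strict';

// Sketch only: mirrors the table above, not the repo's implementation.
// Check order and the commonName/altNames split are assumptions.
function deriveHostnamesSketch (config) {
  if (config.dnsLess && config.dnsLess.isActive && config.dnsLess.publicUrl) {
    const host = new URL(config.dnsLess.publicUrl).hostname;
    return { commonName: host, altNames: [], challenge: 'http-01' };
  }
  if (config.dns && config.dns.active && config.dns.domain) {
    const domain = config.dns.domain;
    // A wildcard name can only be validated through DNS-01.
    return { commonName: domain, altNames: ['*.' + domain], challenge: 'dns-01' };
  }
  if (config.core && config.core.url) {
    const host = new URL(config.core.url).hostname;
    return { commonName: host, altNames: [], challenge: 'http-01' };
  }
  throw new Error('No topology config to derive certificate hostnames from');
}
```

The wildcard row is the only one that needs DNS-01, which is why the embedded DNS server is involved there and nowhere else.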
4. Storage engines are pluggable at runtime. Engine choice is per-core. The config keys in `config/default-config.yml` are:

       storages:
         base: { engine: postgresql }      # baseStorage + dataStore
         platform: { engine: rqlite }      # platformStorage, always rqlite in v2
         series: { engine: influxdb }      # seriesStorage
         file: { engine: filesystem }      # fileStorage
         audit: { engine: sqlite }         # auditStorage
         engines:
           postgresql: { host: 127.0.0.1, port: 5432, database: pryv-node, user: pryv, password: '', max: 20 }
           # mongodb, sqlite, rqlite, influxdb, filesystem also configurable here

   The `pluginLoader` reads `storages/engines/<name>/manifest.json` to see which `storageTypes` each engine provides. From code:

       const engine = pluginLoader.getEngineModule(pluginLoader.getEngineFor('platformStorage'));

   Adding an engine = new directory under `storages/engines/` with a `manifest.json` + `src/index.js`. Don't reinvent: the plugin pattern exists.

   PostgreSQL is a first-class production engine for every `storageType` except `platform`, which is rqlite-only (Raft consensus is what makes multi-core work). See `README-DBs.md` for the human-readable DB layout.
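
If it helps to see the lookup spelled out, the following sketch resolves a storage type to an engine name and checks it against the engine's manifest. Everything here is an assumption for illustration: the `CONFIG_KEY_FOR` mapping is inferred from the comments in the config sample, the manifest is assumed to expose `storageTypes` as an array, and `resolveEngine` is a hypothetical function, not the `pluginLoader` API.

```javascript
'use strict';

// Assumed mapping from storage type to the `storages.<key>` config entry,
// inferred from the comments in the config sample above.
const CONFIG_KEY_FOR = {
  baseStorage: 'base',
  dataStore: 'base',
  platformStorage: 'platform',
  seriesStorage: 'series',
  fileStorage: 'file',
  auditStorage: 'audit'
};

// Hypothetical resolver: returns the engine name configured for a storage
// type, after checking that the engine's manifest declares that type.
function resolveEngine (storageType, storagesConfig, manifests) {
  const key = CONFIG_KEY_FOR[storageType];
  if (key == null) throw new Error('Unknown storage type: ' + storageType);
  const entry = storagesConfig[key];
  if (entry == null || !entry.engine) {
    throw new Error('No engine configured for storages.' + key);
  }
  const manifest = manifests[entry.engine];
  if (manifest == null || !manifest.storageTypes.includes(storageType)) {
    throw new Error(`Engine "${entry.engine}" does not provide ${storageType}`);
  }
  return entry.engine;
}
```

Failing loudly when a configured engine does not declare the requested type is the useful part: a typo in config surfaces at startup instead of at the first query.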
5. Cluster CA lifecycle. The first `master.js` boot on a fresh box mints a self-signed cluster CA under `/etc/pryv/ca/`. Back that directory up immediately: the private key never leaves the host, and losing it means you can't add or rotate cores. `bin/bootstrap.js new-core --id <name> --ip <ip>` issues a sealed AES-256-GCM-encrypted bundle + one-time join token to onboard additional cores. Bundle and passphrase travel on different channels. See `SINGLE-TO-MULTIPLE.md`.
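
For intuition about what "sealed" means here, this is a generic AES-256-GCM seal/unseal round trip using Node's built-in `crypto` module. It is not the bundle format used by `bin/bootstrap.js`: deriving the key with scrypt, the `salt | iv | tag | ciphertext` layout, and the names `seal` / `unseal` are assumptions.

```javascript
'use strict';
const crypto = require('node:crypto');

// Sketch: seal a JSON payload with AES-256-GCM. Key derivation via scrypt and
// the salt|iv|tag|ciphertext layout are assumptions, not the repo's format.
function seal (payload, passphrase) {
  const salt = crypto.randomBytes(16);
  const iv = crypto.randomBytes(12);
  const key = crypto.scryptSync(passphrase, salt, 32);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(payload), 'utf8'),
    cipher.final()
  ]);
  // getAuthTag() is only valid after final(); the default tag is 16 bytes.
  return Buffer.concat([salt, iv, cipher.getAuthTag(), ciphertext]).toString('base64');
}

// Reverse of seal(); throws if the passphrase is wrong or the data was altered.
function unseal (sealed, passphrase) {
  const raw = Buffer.from(sealed, 'base64');
  const salt = raw.subarray(0, 16);
  const iv = raw.subarray(16, 28);
  const tag = raw.subarray(28, 44);
  const ciphertext = raw.subarray(44);
  const key = crypto.scryptSync(passphrase, salt, 32);
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  const plain = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
  return JSON.parse(plain.toString('utf8'));
}
```

Because GCM authenticates as well as encrypts, a wrong passphrase or a tampered bundle fails at `decipher.final()` rather than yielding garbage, which is what makes it safe to send the bundle over an untrusted channel as long as the passphrase travels separately.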
6. `components/tracing/` is a permanent no-op shim: keep the architectural slot, plug a `DummyTracing` instance. Jaeger / OpenTracing / cls-hooked are gone (Plan 52 Phase 5.G). The 8 hot-path consumers (api-server `application.js` / `Result.js` / `socket-io/Manager.js`, business `MethodContext.js`, middleware `setMethodId.js` / `setMinimalMethodContext.js`, storage `storage/index.js` + `storages/index.js`) still import from `tracing`; every call collapses to a `DummyTracing` no-op. The hfs-server side (`components/hfs-server/src/tracing/`) follows the same pattern: `cls.js` and the trace middleware are no-op pass-throughs. New Relic APM (Plan 38) is the active observability path and runs in parallel, not through this component. If a future tracer (OpenTelemetry, Tempo, custom) is wanted, replace the body of `components/tracing/src/Tracing.js` with the real impl; the 8 consumers do not need to change.
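
The no-op shim idea can be shown in a few lines. This is a hypothetical stand-in, not the repo's `DummyTracing`: the `createNoopTracing` factory and the method names used in the usage lines are invented for illustration.

```javascript
'use strict';

// Hypothetical stand-in, not the repo's DummyTracing: every property access
// yields a function that does nothing, so call sites never need to change.
function createNoopTracing () {
  const noop = () => undefined;
  return new Proxy({}, { get: () => noop });
}

const tracing = createNoopTracing();
tracing.startSpan('events.get'); // no effect; `startSpan` is an illustrative name
tracing.finishSpan('events.get'); // no effect
```

The design point is that consumers depend on the shape of the tracing object, not on a tracer being present, so a real implementation can later replace the shim without touching callers.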
- Don't assume MongoDB. The engine plugin tree lets operators choose; contributions that hard-code MongoDB or SQLite inside business logic will get rejected. Use `pluginLoader.getEngineModule(pluginLoader.getEngineFor('<storageType>'))`.
- Don't add an APM agent at `require()` time unconditionally. Observability (APM) is opt-in via the pluggable provider façade; the agent is bootstrapped by `bin/_observability-boot.js` only when a provider is explicitly enabled (admin CLI: `bin/observability.js`). Always honour `NODE_ENV=test` as a no-op.
- Don't introduce a second TLS terminator. See truth #2.
- Don't hot-wire cert rotation with `fs.watchFile` or cron. Use the existing `AcmeOrchestrator` → `acme:rotate` IPC → worker `reloadTls()` path.
- Don't ship multi-process orchestration shims (PM2, runit, supervisord configs). `master.js` replaces those. If you need to restart a worker, `master.js` already does it (see `cluster.on('exit', ...)`).
- Don't write PlatformDB directly. Go through `components/platform/`: it enforces config-snapshot hashes and cluster-wide semantics. Bypassing it silently desyncs cores.
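
To see why a file watcher or cron job is unnecessary, here is a sketch of what a worker-side handler for the rotation message could look like. The message shape (`{ type, key, cert }`) and the `makeRotateHandler` name are assumptions; the only real API used is `server.setSecureContext()`, which Node.js provides on `tls.Server` (and therefore on `https.Server`).

```javascript
'use strict';

// Sketch of a worker-side handler for the rotation path. The message shape
// ({ type, key, cert }) is an assumption, not the repo's IPC contract.
function makeRotateHandler (server) {
  return function onMessage (msg) {
    if (!msg || msg.type !== 'acme:rotate') return false;
    // New TLS handshakes use the new certificate; open sockets are untouched.
    server.setSecureContext({ key: msg.key, cert: msg.cert });
    return true;
  };
}

// Assumed wiring inside a cluster worker:
//   process.on('message', makeRotateHandler(httpsServer));
```

Because the swap happens inside the running server, no listener is closed and no worker restarts, which is what the zero-downtime claim in truth #2 rests on.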
@pryv/boiler layers configs (lowest → highest):
1. `config/default-config.yml` (committed defaults)
2. `config/plugins/*` (derived values like system streams)
3. `${NODE_ENV}-config.yml` or the file passed via `--config <path>`
4. `override-config.yml` at the baseConfigDir (written by `master.js --bootstrap` on core join)
5. Environment variables: boiler uses `__` (double underscore) as the nested-key separator (e.g. `auth__adminAccessKey=…` sets `auth.adminAccessKey`). See `@pryv/boiler` for the exact mapping rules.

Understand this before debugging why a setting "isn't taking effect".
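
When a setting "isn't taking effect", it usually means a higher layer is overriding it. The sketch below models the layering and the double-underscore convention in plain JavaScript. It is not how `@pryv/boiler` is implemented: `envToNested` and `layer` are hypothetical helpers, and value coercion and key filtering are deliberately left out.

```javascript
'use strict';

// Sketch of the double-underscore convention: `a__b=value` sets `a.b`.
// No type coercion or filtering; see @pryv/boiler for the real rules.
function envToNested (env) {
  const out = {};
  for (const [name, value] of Object.entries(env)) {
    const path = name.split('__');
    let node = out;
    for (const part of path.slice(0, -1)) {
      if (typeof node[part] !== 'object' || node[part] === null) node[part] = {};
      node = node[part];
    }
    node[path[path.length - 1]] = value;
  }
  return out;
}

// Merge two layers: `over` wins, nested plain objects are merged key by key.
function merge (base, over) {
  const result = { ...base };
  for (const [k, v] of Object.entries(over)) {
    const bothObjects = v && typeof v === 'object' && !Array.isArray(v) &&
      result[k] && typeof result[k] === 'object' && !Array.isArray(result[k]);
    result[k] = bothObjects ? merge(result[k], v) : v;
  }
  return result;
}

// Apply layers from lowest to highest priority.
function layer (...configs) {
  return configs.reduce((acc, cfg) => merge(acc, cfg), {});
}
```

Calling `layer(defaults, fileConfig, overrideConfig, envToNested(selectedEnv))` reproduces the lowest-to-highest order of the list, so printing the result of each partial merge shows which layer last touched a key.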
In this repo:
- `README.md`: project overview + quick-start.
- `INSTALL.md`: operator install steps.
- `README-DBs.md`: storage-by-storage DB layout and engine selection.
- `SINGLE-TO-MULTIPLE.md`: multi-core onboarding, cluster CA, sealed bundle flow.
- `CHANGELOG-v2.md`: API-facing changes; `CHANGELOG-v2-back.md`: internal changes.
- `components/business/src/acme/`: ACME orchestrator + cert renewer internals.
- `components/platform/`: PlatformDB interface + rqlite specifics.
- `storages/manifest-schema.js` + `storages/pluginLoader.js`: how engines are loaded.
- `storages/engines/<name>/manifest.json`: what each engine provides.

External docs (pryv.github.io):
- API reference — canonical REST/WebSocket reference (full, light, admin, system variants).
- Concepts — streams, events, accesses, permissions.
- Data in Pryv — data model deep-dive.
- Event types — the curated type catalogue.
- System streams — how account fields map to streams.
- Getting started — first API calls.
- Guides — app guidelines, audit logs, consent, custom auth, data modelling, webhooks.
- FAQ API and FAQ Infra.
Operator-facing setup guides (pryv.github.io/customer-resources/):
- Infrastructure procurement — topology + sizing.
- Pryv.io setup — single-node topology.
- Single node to cluster — multi-core upgrade.
- SSL certificate — built-in ACME / Let's Encrypt.
- Backup: `bin/backup.js`.
- Core migration: moving a core to a new host.
- MFA — SMS-based two-factor.
- Emails setup — in-process vs microservice.
- Observability (APM) — opt-in New Relic integration.
- Healthchecks and platform validation.
- Change log.
A few operator pages on pryv.github.io (notably dns-config, audit-setup, and system-streams) still mix v1 and v2 wording; when in doubt, trust this repo's config/default-config.yml, README-DBs.md, and the bin/* admin CLIs over the rendered doc site.
- Bugs + feature requests: `pryv/open-pryv.io` GitHub Issues.
- Pull requests against `master`. For anything touching the cluster CA, ACME orchestrator, PlatformDB interface, or the storage plugin tree, open a draft PR early and tag a maintainer; those areas have subtle invariants that aren't obvious from a local diff.

- Read `bin/master.js` top-to-bottom. It's the single entry point and its comments explain more than this file can.
- If a config key feels like it should exist but you can't find it: check `config/default-config.yml`. If it's not there, it probably isn't a thing.
- Test changes against both engines before assuming engine-agnostic behaviour: `just test all` (PostgreSQL default) and `just test-mongo all`.

— Happy hacking.