62 changes: 62 additions & 0 deletions apps/boekhouding-backend/README.md
@@ -0,0 +1,62 @@
# Invulhulpen backend

Fastify + Drizzle (Postgres) API for the collaboration features (projects, assessments,
comments, sync). Authentication goes through Keycloak (OIDC, JWT bearer).

## Development

```bash
pnpm --filter boekhouding-backend dev            # tsx watch
pnpm --filter boekhouding-backend test           # vitest (requires Postgres, see below)
pnpm --filter boekhouding-backend test:coverage  # 100% gate
npx tsc --noEmit                                 # type-check (from this directory)
```

Tests run against a real Postgres instance. Set `TEST_DATABASE_URL`, or keep the default
(`postgresql://parassessment:parassessment@localhost:5432/parassessment_test`).

## Environment variables

| Variable | Default | Description |
|---|---|---|
| `PORT` / `HOST` | `3000` / `0.0.0.0` | Listen address |
| `DATABASE_SERVER_FULL` | localhost dev URL | Postgres connection string (contains the password; do not log it) |
| `OIDC_URL` / `OIDC_INTERNAL_URL` | `http://localhost:8080` | Public and in-cluster Keycloak URL, respectively (JWKS uses the internal one) |
| `OIDC_REALM` | `invulhulpen` | Keycloak realm |
| `OIDC_PUBLIC_CLIENT_ID` | `boekhouding-frontend` | Expected `azp` claim |
| `CORS_ORIGIN` / `PUBLIC_HOST` | `http://localhost:5174` | Allowed origin(s); a comma-separated list is accepted |
| `TRUST_PROXY` | `1` | Number of proxy hops (for `req.ip` / rate limiting) |
| `EXPOSE_API_DOCS` | `false` | Swagger UI + `/api/openapi.json` |
| **`WEB_CONCURRENCY`** | `1` | Number of worker processes (clustering). Defaults to 1 (off); opt in with a value `> 1`. Clamped to `[1, 64]` |
| **`DB_POOL_MAX`** | `9` | Postgres pool size **per worker**. Clamped to `[1, 20]` (the per-user cap) |
| **`DB_CONNECT_TIMEOUT`** | `10` | Seconds before a new DB connection attempt fails |
| **`DB_IDLE_TIMEOUT`** | `30` | Seconds before an idle DB connection is closed |
| **`RATE_LIMIT_MAX`** | `300` | Requests per IP per minute (cluster-wide; see below) |

For the numeric variables in bold, invalid or missing values fall back safely to the
default; values above the upper bound are clamped to it.
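
The fallback-and-clamp rule can be sketched as follows. The function mirrors the `parsePositiveInt` helper from `src/config.ts` in this change; the sample inputs are illustrative only and use the `DB_POOL_MAX` default (9) and upper bound (20).

```typescript
// Sketch of the fallback-and-clamp rule, mirroring parsePositiveInt in src/config.ts.
function parsePositiveInt(value: string | undefined, fallback: number, max: number): number {
  if (!value) return fallback // unset or empty string
  const n = parseInt(value, 10)
  if (!Number.isFinite(n) || n < 1) return fallback // non-numeric, zero, or negative
  return Math.min(n, max) // clamp to the upper bound
}

// Illustrative raw values for DB_POOL_MAX (default 9, upper bound 20).
const samples: Array<string | undefined> = [undefined, '', 'abc', '0', '12', '500']
for (const raw of samples) {
  console.log(String(raw), '->', parsePositiveInt(raw, 9, 20))
}
```

With these inputs the results are 9, 9, 9, 9, 12, and 20. Note that `parseInt` accepts a numeric prefix, so a value such as `12abc` is read as 12 rather than rejected.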

## Scaling and the connection limit

The shared RIG Postgres (`rig-db`) is configured with `max_connections: 250` and
`reserved_connections: 10`, and (the binding constraint for us) **every project DB user is
capped at 20 connections** (`CONNECTION LIMIT 20`, introduced after an incident in which
one project consumed all slots and broke Keycloak). That figure of **20 is therefore the
total budget across all pods, replicas, and workers combined**.

Because the app is I/O-bound (low CPU load) and the DB work runs on the DB server,
**a single worker with a healthy pool** gives the best balance, rather than many workers
with tiny pools. Note that a **rolling deploy** briefly runs two pods side by side (the old
pod plus the surge pod), both holding connections under the same DB user. The budget is
therefore:

```
pods × WEB_CONCURRENCY × DB_POOL_MAX ≤ 20
```

By default: 1 replica + 1 surge = **2 pods** × 1 worker × `DB_POOL_MAX=9` = **18**, a
tight margin under 20. (With the `Recreate` strategy or `maxSurge=0` there is no overlap
and the pool may be larger; with more replicas or clustering it must be correspondingly
smaller.) If you really want to scale to many concurrent users, a **connection pooler
(PgBouncer)** is the right route (already planned as a rig-cluster *future*) rather than a
larger per-worker pool.
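
The budget arithmetic above can be checked with a small helper. This is a minimal sketch: `totalConnections` and `maxPoolPerWorker` are hypothetical names that do not exist in the codebase, and the cap of 20 is the per-user limit quoted above.

```typescript
// Hypothetical helpers (not part of the codebase) for the connection budget:
//   pods × WEB_CONCURRENCY × DB_POOL_MAX ≤ cap
const DB_USER_CONNECTION_CAP = 20 // per-user limit on the shared Postgres, as stated above

// Worst-case number of connections held under the DB user at the same time.
function totalConnections(pods: number, workers: number, poolMax: number): number {
  return pods * workers * poolMax
}

// Largest per-worker pool that still fits the cap; 0 means even a pool of 1 does not fit.
function maxPoolPerWorker(pods: number, workers: number, cap = DB_USER_CONNECTION_CAP): number {
  return Math.floor(cap / (pods * workers))
}

// Default rolling deploy: 1 replica + 1 surge pod, 1 worker, DB_POOL_MAX=9.
console.log(totalConnections(2, 1, 9)) // 18
console.log(maxPoolPerWorker(2, 1)) // 10

// Opting into 4 workers during a rolling deploy leaves room for a pool of 2 per worker.
console.log(maxPoolPerWorker(2, 4)) // 2
```

The default of 9 sits one below the computed ceiling of 10, which leaves two spare connections under the cap during a rolling deploy.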

The in-memory rate limit is per worker; the entry point therefore divides `RATE_LIMIT_MAX`
by the number of workers, so that the cluster-wide limit stays approximately the same.
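
A minimal sketch of that division, using the same expression as the entry point in this change (`Math.max(1, Math.ceil(max / workers))`). The name `perWorkerLimit` is hypothetical and used here only for illustration.

```typescript
// Per-worker share of the global rate limit; rounds up and never drops below 1.
function perWorkerLimit(globalMax: number, workers: number): number {
  return Math.max(1, Math.ceil(globalMax / workers))
}

for (const workers of [1, 4, 7]) {
  const share = perWorkerLimit(300, workers)
  // Upper bound if one client's requests are spread evenly across all workers.
  console.log(`${workers} worker(s): ${share} per worker, up to ${share * workers} cluster-wide`)
}
```

Because the share is rounded up, the cluster-wide ceiling can slightly exceed `RATE_LIMIT_MAX` (301 instead of 300 with 7 workers). Conversely, a client whose requests all happen to land on one worker would presumably hit that worker's share first, which is why the limit is only approximately cluster-wide.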
8 changes: 7 additions & 1 deletion apps/boekhouding-backend/src/app.ts
@@ -19,6 +19,12 @@ export interface BuildAppOptions {
exposeApiDocs?: boolean
/** Fastify trustProxy value (proxy CIDR / hop count). Defaults to config.trustProxy. */
trustProxy?: string | boolean | number
/**
* Per-instance rate-limit max (requests per IP per minute). The entry point
* passes the global limit divided by the worker count, since each worker has
* its own in-memory store. Defaults to config.rateLimit.max.
*/
rateLimitMax?: number
}

export async function buildApp(options: BuildAppOptions = {}): Promise<FastifyInstance> {
@@ -44,7 +50,7 @@ export async function buildApp(options: BuildAppOptions = {}): Promise<FastifyIn
})

await app.register(cors, config.cors)
-  await app.register(rateLimit, { max: 300, timeWindow: '1 minute' })
+  await app.register(rateLimit, { max: options.rateLimitMax ?? config.rateLimit.max, timeWindow: '1 minute' })

if (exposeApiDocs) await app.register(swagger, {
openapi: {
42 changes: 42 additions & 0 deletions apps/boekhouding-backend/src/config.ts
@@ -20,12 +20,54 @@ function parseTrustProxy(): number | string {
return /^\d+$/.test(v) ? Number(v) : v
}

// Parse a positive-integer env var, clamped to [1, max]. Falls back to the
// default when unset, non-numeric, or below 1 — so a misconfigured value can
// never produce an unsafe state (e.g. a pool of 0 or an absurdly large value).
function parsePositiveInt(value: string | undefined, fallback: number, max: number): number {
if (!value) return fallback
const n = parseInt(value, 10)
if (!Number.isFinite(n) || n < 1) return fallback
return Math.min(n, max)
}

// Worker-process count for clustering. Returns null when unset/invalid so the
// entry point can fall back to its default of a single worker; an explicit
// value is clamped to [1, 64] (1 effectively disables clustering). The clamp
// prevents a misconfigured value from fork-bombing the host.
function parseWebConcurrency(): number | null {
const v = process.env.WEB_CONCURRENCY
if (!v) return null
const n = parseInt(v, 10)
if (!Number.isFinite(n) || n < 1) return null
return Math.min(n, 64)
}

export const config = {
port: parseInt(process.env.PORT || '3000', 10),
host: process.env.HOST || '0.0.0.0',
webConcurrency: parseWebConcurrency(),
exposeApiDocs: process.env.EXPOSE_API_DOCS === 'true',
trustProxy: parseTrustProxy(),
databaseUrl: process.env.DATABASE_SERVER_FULL || 'postgresql://parassessment:parassessment@localhost:5432/parassessment',
// Postgres connection pool, PER worker process. The RIG shared Postgres caps
// each project DB user at 20 connections total (see README), and a rolling
// deploy briefly runs two pods (old + surge), so the budget is:
// pods × WEB_CONCURRENCY × DB_POOL_MAX ≤ 20.
// Default 9 → 2 pods × 1 worker × 9 = 18, a tight margin under 20. The app is
// I/O-bound so one worker with this pool is plenty; the ceiling is the cap.
// Raise the pool / add workers/replicas only within that budget, or put a
// connection pooler (PgBouncer) in front.
db: {
max: parsePositiveInt(process.env.DB_POOL_MAX, 9, 20),
connectTimeout: parsePositiveInt(process.env.DB_CONNECT_TIMEOUT, 10, 300),
idleTimeout: parsePositiveInt(process.env.DB_IDLE_TIMEOUT, 30, 86400),
},
// Global request limit per IP per minute. Under clustering the in-memory store
// is per worker, so the entry point divides this across workers to keep the
// cluster-wide limit close to `max`.
rateLimit: {
max: parsePositiveInt(process.env.RATE_LIMIT_MAX, 300, 100000),
},
cors: {
origin: corsOrigin,
credentials: true,
6 changes: 5 additions & 1 deletion apps/boekhouding-backend/src/db/connection.ts
@@ -3,6 +3,10 @@ import postgres from 'postgres'
import { config } from '../config.js'
import * as schema from './schema.js'

-const queryClient = postgres(config.databaseUrl)
+const queryClient = postgres(config.databaseUrl, {
+  max: config.db.max,
+  connect_timeout: config.db.connectTimeout,
+  idle_timeout: config.db.idleTimeout,
+})

export const db = drizzle(queryClient, { schema })
33 changes: 27 additions & 6 deletions apps/boekhouding-backend/src/index.ts
@@ -1,11 +1,32 @@
+import cluster from 'node:cluster'
 import { buildApp } from './app.js'
 import { config } from './config.js'

-const app = await buildApp()
+// Single worker by default. The app is I/O-bound (low CPU), and the shared
+// Postgres caps this DB user at 20 connections total, so each extra worker
+// multiplies connection pressure (workers × DB_POOL_MAX). Opt into clustering
+// by setting WEB_CONCURRENCY > 1, and then lower DB_POOL_MAX so that
+// WEB_CONCURRENCY × DB_POOL_MAX stays within the budget (see README/config.ts).
+const workers = config.webConcurrency ?? 1

-try {
-  await app.listen({ port: config.port, host: config.host })
-} catch (err) {
-  app.log.error(err)
-  process.exit(1)
+if (workers > 1 && cluster.isPrimary) {
+  // Migrations already ran once before this process started (the container CMD
+  // runs `migrate && index`), so the primary only supervises workers.
+  for (let i = 0; i < workers; i++) cluster.fork()
+  cluster.on('exit', (worker, code, signal) => {
+    console.error(`worker ${worker.process.pid} exited (${signal || code}); starting a replacement`)
+    cluster.fork()
+  })
+} else {
+  // Each worker keeps its own in-memory rate-limit store, so divide the global
+  // limit across workers to keep the cluster-wide limit close to the configured value.
+  const rateLimitMax = Math.max(1, Math.ceil(config.rateLimit.max / workers))
+  const app = await buildApp({ rateLimitMax })
+
+  try {
+    await app.listen({ port: config.port, host: config.host })
+  } catch (err) {
+    app.log.error(err)
+    process.exit(1)
+  }
 }
19 changes: 18 additions & 1 deletion apps/boekhouding-backend/src/middleware/auth.ts
@@ -4,6 +4,7 @@ import { db } from '../db/connection.js'
import { users } from '../db/schema.js'
import { eq, and, isNull } from 'drizzle-orm'
import { config } from '../config.js'
import { userIdCache } from '../utils/userIdCache.js'

export interface AuthUser {
id: string
@@ -33,7 +34,7 @@ export async function requireAuth(request: FastifyRequest, reply: FastifyReply)

const token = authHeader.slice(7)

-  let payload: { sub?: string; email?: string; name?: string; preferred_username?: string; azp?: string }
+  let payload: { sub?: string; email?: string; name?: string; preferred_username?: string; azp?: string; exp?: number }
try {
const result = await jwtVerify(token, jwks, {
issuer: config.keycloak.issuer,
@@ -73,6 +74,18 @@ export async function requireAuth(request: FastifyRequest, reply: FastifyReply)

const displayName = payload.name || payload.preferred_username || payload.email

// Identity cache: the token is already fully validated above (signature,
// issuer, azp, exp), so a hit only skips the users-lookup — never validation.
// Authorization is still checked live downstream, so a cache hit cannot leak
// access. On a hit, email/displayName come from this request's token, so no
// personal data is kept in the cache itself.
const now = Date.now()
const cachedId = userIdCache.get(payload.sub, now)
if (cachedId !== undefined) {
request.user = { id: cachedId, email: payload.email, displayName }
return
}

// Find or create user by OIDC subject
let [user] = await db
.select({ id: users.id, email: users.email, displayName: users.displayName })
@@ -149,6 +162,10 @@ export async function requireAuth(request: FastifyRequest, reply: FastifyReply)
}
}

// Cache the resolved id. The TTL is bounded and never outlives the token, so
// a stale identity (or a removed user) can persist for at most the TTL.
userIdCache.set(payload.sub, user.id, payload.exp, now)

request.user = {
id: user.id,
email: user.email,
54 changes: 54 additions & 0 deletions apps/boekhouding-backend/src/utils/userIdCache.ts
@@ -0,0 +1,54 @@
// In-memory cache mapping an OIDC subject (`sub`) to our internal user id, so
// that authenticated polling clients don't trigger a users-lookup on every
// request. Security/privacy properties (see the PR scaling audit):
//
// - Stores ONLY the internal id — never email/displayName. The per-request
// token carries those, so no personal data lingers in process memory
// (data minimisation under the GDPR, Dutch: AVG).
// - Caches nothing about authorization. Project/assessment access is always
// checked live against the database, so revoking access takes effect at once.
// - An entry never outlives its token: the TTL is capped at `maxTtlMs` AND at
// the token's own remaining lifetime.
// - Bounded size with oldest-first eviction, so a flood of distinct subjects
// cannot exhaust memory (availability / DoS).
// - The cache only ever maps to an already-resolved id; on any uncertainty the
// caller falls back to the database lookup — it never grants access on its own.
export interface UserIdCache {
/** Returns the cached id for `oidcSub`, or undefined on a miss/expired entry. */
get(oidcSub: string, now: number): string | undefined
/** Caches `id` for `oidcSub`. `tokenExpSeconds` is the JWT `exp` claim (seconds). */
set(oidcSub: string, id: string, tokenExpSeconds: number | undefined, now: number): void
clear(): void
}

export function createUserIdCache(maxEntries = 10_000, maxTtlMs = 60_000): UserIdCache {
const store = new Map<string, { id: string; expiresAt: number }>()
return {
get(oidcSub, now) {
const entry = store.get(oidcSub)
if (!entry) return undefined
if (entry.expiresAt <= now) {
store.delete(oidcSub)
return undefined
}
return entry.id
},
set(oidcSub, id, tokenExpSeconds, now) {
let ttl = maxTtlMs
if (tokenExpSeconds !== undefined) {
ttl = Math.min(ttl, tokenExpSeconds * 1000 - now)
}
if (ttl <= 0) return
if (store.size >= maxEntries && !store.has(oidcSub)) {
// Map preserves insertion order, so the first key is the oldest.
store.delete(store.keys().next().value as string)
}
store.set(oidcSub, { id, expiresAt: now + ttl })
},
clear() {
store.clear()
},
}
}

export const userIdCache = createUserIdCache()
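
A usage sketch for the cache above, showing the token-bound TTL and the oldest-first eviction. It carries a condensed copy of `createUserIdCache` (without `clear`) so the block runs on its own; the subjects, ids, and clock values are made up for illustration.

```typescript
// Condensed copy of createUserIdCache from above, so the example is self-contained.
type Entry = { id: string; expiresAt: number }

function createUserIdCache(maxEntries = 10_000, maxTtlMs = 60_000) {
  const store = new Map<string, Entry>()
  return {
    get(sub: string, now: number): string | undefined {
      const entry = store.get(sub)
      if (!entry) return undefined
      if (entry.expiresAt <= now) {
        store.delete(sub) // expired: drop it and report a miss
        return undefined
      }
      return entry.id
    },
    set(sub: string, id: string, tokenExpSeconds: number | undefined, now: number): void {
      let ttl = maxTtlMs
      if (tokenExpSeconds !== undefined) ttl = Math.min(ttl, tokenExpSeconds * 1000 - now)
      if (ttl <= 0) return // token already expired: cache nothing
      if (store.size >= maxEntries && !store.has(sub)) {
        // Map preserves insertion order, so the first key is the oldest.
        const oldest = store.keys().next().value
        if (oldest !== undefined) store.delete(oldest)
      }
      store.set(sub, { id, expiresAt: now + ttl })
    },
  }
}

const cache = createUserIdCache(2, 60_000) // tiny capacity to make eviction visible
const now = 1_000_000 // made-up clock value in milliseconds

// The token expires 10 s after `now`, so the entry lives 10 s, not the 60 s maximum.
cache.set('sub-a', 'user-1', (now + 10_000) / 1000, now)
console.log(cache.get('sub-a', now + 9_999)) // 'user-1'
console.log(cache.get('sub-a', now + 10_000)) // undefined (expired and removed)

// Capacity is 2: inserting a third subject evicts the oldest one.
cache.set('sub-b', 'user-2', undefined, now)
cache.set('sub-c', 'user-3', undefined, now)
cache.set('sub-d', 'user-4', undefined, now)
console.log(cache.get('sub-b', now)) // undefined (evicted)
console.log(cache.get('sub-d', now)) // 'user-4'
```

Passing `now` in explicitly, as the original interface does, keeps the cache deterministic and easy to test without mocking the clock.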
12 changes: 12 additions & 0 deletions apps/boekhouding-backend/test/cov/app.cov.test.ts
@@ -61,6 +61,18 @@ describe('buildApp — options handling', () => {
await defaultApp.close()
}
})

it('applies the rateLimitMax option (per-worker limit) instead of the config default', async () => {
const limitedApp = await buildApp({ logger: false, rateLimitMax: 1 })
await limitedApp.ready()
try {
expect((await limitedApp.inject({ method: 'GET', url: '/api/health' })).statusCode).toBe(200)
// A second request from the same client exceeds the per-instance limit of 1.
expect((await limitedApp.inject({ method: 'GET', url: '/api/health' })).statusCode).toBe(429)
} finally {
await limitedApp.close()
}
})
})

describe('API_VERSION constant', () => {