
feat(audit): instrument user actions (#3191) #3193

Merged
Viczei merged 13 commits into alpha from split/feat/3174-user-actions-log/02-instrumentation
Apr 9, 2026

@Viczei Viczei commented Apr 8, 2026

Summary

Wires the audit logging infrastructure delivered by #3192 (#3190) into every existing action point. Once this PR is merged, user actions start being written to the database.

⚠️ Do not merge before #3192: this PR targets the branch split/feat/3174-user-actions-log/01-infrastructure. Once #3192 is merged into alpha, GitHub will automatically retarget this PR to alpha.

tRPC pipeline

  • Plug the auditMiddleware middleware onto publicProcedure and protectedProcedure → every mutation is logged automatically
  • Sensitive reads are logged explicitly via a path-based mapping: profile.get, declaration.getOrCreate (returns GIP MDS data)
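
The decision described in the two bullets above can be sketched as a pure function. This is a minimal illustration, not the actual `auditMiddleware`: the function name, the `Map`, and the action keys on the right-hand side (`profile.read`, `declaration.read`) are hypothetical, and the assumption that a mutation is logged under its own tRPC path is ours.

```typescript
// Hypothetical names throughout; the real middleware is not shown in this PR page.
type ProcedureType = "query" | "mutation" | "subscription";

// Queries are only logged when their tRPC path appears in this static map.
// The action keys on the right are illustrative placeholders.
const SENSITIVE_QUERY_ACTIONS = new Map<string, string>([
  ["profile.get", "profile.read"],
  ["declaration.getOrCreate", "declaration.read"],
]);

/** Returns the audit action key for a call, or null when nothing is logged. */
function resolveAuditAction(type: ProcedureType, path: string): string | null {
  // Assumption: every mutation is logged under its own path.
  if (type === "mutation") return path;
  // Queries are opt-in: only the explicitly mapped sensitive reads are logged.
  if (type === "query") return SENSITIVE_QUERY_ACTIONS.get(path) ?? null;
  return null;
}
```

Keeping the map static means adding a sensitive read is a one-line change that is easy to spot in review.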

Auth (NextAuth)

  • signIn event → auth.login (success)
  • logger.error → auth.login_failed with an OAuth-safe whitelist (avoids leaking tokens / state / code_verifier)
  • Custom /api/auth/logout → captures the identity before the cookie is invalidated
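
A whitelist of the kind mentioned for `auth.login_failed` might look like the sketch below. The list of allowed keys and the function name are assumptions made for illustration; the PR does not show the real whitelist. `error` and `error_description` are the standard OAuth 2.0 error response fields.

```typescript
// Hypothetical whitelist: only these keys may reach the audit metadata.
const SAFE_ERROR_KEYS = ["name", "error", "error_description", "provider"] as const;

/**
 * Copies only whitelisted string fields from an error payload.
 * Anything else (tokens, state, code_verifier, ...) is dropped by construction,
 * so a new secret-bearing field cannot leak just because nobody blacklisted it.
 */
function pickSafeErrorMetadata(raw: Record<string, unknown>): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const key of SAFE_ERROR_KEYS) {
    const value = raw[key];
    // Non-string values are ignored; long strings are truncated to a fixed size.
    if (typeof value === "string") safe[key] = value.slice(0, 500);
  }
  return safe;
}
```

The design choice is allow-listing rather than deny-listing: unknown fields are excluded by default.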

9 routes wrapped with withAuditedRoute

  • Sensitive reads (180-day retention): declaration-pdf, transmitted-pdf, no-sanction-pdf
  • Mutations (365-day retention): upload
  • Exports / SUIT API (365-day retention): export/download, export/generate, v1/export/declarations, v1/files
  • System (365-day retention): gip-mds/import
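
To make the wrapper idea concrete, here is a minimal, framework-free sketch of what a `withAuditedRoute`-style helper could do. The option names, the injected `write` function, and the choice to log after the handler has run are all assumptions; only the `resolveContext` name and the four categories come from this PR.

```typescript
type AuditEntry = { action: string; category: string; userId: string | null; status: number };
type Handler<Req, Res extends { status: number }> = (request: Req) => Promise<Res>;

interface AuditedRouteOptions<Req> {
  action: string;
  category: "read_sensitive" | "mutation" | "export" | "system";
  // Resolves who is calling; the real routes derive this from the session.
  resolveContext: (request: Req) => Promise<{ userId: string | null }>;
  // Injected writer (hypothetical) so the sketch stays self-contained.
  write: (entry: AuditEntry) => Promise<void>;
}

function withAuditedRoute<Req, Res extends { status: number }>(
  options: AuditedRouteOptions<Req>,
  handler: Handler<Req, Res>,
): Handler<Req, Res> {
  return async (request) => {
    const { userId } = await options.resolveContext(request);
    const response = await handler(request);
    try {
      await options.write({
        action: options.action,
        category: options.category,
        userId,
        status: response.status,
      });
    } catch {
      // An audit failure must never change the response sent to the client.
    }
    return response;
  };
}
```

A route would then export the wrapped handler instead of the bare one, so the audit call cannot be forgotten inside the handler body.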

Security

  • /api/v1/files: validates the siren query param via parseSiren before writing to the database (prevents injecting an arbitrary varchar into the siren column)
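
As an illustration of this kind of validation, the sketch below accepts only a nine-digit string that passes the Luhn checksum used by SIREN numbers. The real `parseSiren` is not shown in this PR, so the name `parseSirenSketch`, its signature, and its exact rules are assumptions.

```typescript
/** Returns the normalised 9-digit SIREN, or null when the input is not valid. */
function parseSirenSketch(input: string | null | undefined): string | null {
  if (typeof input !== "string") return null;
  const candidate = input.trim();
  // Reject anything that is not exactly nine ASCII digits.
  if (!/^\d{9}$/.test(candidate)) return null;

  // Luhn checksum: double every second digit, counting from the right.
  let sum = 0;
  for (let i = 0; i < candidate.length; i++) {
    let digit = candidate.charCodeAt(candidate.length - 1 - i) - 48;
    if (i % 2 === 1) {
      digit *= 2;
      if (digit > 9) digit -= 9;
    }
    sum += digit;
  }
  return sum % 10 === 0 ? candidate : null;
}
```

Only the returned value would be stored, so arbitrary query-string content never reaches the `siren` column.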

Closes #3191
Related to #3174 (parent) and #3192 (infrastructure, to be merged first).

Quality gates

  • Typecheck: 0 errors
  • Tests: 1107 / 1107 passing
  • Lint / Format
  • Structural / RGAA / Security audit (HIGH + 3 MEDIUM fixed across the 2 PRs)

Test plan (on the review environment, after #3192 is merged)

  • Log in via ProConnect (test@fia1.fr) and check that an auth.login row is written
  • Submit a declaration and check that a declaration.submit row is written with the right metadata
  • Download a summary PDF and check that a pdf.declaration_download row is written (category read_sensitive)
  • Log out and check that an auth.logout row is written
  • Test a failed login (bad ProConnect token) and check auth.login_failed
  • Run SELECT category, action, count(*) FROM audit.action_log GROUP BY category, action to validate the categories

🤖 Generated with Claude Code

Set up the foundation for user-action audit logging without
instrumenting any existing code path. The audit table is created on
deploy but stays empty until the instrumentation PR (#3191) lands.

- New PostgreSQL schema `audit` with `audit.action_log` table (Drizzle
  schema + idempotent migration); decoupled from `app.users` so GDPR
  user deletion does not block audit retention.
- New isomorphic module `~/modules/audit` (action keys, retention
  constants, types) — single source of truth for category mapping.
- Server-side helpers in `~/server/audit`:
  * `logAction` — fail-safe writer (swallows DB errors so business
    logic is never blocked)
  * `requestContext` — IP / user-agent extraction
  * `trpcMiddleware` — generic middleware with recursive metadata
    sanitization (strips token/password/secret/apikey at any depth)
  * `withAuditedRoute` — Next.js Route Handler wrapper
  * `cleanup` — RGPD purge with two CNIL-compliant retention buckets
    (180d for read_sensitive, 365d for security logs)
- New endpoint `POST /api/audit/cleanup` (fail-closed when token unset)
- New K8s CronJob (`audit-cleanup-daily`, runs at 4am UTC)
- Env vars: EGAPRO_AUDIT_CLEANUP_TOKEN, EGAPRO_AUDIT_RETENTION_SHORT_DAYS
  (180), EGAPRO_AUDIT_RETENTION_LONG_DAYS (365)

Closes #3190
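
The "fail-safe writer" behaviour described for `logAction` can be sketched as follows. The factory shape, the injected `insertRow` function, and the boolean return value are assumptions made so the example is self-contained; the real helper writes to `audit.action_log` through the project's database layer.

```typescript
type ActionLogRow = { action: string; userId: string | null; metadata?: Record<string, unknown> };
type InsertRow = (row: ActionLogRow) => Promise<void>;

/**
 * Builds a logAction that never throws: a database error is reported
 * through onError and swallowed, so business logic is never blocked.
 */
function createLogAction(insertRow: InsertRow, onError: (error: unknown) => void = () => {}) {
  return async function logAction(row: ActionLogRow): Promise<boolean> {
    try {
      await insertRow(row);
      return true;
    } catch (error) {
      onError(error); // report, but never rethrow
      return false;
    }
  };
}
```

Because the returned promise always resolves, callers can safely `await` it without wrapping each call in its own `try`/`catch`.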
@Viczei Viczei requested a review from a team as a code owner April 8, 2026 16:10
@revu-bot revu-bot Bot requested a review from revu-bot April 8, 2026 16:10

@revu-bot revu-bot left a comment


Audit Instrumentation PR Review

This PR wires the audit logging infrastructure (from #3192) to all user action points. The overall approach is sound — withAuditedRoute wrapper, tRPC middleware, and explicit auth event hooks are well-structured. A few issues need attention before merging.

| File | Lines | Severity | Issue |
| --- | --- | --- | --- |
| auth/logout/route.ts | 30–43 | CRITICAL | Logout audit fires even when token is null (unauthenticated request) |
| auth/logout/route.ts | 30–43 | IMPORTANT | void logAction(...) is fire-and-forget before session termination; logout can complete before the audit write |
| declaration-pdf/route.ts | 11–27 | IMPORTANT | auth() called twice per request (once in resolveContext, once in handler) |
| auth/config.ts | 270–293 | IMPORTANT | logger.error is async but NextAuth's logger interface may not await it |
| no-sanction-pdf/route.ts | 14–22 | MINOR | resolveContext ignores its _request parameter; inconsistent with other routes |

Critical Issues

  1. Unauthenticated logout audit: The logout route logs auth.logout even when token is null (no active session). This creates spurious audit entries for unauthenticated GET requests to /api/auth/logout.
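
One way to address this finding is a simple guard in front of the audit call. The sketch below is illustrative only: the `Token` shape, the function name, and the injected `logAction` parameter are hypothetical and stand in for whatever the logout route actually uses.

```typescript
type Token = { sub: string; email?: string } | null;

/**
 * Writes an auth.logout entry only when a session token is present.
 * Returns true when an entry was written, false when the request was anonymous.
 */
async function auditLogout(
  token: Token,
  logAction: (entry: { action: "auth.logout"; userId: string }) => Promise<void>,
): Promise<boolean> {
  if (token === null) return false; // no active session: nothing to audit
  // Awaited so the write settles before the caller goes on to end the session.
  await logAction({ action: "auth.logout", userId: token.sub });
  return true;
}
```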

Important Issues

  1. Double auth() calls: declaration-pdf, transmitted-pdf, and upload routes call auth() in both resolveContext and the handler body. This doubles the session resolution overhead per request.
  2. Async logger concern: NextAuth's logger.error signature is synchronous in v4 — making it async may cause the promise to be silently dropped, meaning safeRequestContext() (which awaits nextHeaders()) might not resolve correctly.
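
The async-logger concern can be illustrated with a synchronous callback that starts its async work in a detached, self-invoking function. This is a sketch under assumptions: the `(code, metadata)` callback shape mirrors how the review describes NextAuth v4's `logger.error`, and `record` and `onFailure` are hypothetical parameters.

```typescript
type ErrorLogger = (code: string, metadata: unknown) => void;

/**
 * Wraps an async recorder in a synchronous logger callback.
 * The caller gets `undefined` back immediately; the async work runs detached
 * and its failures are routed to onFailure instead of becoming unhandled rejections.
 */
function makeSyncErrorLogger(
  record: (code: string) => Promise<void>,
  onFailure: (error: unknown) => void = () => {},
): ErrorLogger {
  return (code) => {
    void (async () => {
      try {
        await record(code);
      } catch (error) {
        onFailure(error);
      }
    })();
  };
}
```

The inner `try`/`catch` matters: without it, a failed audit write inside the detached function would surface as an unhandled promise rejection.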

Overall Quality

The security-conscious design (whitelist-based error sanitization, parseSiren validation before DB writes, capturing identity before session invalidation) is solid. Test coverage for the no-sanction-pdf route is updated correctly. The main gaps are the null-token guard on logout and the redundant auth() calls.

@Viczei Viczei marked this pull request as draft April 8, 2026 16:15
@Viczei Viczei self-assigned this Apr 8, 2026
@Viczei Viczei changed the base branch from split/feat/3174-user-actions-log/01-infrastructure to alpha April 8, 2026 18:44
@Viczei Viczei temporarily deployed to build-review-auto April 9, 2026 09:04 — with GitHub Actions Inactive
Make the cleanup bearer token mandatory instead of optional, so a
missing secret crashes the app at boot rather than silently exposing
the destructive `/api/audit/cleanup` endpoint. Validation requires at
least 32 characters (same bar as EGAPRO_SUIT_API_KEY).

The runtime `if (!token)` fail-closed check in the route handler
becomes dead code once env.js enforces presence — drop it to keep the
handler focused on the bearer comparison.

Every environment (dev / preprod / prod) must now have the secret
sealed via kubeseal before deploy.
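
The boot-time check can be pictured with the dependency-free stand-in below. The project performs this validation with Zod in `env.js`; the function here is a hypothetical equivalent written without Zod so that it runs on its own.

```typescript
const MIN_TOKEN_LENGTH = 32;

/**
 * Throws at startup when the cleanup token is missing or too short,
 * instead of letting the app boot with an unprotected endpoint.
 */
function requireCleanupToken(env: Record<string, string | undefined>): string {
  const token = env["EGAPRO_AUDIT_CLEANUP_TOKEN"];
  if (typeof token !== "string" || token.length < MIN_TOKEN_LENGTH) {
    throw new Error(
      `EGAPRO_AUDIT_CLEANUP_TOKEN must be set and at least ${MIN_TOKEN_LENGTH} characters long`,
    );
  }
  return token;
}
```

Once presence is guaranteed at boot, the route handler only has to compare the bearer value, which is why the runtime `if (!token)` branch becomes unreachable.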
Seal the audit cleanup bearer token per environment (dev / preprod /
prod) so the app can start after the fix that made the var mandatory.
Without these, the Zod validation in env.js would crash the pod at
boot on every environment.
Five fixes from the revu-bot review on #3192:

- **cleanup.ts** — wrap the two DELETEs in a single transaction so a
  failure in the second statement rolls back the first, avoiding a
  silent partial purge (CRITICAL).
- **cleanup.ts** — explicitly coerce `count` via `Number()` because
  postgres-js v3 returns it as a string; the previous code produced
  string concatenation when summing short + long counts (IMPORTANT).
- **trpcMiddleware.ts** — extend `SENSITIVE_KEYS` with `refresh_token`,
  `client_secret`, `accesskey`, `access_key`, `private_key` so the
  recursive sanitizer catches every common credential shape before it
  hits the audit jsonb column (IMPORTANT).
- **audit-cleanup-cron.yaml** — replace `curl --fail` (which loses the
  HTTP status) with explicit `%{http_code}` capture + clear stderr
  logging, and drop `backoffLimit` from 3 to 0 so a single failure no
  longer retries and generates alert noise (the daily schedule is
  enough retry granularity for this job) (IMPORTANT).
- **requestContext.ts** — trim `x-real-ip` and `user-agent` for
  consistency with the `x-forwarded-for` branch (MINOR).
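
A recursive sanitizer along the lines described for `trpcMiddleware.ts` could look like this. The key list is taken from the commit message; the exact, case-insensitive key matching and the choice to drop the key entirely are assumptions, and the sketch expects plain JSON-like input without cycles.

```typescript
const SENSITIVE_KEYS = new Set([
  "token", "password", "secret", "apikey",
  "refresh_token", "client_secret", "accesskey", "access_key", "private_key",
]);

/** Returns a deep copy of `value` with sensitive keys removed at any depth. */
function sanitizeMetadata(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(sanitizeMetadata);
  if (value !== null && typeof value === "object") {
    const clean: Record<string, unknown> = {};
    for (const [key, nested] of Object.entries(value)) {
      // Assumption: keys are compared case-insensitively, so "apiKey" matches "apikey".
      if (SENSITIVE_KEYS.has(key.toLowerCase())) continue;
      clean[key] = sanitizeMetadata(nested);
    }
    return clean;
  }
  return value; // primitives pass through unchanged
}
```

With exact matching, every credential spelling has to be listed explicitly, which is consistent with the commit adding `refresh_token` and `client_secret` even though `token` and `secret` were already present.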
Viczei added a commit that referenced this pull request Apr 9, 2026
Four fixes from the revu-bot review on #3193:

- **auth/logout/route.ts** — gate the `auth.logout` audit behind a
  token presence check so unauthenticated GETs no longer produce
  spurious rows. Also switch the `void logAction(...)` to `await` so
  the write is guaranteed to be persisted before the redirect is
  issued (logAction is fail-safe internally, a DB outage still won't
  block logout) (CRITICAL + IMPORTANT).
- **auth/config.ts** — convert the `logger.error` callback from
  `async` to synchronous + detached IIFE. NextAuth v4's logger
  interface is sync; returning a promise risks the side-effect being
  silently dropped in serverless or cold-start contexts (IMPORTANT).
- **audit/cachedAuth.ts (new)** — per-request memoisation of `auth()`
  via `WeakMap<Request, Promise<Session>>`, so routes that need the
  session in both `resolveContext` and the handler body only parse
  the JWT once. Applied to declaration-pdf, transmitted-pdf,
  no-sanction-pdf, and upload routes (IMPORTANT).

Adds:
- Two new logout-route tests (no audit on null token, audit with
  identity on valid session).
- Full cachedAuth unit coverage (single-call dedup, multi-request
  isolation, concurrent in-flight promise sharing).
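
The per-request memoisation can be sketched as below. It is a simplified illustration: the factory form, the `Session` shape, and passing the request into `auth` are assumptions made to keep the example self-contained.

```typescript
type Session = { userId: string } | null;

/**
 * Memoises an auth resolver per request object. The WeakMap stores the
 * in-flight promise, so concurrent callers share one resolution, and the
 * entry can be garbage-collected together with the request.
 */
function createCachedAuth<Req extends object>(auth: (request: Req) => Promise<Session>) {
  const cache = new WeakMap<Req, Promise<Session>>();
  return function cachedAuth(request: Req): Promise<Session> {
    let pending = cache.get(request);
    if (pending === undefined) {
      pending = auth(request);
      cache.set(request, pending);
    }
    return pending;
  };
}
```

Caching the promise rather than the resolved value is what makes concurrent calls share a single resolution. In this sketch a rejected promise also stays cached for that request, which may or may not be what the real helper does.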
@Viczei Viczei temporarily deployed to build-review-auto April 9, 2026 09:38 — with GitHub Actions Inactive
@Viczei Viczei temporarily deployed to build-review-auto April 9, 2026 10:30 — with GitHub Actions Inactive
The two DELETEs used a raw `sql` template tag to interpolate the
Date thresholds:

    sql`... AND ${actionLogs.createdAt} < ${shortThreshold}`

postgres-js does not convert a Date value to a bind parameter string
inside a raw template tag (it only does the coercion for columns
referenced through the drizzle schema), so the cleanup endpoint
crashed at runtime with:

    TypeError: The "string" argument must be of type string or an
    instance of Buffer or ArrayBuffer. Received an instance of Date

The bug was masked by the unit tests because they mocked
`db.transaction` / `db.delete`, so the raw SQL was never actually
executed. It was caught by a manual Playwright+curl end-to-end run
against a review namespace.

Switch to drizzle's typed predicates (`and` / `eq` / `lt` / `ne`)
which correctly bind the Date params. Same semantics, no more
TypeError at runtime.
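
The semantics that the two DELETEs are meant to preserve can be modelled in memory, without a database. This is not the drizzle code: the `Row` shape and the `isExpired` helper are hypothetical, and the 180 / 365 day defaults come from the retention values quoted in this PR.

```typescript
type Row = { category: string; createdAt: Date };

const DAY_MS = 24 * 60 * 60 * 1000;

/**
 * Mirrors the two purge predicates:
 *   category =  read_sensitive AND created_at < now - shortDays
 *   category <> read_sensitive AND created_at < now - longDays
 */
function isExpired(row: Row, now: Date, shortDays = 180, longDays = 365): boolean {
  const days = row.category === "read_sensitive" ? shortDays : longDays;
  const threshold = new Date(now.getTime() - days * DAY_MS);
  return row.createdAt.getTime() < threshold.getTime();
}
```

A model like this is handy for reasoning about the buckets, but as the commit above shows, only a test against a real database exercises the parameter binding.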
@Viczei Viczei temporarily deployed to build-review-auto April 9, 2026 11:19 — with GitHub Actions Inactive
Adds a Vitest integration layer that runs against a real Postgres
container started via testcontainers. Only `cleanupAuditLogs` is
covered for now — it is the first function that benefited from this
type of test after a `Date` → raw `sql` template bug slipped past the unit
tests and crashed on the review app.

Infrastructure:
- `vitest.integration.config.ts` — separate config, `environment: node`,
  `include: "*.integration.test.ts"`, larger timeouts, serial file
  execution, no `~/env` mock.
- `src/test/integration-setup.ts` — `globalSetup` that boots a Postgres
  16 container, runs the real Drizzle migrations against it, and
  exposes the connection URI via `process.env.DATABASE_URL` before any
  test file imports `~/server/db`.
- `src/test/integration-per-file-setup.ts` — minimal per-file setup
  that only mocks `server-only` (the Next.js lint package). Everything
  else (db, env, audit helpers) loads for real.
- `vitest.config.ts` — excludes `*.integration.test.ts` from the
  standard unit run so `pnpm test` stays fast and Docker-free.
- `package.json` — new `test:integration` script.
- `.claude/hooks/block-bad-patterns.sh` — allow `process.env` in
  `integration-setup.ts` (same exception as `env.js`, `drizzle.config`,
  `global-setup.ts`, etc.) since we have to mutate it before `~/env`
  loads.

Test coverage (`src/server/audit/__tests__/cleanup.integration.test.ts`):
- empty table → noop
- short retention purge (read_sensitive only)
- long retention purge (non-read_sensitive only)
- mixed retention buckets in a single run
- regression guard: the exact `Date` bind param path that crashed the
  previous revision with `TypeError: The "string" argument must be of
  type string … Received an instance of Date`. Temporarily reverting
  `cleanupAuditLogs` to the raw `sql` template makes all 5 tests
  fail with that exact TypeError — regression locked in.

Cost: container boot + migrations ≈ 5 s, tests ≈ 0.1 s, total ≈ 6 s.
Runs only via `pnpm test:integration`, requires Docker.
@Viczei Viczei temporarily deployed to build-review-auto April 9, 2026 12:18 — with GitHub Actions Inactive
Viczei added 2 commits April 9, 2026 14:19
The audit cleanup token is now required at startup (`z.string().min(32)`
in env.js) after the #3190 fix. The e2e workflow runs a real
`pnpm build` without `SKIP_ENV_VALIDATION`, so the build crashed
on both PRs with:

    ❌ Invalid environment variables: [
      { path: [ 'EGAPRO_AUDIT_CLEANUP_TOKEN' ],
        message: 'Invalid input: expected string, received undefined' }
    ]

The e2e tests do not exercise `/api/audit/cleanup` (it is only called
by the K8s CronJob in real deployments), so a hard-coded dummy value
that satisfies the 32-char minimum is sufficient. Using a dummy string
avoids the friction of adding a new GitHub secret just to unblock CI.

`ci.yaml` is unaffected: it builds with `SKIP_ENV_VALIDATION: "1"`
which bypasses all env validation.
@Viczei Viczei marked this pull request as ready for review April 9, 2026 12:40
@Viczei Viczei requested a review from a team as a code owner April 9, 2026 12:40

@revu-bot revu-bot left a comment


⚠️ PR Review Skipped

1 validation issue found. Review thresholds can be adjusted in .revu.yml.

See why it was skipped and detailed metrics

Issues Found

1. This PR changes 52 files, which exceeds the limit of 25 files.

Suggestion: Consider breaking this PR into smaller, more focused changes. Large PRs are harder to review effectively and may contain unrelated changes.

PR Metrics

  • Total files changed: 52
  • Reviewable files: 52
  • Diff size: 4328 lines
  • Documentation files: 0
  • Largest file change: 1196 lines
  • Addition/Deletion ratio: 30.16

This validation helps ensure the bot focuses on PRs where automated review provides the most value.

@Viczei Viczei temporarily deployed to build-review-auto April 9, 2026 12:41 — with GitHub Actions Inactive
@Viczei Viczei temporarily deployed to build-review-auto April 9, 2026 13:02 — with GitHub Actions Inactive
Every new tRPC mutation, sensitive query, Route Handler, auth event or
cron-triggered action must now ship with its matching audit log entry.
Skipping the log is a compliance bug, not a missing enhancement —
document the three wire-up points so future PRs (human or agent) don't
forget them.

- **New rule** `.claude/rules/audit-logging.md`: full playbook scoped
  via frontmatter paths (`routers/**`, `app/api/**/route.ts`, auth and
  audit modules) so it auto-loads when touching an audited surface.
  Covers the 5 surface types (mutation / sensitive query / Route
  Handler / auth event / system), the 3-step wire-up
  (`AUDIT_ACTIONS` / `AUDIT_ACTION_CATEGORIES` / path map), metadata
  sanitisation rules, category→retention mapping, test coverage
  expectations, and a PR checklist.

- `.claude/rules/automation.md`: new **Audit logging** bullet list in
  the "While writing — inline rules" section with a pointer to the
  full rule file.

- `packages/app/CLAUDE.md`: new **Audit logging (issue #3174)**
  section between "Forms" and "File size" that summarises the
  covered surfaces, the 3 wire-up points, the sanitisation contract,
  and the integration-test requirement for DB-layer changes in
  `cleanup.ts`.
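
The three wire-up points named in the rule can be pictured as follows. Only the constant names `AUDIT_ACTIONS` and `AUDIT_ACTION_CATEGORIES` and the action strings come from this PR; the object shapes, the `auth` category, the `TRPC_PATH_TO_ACTION` name, and `retentionDaysFor` are assumptions made for illustration.

```typescript
// 1. Declare the action key.
const AUDIT_ACTIONS = {
  AUTH_LOGIN: "auth.login",
  DECLARATION_SUBMIT: "declaration.submit",
  PDF_DECLARATION_DOWNLOAD: "pdf.declaration_download",
} as const;

type AuditAction = (typeof AUDIT_ACTIONS)[keyof typeof AUDIT_ACTIONS];
type AuditCategory = "auth" | "mutation" | "read_sensitive"; // "auth" is a hypothetical category name

// 2. Map every action to a category. Typing this as Record<AuditAction, ...>
//    makes the compiler reject an action that was added without a category.
const AUDIT_ACTION_CATEGORIES: Record<AuditAction, AuditCategory> = {
  "auth.login": "auth",
  "declaration.submit": "mutation",
  "pdf.declaration_download": "read_sensitive",
};

// 3. Map the tRPC path (or route) to its action key.
const TRPC_PATH_TO_ACTION: Readonly<Record<string, AuditAction>> = {
  "declaration.submit": AUDIT_ACTIONS.DECLARATION_SUBMIT,
};

/** Category drives retention: 180 days for read_sensitive, 365 days otherwise. */
function retentionDaysFor(action: AuditAction): number {
  return AUDIT_ACTION_CATEGORIES[action] === "read_sensitive" ? 180 : 365;
}
```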

Docs only, no code change.
@LucasCharrier LucasCharrier changed the base branch from alpha to split/feat/3174-user-actions-log/01-infrastructure April 9, 2026 14:07
@LucasCharrier LucasCharrier changed the base branch from split/feat/3174-user-actions-log/01-infrastructure to alpha April 9, 2026 14:07
@Viczei Viczei changed the base branch from alpha to split/feat/3174-user-actions-log/01-infrastructure April 9, 2026 14:33
@Viczei Viczei force-pushed the split/feat/3174-user-actions-log/02-instrumentation branch from b1ac75a to 4d1c731 Compare April 9, 2026 14:35
@Viczei Viczei temporarily deployed to build-review-auto April 9, 2026 14:35 — with GitHub Actions Inactive
Viczei added 4 commits April 9, 2026 16:36
Wire the audit logging infrastructure (#3190) into every existing
action point. After this lands, user actions start being written to
`audit.action_log`.

- tRPC: plug `auditMiddleware` on `publicProcedure` and
  `protectedProcedure` — every mutation is logged automatically; a
  static path-based map adds explicit sensitive-query coverage
  (`profile.get`, `declaration.getOrCreate`).
- NextAuth `events.signIn` → `auth.login` (success).
- NextAuth `logger.error` → `auth.login_failed` with an OAuth-safe
  whitelist (avoids leaking tokens / state / code_verifier into the
  audit table).
- Custom `/api/auth/logout` route → captures user identity *before*
  invalidating the session cookie.
- 9 Next.js Route Handlers wrapped with `withAuditedRoute`:
  * read_sensitive (180d retention): declaration-pdf, transmitted-pdf,
    no-sanction-pdf
  * mutation: upload
  * export: export/download, export/generate, v1/export/declarations,
    v1/files
  * system: gip-mds/import
- `/api/v1/files`: validate the `siren` query param via `parseSiren`
  before storing it (defends the audit `siren` column from arbitrary
  varchar injection).

Closes #3191
Four fixes from the revu-bot review on #3193:

- **auth/logout/route.ts** — gate the `auth.logout` audit behind a
  token presence check so unauthenticated GETs no longer produce
  spurious rows. Also switch the `void logAction(...)` to `await` so
  the write is guaranteed to be persisted before the redirect is
  issued (logAction is fail-safe internally, a DB outage still won't
  block logout) (CRITICAL + IMPORTANT).
- **auth/config.ts** — convert the `logger.error` callback from
  `async` to synchronous + detached IIFE. NextAuth v4's logger
  interface is sync; returning a promise risks the side-effect being
  silently dropped in serverless or cold-start contexts (IMPORTANT).
- **audit/cachedAuth.ts (new)** — per-request memoisation of `auth()`
  via `WeakMap<Request, Promise<Session>>`, so routes that need the
  session in both `resolveContext` and the handler body only parse
  the JWT once. Applied to declaration-pdf, transmitted-pdf,
  no-sanction-pdf, and upload routes (IMPORTANT).

Adds:
- Two new logout-route tests (no audit on null token, audit with
  identity on valid session).
- Full cachedAuth unit coverage (single-call dedup, multi-request
  isolation, concurrent in-flight promise sharing).
…o split/feat/3174-user-actions-log/02-instrumentation
@Viczei Viczei temporarily deployed to build-review-auto April 9, 2026 16:22 — with GitHub Actions Inactive
tokenbureau Bot commented Apr 9, 2026

Base automatically changed from split/feat/3174-user-actions-log/01-infrastructure to alpha April 9, 2026 16:47
@Viczei Viczei merged commit 466d4f1 into alpha Apr 9, 2026
15 checks passed
@Viczei Viczei deleted the split/feat/3174-user-actions-log/02-instrumentation branch April 9, 2026 16:47
Development

Successfully merging this pull request may close these issues.

V2 - Action log: instrumentation of user actions

3 participants