Working notes for Claude Code sessions in this repo. Read this first.
Claude must not create tiny incremental pull requests.
Default behavior:
- One Claude build session = one branch = one pull request.
- Batch related fixes together.
- Do not create follow-up PRs for small fixes.
- Update the existing PR instead.
- Leave unrelated or risky work as deferred notes.
- Final merge is always left to Freddie.
Future Horizons ASPT — a procurement and capability intelligence platform for UK higher-ed customers, built around UCISA HERM v3.1 (165 capabilities, 21 systems, 5-language i18n) plus the proprietary FHE Capability Framework.
Surface includes: capability scoring, baskets, sector analytics, vendor portal, multi-jurisdiction procurement, Stripe billing, AI assistant, SAML + OIDC SSO with multi-IdP per tenant, SCIM 2.0 provisioning, soft-delete cascade with retention scheduler, and a tiered tenancy model (free / professional / enterprise).
The product is feature-complete for v1 as of Phase 11. Phase 12 is the
operational-maturity phase (see progress/phase-12-kickoff.md).
Monorepo via npm workspaces (Node 20 baseline, ESM-only):
| Workspace | Path | Stack |
|---|---|---|
| Server | server/ |
TypeScript + Express + Prisma + Postgres + Vitest. Optional Redis. |
| Client | client/ |
TypeScript + React 18 + Vite + React Router + Tailwind + Vitest. |
| Schema | prisma/ |
Single schema.prisma + numbered migrations under prisma/migrations/. |
- Auth: JWT sessions (
jsonwebtoken); per-tenant SAML via@node-saml/node-saml; per-tenant OIDC viaopenid-clientv6 (ESM); API keys (herm_pk_…prefix); SCIM 2.0 with API-key +admin:scimpermission. - Persistence: Prisma 5.22 + Postgres 16. In-memory
Map+ Redis fallback for short-lived state (lockout counters, OIDC PKCE flow, SAML SLO replay, session JTI store). - Observability: structured JSON logs with
req.idcorrelation. - CI: GitHub Actions — lint, typecheck, test (with Postgres + Redis service containers), build, prisma-validate.
- Pre-existing review bots: Cursor Bugbot + GitHub Copilot review every PR. Treat their findings as actionable; Bugbot is the higher-signal of the two for behaviour bugs.
Pattern: claude/p<phase>-<n>-<topic>. Examples:
claude/p11-pr5-softdelete-cache-audit— Phase 11 audit follow-up #5claude/p12-2-metrics-endpoint— Phase 12 sub-phase 2claude/add-claude-md— single-purpose docs PR, no phase prefix needed
Always branch off master. Do not branch off another open PR's branch unless the work is genuinely a follow-up to an unmerged commit (rare).
Conventional Commits (feat, fix, chore, refactor, ci, docs).
Always end with the Claude session URL trailer:
feat(observability): Prometheus /metrics endpoint + auth/SSO counters (Phase 12.2)
[multi-paragraph body explaining why]
https://claude.ai/code/session_<id>
Long bodies are encouraged for non-trivial changes — they become the diff
description on github.com and the source of truth for why. Match the style
of recent merged commits in git log.
Sections in this order: Summary → What changed (subsections per
file/concern) → Test plan (checkbox list, with checked items reflecting
local-verify proof) → Out of scope / deferred. Always link to relevant
ADRs / RUNBOOK.md sections for invariants the change interacts with.
Open PRs as draft when they need triage discussion; ready-for-review only
when the diff is final. The mcp__github__create_pull_request parameter is
draft: true (boolean — passing "true" as a string fails).
npm run verify # lint + typecheck + test:ci + build (~60s)Must be green locally. Do not push if verify fails — CI will reject and you
burn one of the 5-PR-batch slots in the user's review queue.
- A "PR" is one logical change. Every commit on the branch should be in service of the PR's stated goal.
- Review-comment fix-ups go on the same branch as a follow-up commit, not a new PR. Push the commit, reply on the thread, let CI re-run.
- If you discover a separate bug while working on a PR, jot it in the PR's "Out of scope / deferred" section and surface it back to Freddie. Do not silently expand scope.
Read these once per session before touching the relevant surface:
- HERM compliance (
HERM_COMPLIANCE.md): HERM capability data is free at every tier including anonymous; provenance / attribution travels with the data;tierGatepasses public frameworks through unconditionally. Performance work must not introduce a "fast path" that bypasses these. - SSO architecture (
docs/adr/0001-sso-architecture.md): opaque 404 (not 403) on SSO tier-rejection; IdP owns SSO credential-failure lockout; MFA bypass on SSO is intentional (mfaBypassed: trueaudited). Observability must not leak the rejection reason via metric labels or trace tags. Never label a metric or audit row withinstitutionSlugon the SSO surface. - Production posture (
PRODUCTION_READINESS.md+RUNBOOK.md): proposed SLOs (99.5% availability, p95 < 500ms read / 1500ms AI chat); tracing / alerting destinations are open questions awaiting product-owner input.
These patterns came out of merged PRs (#79–#88). Reuse them rather than reinventing.
Every process-local Map-based cache must have:
- A bounded
MAX_CACHE_SIZE. - Active eviction of expired entries on every miss-set (
pruneExpired(now)), not just the lazy TTL check onget. - Either FIFO eviction (cheap, fine when "one entry per active user") or true LRU touch (delete + reinsert on hit) when the workload has hot keys.
Examples: server/src/api/sso/oidc.ts (LRU, OIDC config),
server/src/api/scim/scim.router.ts (FIFO, negative auth cache).
Redis key construction is not yet centralised in a single
server/src/lib/redis-keys.ts helper in this tree. When changing or adding
Redis namespaces, follow the existing local builder/constant pattern in the
relevant module and avoid introducing new scattered prefix string literals
across modules.
Action namespace is <surface>.<event>.<outcome?>:
auth.login.success/auth.login.failauth.sso.slo_success/auth.sso.slo_failadmin.sso.create/admin.sso.update/admin.sso.deleteadmin.sso.read_self/admin.sso.read_all/admin.sso.read_by_institutionkeys.create/keys.revoke
audit() from server/src/lib/audit.ts is best-effort — it never throws;
losing an audit row is regrettable, losing the action because the audit
write failed would be worse. await it inline; do not fire-and-forget
unless the audit write is genuinely outside the request critical path.
server/src/lib/metrics.ts — every metric uses the herm_ prefix. Labels
are tight, low-cardinality, and never include user / institution / JWT
claim values. Route labels collapse dynamic IDs via Express's matched
route pattern (/api/users/:id). Unmatched paths get the __not_found
sentinel.
When a module holds module-level state (caches, registries, in-memory stores), export a reset hook so tests can isolate. Naming convention:
__resetXForTests()— for module-internal state with double-underscore to clearly mark "not for production code"_resetXForTests()— single-underscore variant (older code; either is fine for new modules)__XSizeForTests()— read-only inspection of internal state
- Server:
server/vitest.config.ts—nodeenvironment,fileParallelism: false. - Client:
client/vitest.config.ts—jsdomenvironment. - Setup files:
server/src/test/setup.ts,client/src/__tests__/setup.ts.
isUserSoftDeleted (in server/src/middleware/auth.ts) skips the DB lookup
in test mode by default to avoid consuming queued prisma.user.findUnique
mocks across the suite. Tests that specifically pin the revocation behaviour
must set process.env.ENABLE_SOFT_DELETE_AUTH_CHECK = 'true' in beforeEach
and clean up in afterEach (see server/src/__tests__/softDeleteRevocation.test.ts).
encryptSecret in server/src/lib/secret-cipher.ts throws when
SSO_SECRET_KEY is unset and a non-empty plaintext arrives. Tests that
exercise SSO write paths with a samlCert or oidcClientSecret must
wrap in a try/finally that sets the env var and resets the cipher key
cache (see server/src/api/admin/admin-sso.test.ts's "secret-only rotation"
test for the canonical pattern).
If a test only needs to verify control flow (not encryption), prefer to omit secret material from the request body — most controllers preserve existing values when fields are absent ("keep-or-replace" semantics).
The prismaMock pattern in server/src/__tests__/scimAuthHardening.test.ts
is the reference. SCIM uses apiKey.findUnique with a nested
institution: { select: { deletedAt: true } } join — mock the whole shape
or auth fails opaquely.
(none currently — Phase 16 Path A debt paydown PRs #152 + #153 cleared the lint + ukamf baselines. If you see a new pattern repeating across PRs, add a section here.)
ssoDiscoveryRateLimit.test.tstypecheck patch — resolved when the cascade of #83–#88 landed. Was a two-line import + path-assert fix applied on every branch off master pre-resolution.- 6
react-hooks/set-state-in-effectlint errors in NotificationBell, UpgradeCard, AuthContext, FrameworkContext, SidebarContext, AdminSso — cleared by PR #152 (Phase 16 Path A). - 8
ukamf-cert-rotation.test.tsfailures from@xmldom/xmldom0.9 removing theerrorHandlerobject constructor option in favour ofonError(level, msg)— cleared by PR #153 (Phase 16 Path A). - 2
env-check.test.tsfailures introduced when PR #146 (16.11) added prod-mode price-ID validation without matching test fixture updates — cleared by PR #153 (beforeEachenv cleanup + regex tighten).
Single command at the repo root:
npm run verify # lint + typecheck + test:ci + buildCoverage of verify:
- ESLint: 0 errors required, 170+ baseline warnings tolerated
(mostly
@typescript-eslint/no-non-null-assertionand@typescript-eslint/consistent-type-imports) - TypeScript: strict, no errors
- Vitest: server ~750+ tests, client ~80 tests; both must be 0-failed
- Build: client (Vite + tsc) + server (tsc)
For Prisma changes also run:
DATABASE_URL='postgresql://x:x@x/x' npx prisma validate
DATABASE_URL='postgresql://x:x@x/x' npx prisma format --checkCI's prisma-validate job additionally applies migrations to a throwaway
Postgres + runs a drift check, so any local-only schema edit will fail CI
without a matching migration.
Phase 18 — production readiness is in flight. Stages 18.1–18.5
BUILT LOCALLY in one commit (8f4413d on
claude/p18-production-readiness) but NOT pushed — the remote
returns 403 at both the git proxy and the GitHub MCP, looks like a
post-Phase-17 repo transfer revoked the write tokens.
Canonical Phase 18 record: progress/phase-18-progress.md.
Handover prompt for the next session:
progress/phase-18-handover-prompt.md. Read both before
picking up.
Phase 17 (rebrand + feature completion) is closed — 11 PRs
merged to master. Canonical record:
progress/phase-17-progress.md. Phase 16 (enterprise delivery) is
also closed — progress/phase-16-summary.md.
git push origin claude/p18-production-readiness returns 403
"Permission denied to RJK134". MCP create_branch returns
"Resource not accessible by integration". The session's
allowed-repo list still points at rjk134/herm-platform, but PR
#180's metadata showed the repo moved to
Future-Horizons-Education/herm-platform — likely a transfer
that revoked the write token under the old org name.
Resolution path (any one works):
- Restore the proxy's push token under
RJK134/herm-platform. - Update both the session's
Allowed repositories:AND the git remote URL tofuture-horizons-education/herm-platform. - Operator pushes the local branch manually.
Local state is intact: commit 8f4413d, 13 files, +900/-130
lines, all tests green, coverage gate passes.
- 18.1 Code-split via
React.lazyfor 32 of 39 client routes. Bundle 478 KB → 149 KB gzipped main (-67%). Ceiling tightened 480 → 180 KB. - 18.2 Slow-query Prisma harness (
lib/slow-query-logger.ts) with redacted-args structured logging. 6 unit tests. - 18.3 Procurement-engine jurisdictions handlers extracted to
procurement-engine/jurisdictions.controller.ts; tenant-checked handlers kept in parent for #180's auth-critical scope. - 18.4 DB-backed integration test for usage-counter reconciliation. Auto-skip locally; runs in CI against the postgres service.
- 18.5 v8 coverage gate, ratchet floors (60/55/60/50 lines/functions/statements/branches) wired into CI.
The three Phase 12 items the operator unblocked decisions for:
- 12.1 OpenTelemetry tracing — destination is Grafana Tempo.
- 12.3 SLO burn-rate alerts — SLOs as proposed in
PRODUCTION_READINESS.md(99.5% availability, p95 < 500ms read, p95 < 1500ms AI chat). - 12.7 AI cost guardrail — £20 per institution per month (Pro and Enterprise default).
Plus 12.4 synthetic uptime probe, GHA cron for reconcile:usage,
full procurement-engine decomposition, designer-final logo SVG.
- Stripe live cutover — follow
docs/STRIPE_GO_LIVE.md(PR #149). Provision live products, set Vercel envs (incl.STRIPE_PRICE_INST_PRO,STRIPE_PRICE_INST_ENT,STRIPE_PRICE_INST_PRO_MONTHLY,STRIPE_PRICE_INST_ENT_MONTHLY), configure live webhook with 7 event types, smoke test with £1 charge + refund. - Pre-deploy env check — PR #146 (16.11) is merged, so the
next prod build throws at
checkEnvironment()if the Stripe price-ID envs are missing. Verify Vercel envs before redeploying.
- Phase 11 (privacy + SSO production hardening) closed. Deferred: domain-normalization case-insensitivity, S3 composable session middleware.
- Phase 14 (procurement workflow) closed.
- Phase 15 (tier rename + quota foundation) closed.
- Phase 16 (enterprise delivery) closed — see
progress/phase-16-summary.md. - Phase 17 (rebrand + feature completion) closed — see
progress/phase-17-progress.md. 11 PRs merged: #168 team page, #169 monthly billing, #170 usage reconciliation script, #171 AdminSso → TanStack, #172 drop JWT shim, #173 AdminBranding → TanStack, #174/api/tiers/limits, #175 progress doc, #176 annual £ savings, #177 tier-mapping invariant test, #179 Future Horizons rebrand + SSRF logo embed.
| File | Use when |
|---|---|
README.md |
Onboarding overview |
AGENTS.md |
Cursor / agent-runner setup commands |
CONTRIBUTING.md |
Contribution workflow basics |
RUNBOOK.md |
Day-two operations, env-var matrix, metrics, health checks |
PRODUCTION_READINESS.md |
Operational-readiness gaps + SLO proposals |
HERM_COMPLIANCE.md |
Licence invariants you must preserve |
AI_GOVERNANCE.md |
AI-call audit + cost posture |
ARCHITECTURE_NOTES.md |
High-level design notes |
PROCUREMENT_WORKFLOW.md |
Domain-model overview for procurement surfaces |
DEMO.md |
Walkthrough script for product demos |
docs/adr/0001-sso-architecture.md |
SSO invariants — required reading before SSO surface work |
docs/STRIPE_GO_LIVE.md |
Stripe live-mode cutover runbook (operator-driven) |
progress/phase-18-progress.md |
Required reading to resume the current Phase 18 build (in-flight, push blocked) |
progress/phase-18-handover-prompt.md |
Paste-target for a fresh Claude Code session to pick up Phase 18 |
progress/phase-17-progress.md |
Phase 17 rebrand + feature-completion record (closed, 11 PRs merged) |
progress/phase-16-summary.md |
Phase 16 enterprise-delivery record (closed) |
progress/phase-*.md |
Per-phase planning docs and retrospectives |