docs: PRD-060 per-tenant scheduled execution architecture #2142
---
name: prd-per-tenant-scheduled-execution
description: >
  Phased architecture for per-tenant scheduling identity: Actor struct
  for audit attribution, manifest-driven scheduling via tenant_schedule
  DB table, and deferred authenticated system identity with JWT.
triggers:
  - Working on scheduler, CronScheduler, or ScheduleProvider
  - Adding scheduled triggers to manifests
  - Implementing system user or service account identity
  - Working on audit attribution for background jobs
  - Investigating changed_by = "system" in audit trails
  - Adding concurrency limits or tenant status checks to schedulers
instructions: |
  Key design decisions:
  - SystemActorContextKey MUST be separate from UserIDContextKey
    (verified auth bypass risk in identity service endpoints)
  - Actor struct with Authenticated boolean prevents trust escalation
  - changed_by format: system:scheduler:{service} (no tenant ID)
  - Database-backed tenant_schedule table, not ManifestScheduleProvider
  - Deliverable A (attribution) ships standalone for existing schedulers
  - Deliverable C (JWT auth) deferred until cross-service calls needed
---

# PRD-060: Per-Tenant Scheduled Execution Architecture

## Status: Draft

## Problem Statement

Meridian's scheduler infrastructure has three gaps that compound as the platform scales:

1. **No execution identity.** All scheduled work (billing, forecasting, reconciliation) runs as bare `context.Background()` with implicit god-mode database access. Every entity mutation records `changed_by = "system"`, indistinguishable across billing runs, catch-up replays, background workers, migrations, and any unauthenticated code path. This fails SOC 2 CC6.1 (logical access controls) and ISO 27001 A.5.16 (identity management).

2. **Manifest schedule gap.** The manifest proto declares `scheduled:` triggers (`manifest.proto:289-298`) and validates them (uniqueness checks, prefix parsing), but has no cron expression field and no bridge to the `CronScheduler` infrastructure. Validation passes but no schedule is registered: the trigger is syntactically accepted but operationally inert. The MCP server documentation (`reference.go:212`) contradicts the manifest examples, documenting `scheduled:<cron-expression>` while manifests use `scheduled:<name>`.

3. **Missing scaling guardrails.** `executeJob()` spawns unbounded goroutines via `lifecycle.ExecuteGuarded()` with no concurrency semaphore. Tenant suspension status is never checked before execution. No minimum cron interval is enforced. At the current scale (~3 schedules), these are invisible. At N tenants × M schedule types, they become resource exhaustion and data integrity risks.

## Background

### Current Architecture

The `shared/platform/scheduler` package provides a `CronScheduler` with:

- `ScheduleProvider` interface returning all schedules across all tenants
- `Schedule` struct carrying `TenantID` for per-tenant schema routing
- Redis-based distributed locking (`shared/platform/redislock`) preventing duplicate execution across replicas
- `ExecutionStore` for audit trail persistence
- Catch-up logic for missed windows on startup

Three services consume it:

| Service | Provider | Schedule Source | Multi-tenant? |
|---------|----------|-----------------|---------------|
| payment-order | `BillingScheduleProvider` | Static env var | Single tenant per config |
| forecasting | `ForecastScheduleProvider` | `forecasting_strategy` DB table | Yes: per-tenant, per-strategy |
| reconciliation | `SettlementScheduleProvider` | Reference Data gRPC (stub) | Yes (planned) |

The forecasting service is the **existence proof**: it already does dynamic per-tenant scheduling from a database table with a per-tenant `tenant_id` and per-strategy cron expressions.

### Identity Gap

The scheduler creates `context.Background()` at `cron.go:296` and injects only tenant context for schema routing. No `UserIDContextKey` is set. The audit system (`shared/platform/audit`) falls back to `DefaultAuditUser = "system"` (`audit/context.go:11-13`).

The identity service uses `auth.GetUserIDFromContext(ctx)` as an authentication gate in 5+ endpoints (`grpc_identity_endpoints.go:129`, `grpc_role_endpoints.go:147`, etc.). Any value in `UserIDContextKey`, including an attributed string, passes these gates. This means **attributed identity MUST NOT be injected into `UserIDContextKey`**.

### Existing Scaffolding

- A `"service"` RBAC role is defined (`shared/platform/auth/rbac.go:32`) with account/position/transaction permissions but is assigned to no identity; it is forward-looking scaffolding for system actors.
- OAuth2 client credentials exist for service-to-service token exchange (`shared/platform/auth/service_auth.go`) but are not used by the scheduler.

## Solution: Phased Approach

The original question, "per-tenant system user with auth token", is the right destination but the wrong starting point. The dependency chain is: attribution before authentication, scheduler hardening before manifest bridge.

### Design Principle: Attributed Identity vs Authenticated Identity

**Attributed identity** is a structured string in context that appears in audit trails. It answers "who did this?" without cryptographic proof. It is sufficient for SOC 2 Type I and most Type II audits with compensating controls.

**Authenticated identity** is a verified principal (JWT) that passes through the auth interceptor chain. It answers "who did this AND were they authorized?" It is required when scheduled work makes cross-service authenticated gRPC calls.

Phase A provides attributed identity. Phase C provides authenticated identity. The `Actor` struct is designed to support both without data migration.

### Critical Design Decision: Separate Context Keys

Attributed system actor identity MUST use a separate context key (`SystemActorContextKey`, realized as `ActorContextKey` in A.1), NOT the existing `UserIDContextKey`. This is non-negotiable based on verified code evidence:

- `GetUserIDFromContext` is used as an auth gate in 5+ identity service endpoints
- A JWT `sub` claim could theoretically contain `system:scheduler:*` strings, creating namespace collision
- Separate context keys create a clean boundary: the JWT path populates `UserIDContextKey`, the scheduler path populates the actor key, and the two can never collide

### The `Actor` Struct

A single typed struct replaces context key proliferation:

```go
type Actor struct {
    ID            string    // "system:scheduler:billing" or user UUID
    Type          ActorType // Human, Scheduler, Worker, Migration
    Authenticated bool      // true only if set by auth interceptor
    Source        string    // "grpc-interceptor", "cron-scheduler", "catch-up"
}
```

- The gRPC interceptor sets `Actor{ID: userID, Type: Human, Authenticated: true, Source: "grpc-interceptor"}`
- The scheduler sets `Actor{ID: "system:scheduler:billing", Type: Scheduler, Authenticated: false, Source: "cron-scheduler"}`
- Audit hooks read `actor.ID` for `changed_by` regardless of type
- Auth gates check `actor.Authenticated`; attributed strings never pass auth checks
- Future actor types (workers, migrations, webhooks) extend via `ActorType` without new context keys

### `changed_by` Format

`system:scheduler:{service}`, with no tenant ID. The tenant is implicit in the schema-scoped audit trail. Including the tenant ID would be redundant, would create privacy leakage in cross-tenant audit views, and would render poorly in the UI.

## Deliverables

### Deliverable A: Scheduler Hardening + Attribution

**Scope:** Standalone value for the 3 existing schedulers. No manifest changes, no proto changes, no API surface changes.

**Estimated complexity:** 5 story points

#### A.1: `Actor` Struct and Context Key

Create `shared/platform/auth/actor.go`:

- `Actor` struct with `ID`, `Type`, `Authenticated`, `Source` fields
- `ActorType` enum: `Human`, `Scheduler`, `Worker`, `Migration`
- `ActorContextKey` context key
- `WithActor(ctx, actor)` and `ActorFromContext(ctx)` helpers
- Update `audit.GetUserFromContext()` to check `ActorContextKey` first, then `UserIDContextKey`, then fall back to `DefaultAuditUser`

#### A.2: Scheduler Attribution

In `shared/platform/scheduler/cron.go`, `executeJob()`:

- Inject `Actor{ID: "system:scheduler:{schedulerName}", Type: Scheduler, Authenticated: false, Source: "cron-scheduler"}` into context
- Inject `audit.WithCorrelationID(ctx, execID.String())` to link all audit records from one execution
- For catch-up executions, use `Source: "catch-up"`

#### A.3: Tenant Status Check

In `executeJob()`, before calling the executor:

- Query tenant status (active/suspended/deprovisioned)
- Skip execution for non-active tenants, record as `SKIPPED` with reason
- Known limitation: a tenant can be suspended mid-execution. Saga-level handling is a future concern.
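
A minimal sketch of the pre-execution gate, assuming the status values named above; the skip reason would feed the `SKIPPED` execution record:

```go
// Sketch of the tenant status check before executor dispatch.
package main

import "fmt"

type TenantStatus string

const (
	TenantActive        TenantStatus = "active"
	TenantSuspended     TenantStatus = "suspended"
	TenantDeprovisioned TenantStatus = "deprovisioned"
)

// shouldExecute reports whether the job may run, and a reason for the
// SKIPPED audit record when it may not.
func shouldExecute(status TenantStatus) (bool, string) {
	if status == TenantActive {
		return true, ""
	}
	return false, fmt.Sprintf("tenant status is %q, not active", status)
}

func main() {
	ok, reason := shouldExecute(TenantSuspended)
	fmt.Println(ok, reason)
}
```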

#### A.4: Concurrency Limiter

In `executeJob()` or `CronScheduler`:

- Add a configurable semaphore (default max 20 concurrent executions)
- Excess executions are `SKIPPED` with reason "concurrency limit reached"
- Prevents DB connection pool exhaustion when schedules align (e.g., `0 0 1 * *` for all tenants)
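
One idiomatic shape for this is a buffered channel used as a non-blocking semaphore; on a full semaphore the execution is skipped rather than queued, matching the `SKIPPED`-with-reason design:

```go
// Sketch of the global concurrency limiter from A.4.
package main

import "fmt"

type limiter chan struct{}

func newLimiter(max int) limiter { return make(limiter, max) }

// tryAcquire returns false without blocking when the limit is reached,
// so the caller records SKIPPED instead of queuing the execution.
func (l limiter) tryAcquire() bool {
	select {
	case l <- struct{}{}:
		return true
	default:
		return false
	}
}

func (l limiter) release() { <-l }

func main() {
	sem := newLimiter(2) // the real default would be 20, per A.4
	fmt.Println(sem.tryAcquire()) // true
	fmt.Println(sem.tryAcquire()) // true
	fmt.Println(sem.tryAcquire()) // false: "concurrency limit reached"
	sem.release()
	fmt.Println(sem.tryAcquire()) // true again after a slot frees
}
```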

#### A.5: Refresh Jitter

In `refreshSchedules()`:

- Add random jitter (0-10s) to the refresh ticker interval
- Prevents synchronized `ListSchedules()` bursts when multiple replicas start simultaneously

#### A.6: ADR

Document:

- Attribution vs authentication decision and rationale
- `SystemActorContextKey` separate from `UserIDContextKey`: rationale (with code evidence)
- `Actor` struct design and forward-compatibility with Phase C
- Known limitations (attributed strings are not cryptographically verified)

### Deliverable B: Manifest-Driven Scheduling

**Scope:** Bridges manifest `scheduled:` trigger declarations to the `CronScheduler`. Uses database-backed schedule storage (proven by the forecasting pattern).

**Estimated complexity:** 8 story points

**Prerequisite:** Deliverable A (attribution must be in place before scaling schedule count)

#### B.1: `tenant_schedule` Database Table

Per-tenant-schema table written by manifest application, read by the `ScheduleProvider`:

```sql
CREATE TABLE tenant_schedule (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    schedule_name VARCHAR(128) NOT NULL,
    saga_name VARCHAR(128) NOT NULL,
    cron_expr VARCHAR(64) NOT NULL,
    enabled BOOLEAN NOT NULL DEFAULT true,
    manifest_version_id UUID,
    metadata JSONB,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE(schedule_name)
);
```

`manifest_version_id` provides traceability: it records which manifest version created the schedule. It references control-plane manifest versions (cross-schema, no FK constraint) as a soft reference for audit/debugging purposes, not a hard database relationship.

#### B.2: Manifest Application Pipeline

During `ApplyManifest`, for `scheduled:` triggers:

- Parse schedule configuration from the manifest
- Translate friendly abstractions to cron expressions (if applicable)
- Diff declared schedules against existing `tenant_schedule` rows
- Insert/update/delete schedule rows
- Return registered schedules with `next_execution` time in the apply response

#### B.3: Unified `ScheduleProvider`

A `TenantScheduleProvider` that queries `tenant_schedule` across all tenant schemas. It replaces or supplements the per-service providers:

- Billing: migrate from the env var to `tenant_schedule`
- Reconciliation: implement against `tenant_schedule` instead of the Reference Data stub
- Forecasting: can adopt `tenant_schedule` or keep `forecasting_strategy` (per-service decision)
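
How the provider assembles its per-schema query might look like the sketch below. The `tenant_<id>` schema naming is an assumption for illustration; real schema routing lives in the platform's schema layer, and schema names would come from the trusted tenant registry, never from user input:

```go
// Sketch of TenantScheduleProvider's per-schema query assembly.
package main

import "fmt"

// Schedule mirrors the scheduler's Schedule struct: TenantID drives
// per-tenant schema routing.
type Schedule struct {
	TenantID, Name, SagaName, CronExpr string
}

// scheduleQuery builds the per-tenant-schema SELECT against the
// tenant_schedule table defined in B.1.
func scheduleQuery(schema string) string {
	return fmt.Sprintf(
		`SELECT schedule_name, saga_name, cron_expr FROM %s.tenant_schedule WHERE enabled`,
		schema)
}

func main() {
	// In ListSchedules(), the provider would run this once per tenant
	// schema and merge the rows into []Schedule with TenantID set.
	fmt.Println(scheduleQuery("tenant_acme"))
}
```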

#### B.4: Manifest Schedule Validation

Enforced at manifest validation time:

- Minimum cron interval: 15 minutes
- Maximum schedules per tenant manifest: 10-20 (configurable)
- Reject syntactically valid but semantically nonsensical expressions (e.g., `0 0 31 2 *`)
- Warn on very infrequent schedules (e.g., annual) as likely bugs
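
The semantic check for impossible day-of-month/month pairs such as `0 0 31 2 *` (Feb 31) reduces to a table lookup once the cron fields are parsed; full cron-grammar parsing is assumed to happen before this check:

```go
// Sketch of the impossible-date rejection from B.4.
package main

import "fmt"

// Maximum day each month can ever have; February allows 29 because
// leap years make Feb 29 a valid (if infrequent) firing day.
var maxDay = [13]int{0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}

// impossibleDate reports whether a fixed day-of-month can never occur
// in a fixed month, i.e. the schedule would never fire.
func impossibleDate(dom, month int) bool {
	if month < 1 || month > 12 || dom < 1 || dom > 31 {
		return true // out-of-range fields are also rejected by syntax checks
	}
	return dom > maxDay[month]
}

func main() {
	fmt.Println(impossibleDate(31, 2)) // true: `0 0 31 2 *` never fires
	fmt.Println(impossibleDate(29, 2)) // false: fires in leap years
}
```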

#### B.5: Per-Tenant Execution Limits

Runtime guardrails in `executeJob()`:

- Per-tenant concurrent execution limit (3-5, configurable)
- Excess executions recorded as `SKIPPED` with a tenant-specific reason
- Distinct from the global semaphore in A.4

#### B.6: Schedule Health Monitoring

Observability (can ship in parallel):

- Execution latency histogram per scheduler/tenant
- Lock contention metrics
- Expected-vs-actual execution frequency check (alert when a schedule hasn't fired in 2x its expected interval)
- Redis health metric

#### B.7: Manifest DX Design Spike

Design decision (before B.2 implementation):

- Raw cron expressions only? (`schedule: "0 2 1 * *"`)
- Friendly abstractions? (`schedule: { every: "1h" }`, `schedule: { monthly: { day: 1, hour: 2 } }`)
- Named presets? (`schedule: "monthly_billing"` mapping to a reference data entry)
- Resolve the MCP docs contradiction (`scheduled:<cron>` vs `scheduled:<name>`)

The `tenant_schedule` table decouples this decision from the scheduler: manifest DX can evolve independently because the translation to cron expressions happens at the application layer.

### Deliverable C: Authenticated System Identity

**Scope:** Per-tenant system user with JWT-scoped execution. Full auth chain for scheduled work.

**Estimated complexity:** 13 story points

**Trigger:** Required when scheduled sagas need to make cross-service authenticated gRPC calls, OR when a customer requires SOC 2 Type II with a cryptographically verifiable chain of custody.

#### C.1: Per-Tenant System User

- Created during tenant provisioning as a post-provisioning hook
- Assigned the existing `"service"` RBAC role
- Per-service role scoping if needed (billing doesn't need the same permissions as forecasting)
- Lifecycle: created on provision, suspended on tenant suspend, deprovisioned on tenant deprovision

#### C.2: Per-Execution Token Minting

- The scheduler mints short-lived JWTs per execution (not long-lived cached credentials)
- Token lifetime = `ExecutionTimeout` + buffer (e.g., 15 minutes)
- Minting is in-process (no external call that can fail)
- The token carries the tenant ID, service role, and execution correlation ID

#### C.3: Token-Scoped Saga Execution

- The executor injects the JWT into context before saga execution
- Saga steps that make gRPC calls carry the token through interceptors
- Token-expiry-mid-saga handling: design upfront (fail + compensate, or refresh mid-saga)

#### C.4: Credential Lifecycle

- No long-lived credentials to rotate (per-execution minting)
- System user suspension on tenant deactivation
- Monitoring: alert on system user token mint failures

## Out of Scope

- Removing the `public.platform_saga_definition` table (the control plane uses it for `apply_manifest`)
- Changing the forecasting service's `forecasting_strategy` table (it can adopt `tenant_schedule` or keep its own)
- Tenant-to-tenant data sharing / mesh scheduling (future architecture)
- Schedule-triggered notifications to tenants (future DX feature)

## Verification

### Deliverable A

1. Existing scheduler tests pass with `Actor` context injection
2. `changed_by` fields show `system:scheduler:{service}` instead of `"system"`
3. Audit records carry a `correlation_id` linking to `scheduler_execution.id`
4. A suspended tenant's schedules are skipped with an audit trail
5. Concurrent execution is capped at the semaphore limit

### Deliverable B

1. A manifest with a `scheduled:` trigger creates a `tenant_schedule` row
2. The schedule appears in the `CronScheduler` within the 60s refresh interval
3. The apply response includes registered schedules with next execution time
4. Cron expressions below the 15-minute floor are rejected at validation
5. A per-tenant schedule count exceeding the cap is rejected

### Deliverable C

1. Scheduled saga steps can make authenticated gRPC calls to other services
2. The token carries the correct tenant ID and service role
3. Token expiry is handled gracefully (the saga fails cleanly, not silently)

## References

- Six Thinking Hats analysis: 5-person panel (security, distributed systems, SRE, product, compliance)
- Key code paths: `shared/platform/scheduler/cron.go`, `shared/platform/auth/`, `shared/platform/audit/`
- Existing patterns: forecasting `StrategyRepository.ListAllActive()` (DB-backed schedule provider)
- Unused scaffolding: `"service"` RBAC role (`shared/platform/auth/rbac.go:32`)