---
name: prd-per-tenant-scheduled-execution
description: >
Phased architecture for per-tenant scheduling identity: Actor struct
for audit attribution, manifest-driven scheduling via tenant_schedule
DB table, and deferred authenticated system identity with JWT.
triggers:
- Working on scheduler, CronScheduler, or ScheduleProvider
- Adding scheduled triggers to manifests
- Implementing system user or service account identity
- Working on audit attribution for background jobs
- Investigating changed_by = "system" in audit trails
- Adding concurrency limits or tenant status checks to schedulers
instructions: |
Key design decisions:
- SystemActorContextKey MUST be separate from UserIDContextKey
(verified auth bypass risk in identity service endpoints)
- Actor struct with Authenticated boolean prevents trust escalation
- changed_by format: system:scheduler:{service} (no tenant ID)
- Database-backed tenant_schedule table, not ManifestScheduleProvider
- Deliverable A (attribution) ships standalone for existing schedulers
- Deliverable C (JWT auth) deferred until cross-service calls needed
---

# PRD-060: Per-Tenant Scheduled Execution Architecture

## Status: Draft

## Problem Statement

Meridian's scheduler infrastructure has three gaps that compound as the platform scales:

1. **No execution identity.** All scheduled work (billing,
forecasting, reconciliation) runs as bare `context.Background()`
with implicit god-mode database access. Every entity mutation
records `changed_by = "system"` - indistinguishable across billing
runs, catch-up replays, background workers, migrations, and any
unauthenticated code path. This fails SOC 2 CC6.1 (logical access
controls) and ISO 27001 A.5.16 (identity management).

2. **Manifest schedule gap.** The manifest proto declares
`scheduled:` triggers (`manifest.proto:289-298`) and validates
them (uniqueness checks, prefix parsing), but has no cron
expression field and no bridge to the `CronScheduler`
infrastructure. Validation passes but no schedule is registered
- the trigger is syntactically accepted but operationally inert.
The MCP server documentation (`reference.go:212`) contradicts
the manifest examples - documenting `scheduled:<cron-expression>`
while manifests use `scheduled:<name>`.

3. **Missing scaling guardrails.** `executeJob()` spawns unbounded
goroutines via `lifecycle.ExecuteGuarded()` with no concurrency
semaphore. Tenant suspension status is never checked before
execution. No minimum cron interval is enforced. At the current
scale (~3 schedules), these are invisible. At N tenants x M
schedule types, they become resource exhaustion and data integrity
risks.

## Background

### Current Architecture

The `shared/platform/scheduler` package provides a `CronScheduler` with:

- `ScheduleProvider` interface returning all schedules across all tenants
- `Schedule` struct carrying `TenantID` for per-tenant schema routing
- Redis-based distributed locking (`shared/platform/redislock`) preventing duplicate execution across replicas
- `ExecutionStore` for audit trail persistence
- Catch-up logic for missed windows on startup

Three services consume it:

| Service | Provider | Schedule Source | Multi-tenant? |
|---------|----------|----------------|---------------|
| payment-order | `BillingScheduleProvider` | Static env var | Single tenant per config |
| forecasting | `ForecastScheduleProvider` | `forecasting_strategy` DB table | Yes - per-tenant, per-strategy |
| reconciliation | `SettlementScheduleProvider` | Reference Data gRPC (stub) | Yes (planned) |

The forecasting service is the **existence proof** - it already does
dynamic per-tenant scheduling from a database table keyed by
`tenant_id`, with per-strategy cron expressions.

### Identity Gap

The scheduler creates `context.Background()` at `cron.go:296` and
injects only tenant context for schema routing. No `UserIDContextKey`
is set. The audit system (`shared/platform/audit`) falls back to
`DefaultAuditUser = "system"` (`audit/context.go:11-13`).

The identity service uses `auth.GetUserIDFromContext(ctx)` as an
authentication gate in 5+ endpoints
(`grpc_identity_endpoints.go:129`, `grpc_role_endpoints.go:147`,
etc.). Any value in `UserIDContextKey` - including an attributed
string - passes these gates. This means **attributed identity MUST
NOT be injected into `UserIDContextKey`**.

### Existing Scaffolding

- A `"service"` RBAC role is defined
(`shared/platform/auth/rbac.go:32`) with account/position/
transaction permissions but is assigned to no identity -
forward-looking scaffolding for system actors.
- OAuth2 client credentials exist for service-to-service token
exchange (`shared/platform/auth/service_auth.go`) but are not used
by the scheduler.

## Solution: Phased Approach

The original question - "per-tenant system user with auth token" - is
the right destination but wrong starting point. The dependency chain
is: attribution before authentication, scheduler hardening before
manifest bridge.

### Design Principle: Attributed Identity vs Authenticated Identity

**Attributed identity** is a structured string in context that appears
in audit trails. It answers "who did this?" without cryptographic
proof. Sufficient for SOC 2 Type I and most Type II audits with
compensating controls.

**Authenticated identity** is a verified principal (JWT) that passes
through the auth interceptor chain. It answers "who did this AND were
they authorized?" Required when scheduled work makes cross-service
authenticated gRPC calls.

Phase A provides attributed identity. Phase C provides authenticated
identity. The `Actor` struct is designed to support both without data
migration.

### Critical Design Decision: Separate Context Keys

Attributed system actor identity MUST use a separate
`SystemActorContextKey`, NOT the existing `UserIDContextKey`. This is
non-negotiable based on verified code evidence:

- `GetUserIDFromContext` is used as an auth gate in 5+ identity service endpoints
- A JWT `sub` claim could theoretically contain `system:scheduler:*` strings, creating namespace collision
- Separate context keys create a clean boundary: JWT path populates
`UserIDContextKey`, scheduler path populates
`SystemActorContextKey`, they can never collide

### The `Actor` Struct

A single typed struct replaces context key proliferation:

```go
type Actor struct {
	ID            string    // "system:scheduler:billing" or user UUID
	Type          ActorType // Human, Scheduler, Worker, Migration
	Authenticated bool      // true only if set by auth interceptor
	Source        string    // "grpc-interceptor", "cron-scheduler", "catch-up"
}
```

- The gRPC interceptor sets `Actor{ID: userID, Type: Human,
Authenticated: true, Source: "grpc-interceptor"}`
- The scheduler sets `Actor{ID: "system:scheduler:billing",
Type: Scheduler, Authenticated: false, Source: "cron-scheduler"}`
- Audit hooks read `actor.ID` for `changed_by` regardless of type
- Auth gates check `actor.Authenticated` - attributed strings never
pass auth checks
- Future actor types (workers, migrations, webhooks) extend via
`ActorType` without new context keys

### `changed_by` Format

`system:scheduler:{service}` - no tenant ID. The tenant is implicit
in the schema-scoped audit trail. Including tenant ID is redundant,
creates privacy leakage in cross-tenant audit views, and renders
poorly in the UI.

## Deliverables

### Deliverable A: Scheduler Hardening + Attribution

**Scope:** Standalone value for the 3 existing schedulers. No manifest changes, no proto changes, no API surface changes.

**Estimated complexity:** 5 story points

#### A.1: `Actor` Struct and Context Key

Create `shared/platform/auth/actor.go`:

- `Actor` struct with `ID`, `Type`, `Authenticated`, `Source` fields
- `ActorType` enum: `Human`, `Scheduler`, `Worker`, `Migration`
- `ActorContextKey` context key (the key kept separate from `UserIDContextKey`, per the design decision above)
- `WithActor(ctx, actor)` and `ActorFromContext(ctx)` helpers
- Update `audit.GetUserFromContext()` to check `ActorContextKey` first, then `UserIDContextKey`, then fall back to `DefaultAuditUser`

#### A.2: Scheduler Attribution

In `shared/platform/scheduler/cron.go`, `executeJob()`:

- Inject `Actor{ID: "system:scheduler:{schedulerName}", Type: Scheduler,
Authenticated: false, Source: "cron-scheduler"}` into context
- Inject `audit.WithCorrelationID(ctx, execID.String())` to link all audit records from one execution
- For catch-up executions, use `Source: "catch-up"`

#### A.3: Tenant Status Check

In `executeJob()`, before calling the executor:

- Query tenant status (active/suspended/deprovisioned)
- Skip execution for non-active tenants, record as `SKIPPED` with reason
- Known limitation: tenant can be suspended mid-execution. Saga-level handling is a future concern.
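A sketch of the status gate, assuming a three-state tenant lifecycle; the status values and the lookup mechanism (control-plane query vs. cached registry) are assumptions:

```go
package main

import "fmt"

// TenantStatus values as assumed in this sketch.
type TenantStatus string

const (
	StatusActive        TenantStatus = "active"
	StatusSuspended     TenantStatus = "suspended"
	StatusDeprovisioned TenantStatus = "deprovisioned"
)

// shouldExecute reports whether a schedule may run for a tenant,
// returning a skip reason recorded on the SKIPPED execution row.
func shouldExecute(status TenantStatus) (bool, string) {
	if status == StatusActive {
		return true, ""
	}
	return false, fmt.Sprintf("tenant status %q is not active", status)
}

func main() {
	ok, reason := shouldExecute(StatusSuspended)
	fmt.Println(ok, reason)
}
```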

#### A.4: Concurrency Limiter

In `executeJob()` or `CronScheduler`:

- Add a configurable semaphore (default max 20 concurrent executions)
- Excess executions are `SKIPPED` with reason "concurrency limit reached"
- Prevents DB connection pool exhaustion when schedules align (e.g., `0 0 1 * *` for all tenants)
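The semaphore can be a buffered channel with a non-blocking acquire, so excess executions fail fast into the SKIPPED path instead of queueing; a minimal sketch:

```go
package main

import "fmt"

// execSlots is a counting semaphore bounding concurrent executions.
// Capacity matches the proposed default of 20; configurable in practice.
type execSlots chan struct{}

func newExecSlots(max int) execSlots { return make(execSlots, max) }

// tryAcquire reports whether a slot was obtained without blocking.
// Callers that fail to acquire record the execution as SKIPPED.
func (s execSlots) tryAcquire() bool {
	select {
	case s <- struct{}{}:
		return true
	default:
		return false
	}
}

func (s execSlots) release() { <-s }

func main() {
	slots := newExecSlots(2)
	fmt.Println(slots.tryAcquire()) // true
	fmt.Println(slots.tryAcquire()) // true
	fmt.Println(slots.tryAcquire()) // false -> SKIPPED: "concurrency limit reached"
	slots.release()
	fmt.Println(slots.tryAcquire()) // true
}
```

Non-blocking acquire matters here: a blocking semaphore would turn an aligned `0 0 1 * *` burst into a long queue of stale executions rather than a clean set of skips.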

#### A.5: Refresh Jitter

In `refreshSchedules()`:

- Add random jitter (0-10s) to the refresh ticker interval
- Prevents synchronized `ListSchedules()` bursts when multiple replicas start simultaneously

#### A.6: ADR

Document:

- Attribution vs authentication decision and rationale
- `SystemActorContextKey` separate from `UserIDContextKey` rationale (with code evidence)
- `Actor` struct design and forward-compatibility with Phase C
- Known limitations (attributed strings are not cryptographically verified)

### Deliverable B: Manifest-Driven Scheduling

**Scope:** Bridges manifest `scheduled:` trigger declarations to the
`CronScheduler`. Uses database-backed schedule storage (proven by
forecasting pattern).

**Estimated complexity:** 8 story points

**Prerequisite:** Deliverable A (attribution must be in place before scaling schedule count)

#### B.1: `tenant_schedule` Database Table

Per-tenant-schema table written by manifest application, read by `ScheduleProvider`:

```sql
CREATE TABLE tenant_schedule (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
schedule_name VARCHAR(128) NOT NULL,
saga_name VARCHAR(128) NOT NULL,
cron_expr VARCHAR(64) NOT NULL,
enabled BOOLEAN NOT NULL DEFAULT true,
manifest_version_id UUID,
metadata JSONB,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE(schedule_name)
);
```

`manifest_version_id` provides traceability: which manifest version
created this schedule. This references control-plane manifest versions
(cross-schema, no FK constraint) - a soft reference for audit/debugging
purposes, not a hard database relationship.

#### B.2: Manifest Application Pipeline

During `ApplyManifest`, for `scheduled:` triggers:

- Parse schedule configuration from manifest
- Translate friendly abstractions to cron expressions (if applicable)
- Diff declared schedules against existing `tenant_schedule` rows
- Insert/update/delete schedule rows
- Return registered schedules with `next_execution` time in the apply response
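The diff step is plain set reconciliation between declared schedules and existing rows; a sketch, with the row shape and field names assumed:

```go
package main

import "fmt"

// declared is a schedule parsed from a manifest; it also mirrors a
// tenant_schedule row. Field names here are assumptions.
type declared struct {
	Name, Saga, CronExpr string
}

// diffSchedules computes which rows to insert, update, or delete so
// tenant_schedule converges on the manifest's declared schedules.
func diffSchedules(want []declared, have map[string]declared) (insert, update []declared, del []string) {
	seen := make(map[string]bool, len(want))
	for _, w := range want {
		seen[w.Name] = true
		h, ok := have[w.Name]
		switch {
		case !ok:
			insert = append(insert, w)
		case h != w:
			update = append(update, w)
		}
	}
	for name := range have {
		if !seen[name] {
			del = append(del, name)
		}
	}
	return insert, update, del
}

func main() {
	have := map[string]declared{
		"monthly-billing": {"monthly-billing", "billing_run", "0 2 1 * *"},
		"old-report":      {"old-report", "report_run", "0 3 * * *"},
	}
	want := []declared{
		{"monthly-billing", "billing_run", "0 4 1 * *"}, // cron changed -> update
		{"daily-forecast", "forecast_run", "0 1 * * *"}, // new -> insert
	}
	ins, upd, del := diffSchedules(want, have)
	fmt.Println(len(ins), len(upd), del) // 1 1 [old-report]
}
```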

#### B.3: Unified `ScheduleProvider`

A `TenantScheduleProvider` that queries `tenant_schedule` across all
tenant schemas. Replaces or supplements per-service providers:

- Billing: migrate from env var to `tenant_schedule`
- Reconciliation: implement against `tenant_schedule` instead of Reference Data stub
- Forecasting: can adopt `tenant_schedule` or keep `forecasting_strategy` (per-service decision)

#### B.4: Manifest Schedule Validation

Enforced at manifest validation time:

- Minimum cron interval: 15 minutes
- Maximum schedules per tenant manifest: 10-20 (configurable)
- Reject syntactically valid but semantically nonsensical expressions (e.g., `0 0 31 2 *`)
- Warn on very infrequent schedules (annually) as likely bugs

#### B.5: Per-Tenant Execution Limits

Runtime guardrails in `executeJob()`:

- Per-tenant concurrent execution limit (3-5, configurable)
- Excess executions recorded as `SKIPPED` with tenant-specific reason
- Distinct from the global semaphore in A.4

#### B.6: Schedule Health Monitoring

Observability (can ship in parallel):

- Execution latency histogram per scheduler/tenant
- Lock contention metrics
- Expected-vs-actual execution frequency check (alert when a schedule hasn't fired in 2x its expected interval)
- Redis health metric

#### B.7: Manifest DX Design Spike

Design decision (before B.2 implementation):

- Raw cron expressions only? (`schedule: "0 2 1 * *"`)
- Friendly abstractions? (`schedule: { every: "1h" }`,
`schedule: { monthly: { day: 1, hour: 2 } }`)
- Named presets? (`schedule: "monthly_billing"` mapping to a reference
data entry)
- Resolve MCP docs contradiction (`scheduled:<cron>` vs `scheduled:<name>`)

The `tenant_schedule` table decouples this decision from the scheduler -
manifest DX can evolve independently because the translation to cron
expressions happens at the application layer.
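As an illustration of that decoupling, a friendly `monthly` abstraction could translate to cron at apply time; the manifest field names here are assumptions pending the B.7 decision:

```go
package main

import "fmt"

// monthly mirrors a hypothetical manifest form
// `schedule: { monthly: { day: 1, hour: 2 } }`.
type monthly struct {
	Day, Hour int
}

// toCron translates the friendly form to a standard 5-field cron
// expression at the application layer, keeping the scheduler and
// tenant_schedule table cron-only.
func toCron(m monthly) string {
	return fmt.Sprintf("0 %d %d * *", m.Hour, m.Day)
}

func main() {
	fmt.Println(toCron(monthly{Day: 1, Hour: 2})) // 0 2 1 * *
}
```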

### Deliverable C: Authenticated System Identity

**Scope:** Per-tenant system user with JWT-scoped execution. Full auth
chain for scheduled work.

**Estimated complexity:** 13 story points

**Trigger:** Required when scheduled sagas need to make cross-service
authenticated gRPC calls, OR when a customer requires SOC 2 Type II
with cryptographically verifiable chain of custody.

#### C.1: Per-Tenant System User

- Created during tenant provisioning as a post-provisioning hook
- Assigned the existing `"service"` RBAC role
- Per-service role scoping if needed (billing doesn't need the same permissions as forecasting)
- Lifecycle: created on provision, suspended on tenant suspend, deprovisioned on tenant deprovision

#### C.2: Per-Execution Token Minting

- Scheduler mints short-lived JWTs per execution (not long-lived cached credentials)
- Token lifetime = `ExecutionTimeout` + buffer (e.g., 15 minutes)
- Minting is in-process (no external call that can fail)
- Token carries tenant ID, service role, and execution correlation ID

#### C.3: Token-Scoped Saga Execution

- Executor injects JWT into context before saga execution
- Saga steps that make gRPC calls carry the token through interceptors
- Token-expiry-mid-saga handling: design upfront (fail + compensate, or refresh mid-saga)

#### C.4: Credential Lifecycle

- No long-lived credentials to rotate (per-execution minting)
- System user suspension on tenant deactivation
- Monitoring: alert on system user token mint failures

## Out of Scope

- Removing `public.platform_saga_definition` table (control-plane uses it for `apply_manifest`)
- Changing the forecasting service's `forecasting_strategy` table (can adopt `tenant_schedule` or keep its own)
- Tenant-to-tenant data sharing / mesh scheduling (future architecture)
- Schedule-triggered notifications to tenants (future DX feature)

## Verification

### Deliverable A

1. Existing scheduler tests pass with `Actor` context injection
2. `changed_by` fields show `system:scheduler:{service}` instead of `"system"`
3. Audit records carry `correlation_id` linking to `scheduler_execution.id`
4. Suspended tenant's schedules are skipped with audit trail
5. Concurrent execution capped at semaphore limit

### Deliverable B

1. Manifest with `scheduled:` trigger creates `tenant_schedule` row
2. Schedule appears in `CronScheduler` within 60s refresh interval
3. Apply response includes registered schedules with next execution time
4. Cron expressions below 15-min floor are rejected at validation
5. Per-tenant schedule count exceeding cap is rejected

### Deliverable C

1. Scheduled saga steps can make authenticated gRPC calls to other services
2. Token carries correct tenant ID and service role
3. Token expiry is handled gracefully (saga fails cleanly, not silently)

## References

- Six Thinking Hats analysis: 5-person panel (security, distributed systems, SRE, product, compliance)
- Key code paths: `shared/platform/scheduler/cron.go`, `shared/platform/auth/`, `shared/platform/audit/`
- Existing patterns: forecasting `StrategyRepository.ListAllActive()` (DB-backed schedule provider)
- Unused scaffolding: `"service"` RBAC role (`shared/platform/auth/rbac.go:32`)