Commit 9690de7

docs: tighten router phase governance and contracts (#14)
* fix: create dist dir before signed release packaging
* docs: tighten router phase governance and contracts
* docs: record milestone 1 closeout
* docs: address PR review thread feedback

---------

Co-authored-by: Hanna Rosengren <4538260+hannasoderstromdev@users.noreply.github.com>
1 parent 6fcf4a9 commit 9690de7

3 files changed

Lines changed: 819 additions & 0 deletions

docs/ROUTER-PHASE-PLAN.md

Lines changed: 352 additions & 0 deletions
@@ -0,0 +1,352 @@
1+
# Router Phase Plan
2+
3+
## Purpose
4+
5+
The Claude Code MVP has now validated the product wedge that matters most:
6+
7+
* pre-execution routing authority exists at the launch or resume boundary
8+
* routed turns can preserve session continuity
9+
* route decisions can be explained and audited
10+
* wrapper and hook evidence can be correlated
11+
12+
The next phase should turn that validated workflow into a real router control plane.
13+
14+
The core objective is not to add more vendor integrations immediately. The core objective is to make the router boundary explicit, durable, and reusable so later integrations do not duplicate Claude-specific assumptions.
15+
16+
## Why This Phase Exists
17+
18+
The broader product definition in [PRD.md](PRD.md) describes a vendor-neutral, session-aware routing layer that owns session state, task classification, routing policy, execution-target eligibility, handoff, and routing logs.
19+
20+
The current MVP in [MVP-PRD.md](MVP-PRD.md) proves a narrower but valuable product slice: a Claude-scoped workflow integration with deterministic routing, continuity, explanation, and conservative governance.
21+
22+
That is strong progress, but the current implementation is still closer to a validated workflow product than a fully productized control plane.
23+
24+
The largest remaining gap is structural:
25+
26+
* the router contracts called for in the PRD do not yet exist as a canonical document
27+
* session state is still relatively thin compared to the intended model
28+
* routing is still mostly mode- and prompt-driven rather than session-controller-driven
29+
* the Claude workflow remains the dominant product surface rather than one consumer of a stable router boundary
30+
31+
This phase closes that gap.
32+
33+
## Phase Goal
34+
35+
Define and implement router v1 as a first-class control-plane boundary while keeping the working Claude Code workflow intact.
36+
37+
At the end of this phase, the repository should have:
38+
39+
1. A canonical router contract set.
40+
2. A richer session-aware routing policy.
41+
3. A cleaner separation between router logic and Claude workflow mechanics.
42+
4. Evidence and explainability shapes that support policy tuning.
43+
5. One additional integration proof that exercises the router boundary without forcing a full second product launch.
44+
45+
## Non-Goals
46+
47+
This phase should not attempt to:
48+
49+
* ship a rich UI
50+
* introduce learned routing
51+
* build automatic in-session model switching
52+
* broaden tool permission automation beyond current conservative policy
53+
* add many new vendors before the router boundary is stable
54+
* extract and publish `@model-switchboard/router` prematurely
55+
* redesign the Claude workflow UX unless required by the router boundary
56+
57+
## Product Thesis For This Phase
58+
59+
The MVP has already shown that the user value is not "model switching everywhere."
60+
61+
The durable product value is:
62+
63+
```text
A session-aware control plane that chooses the right execution target before a turn,
preserves continuity when switching is not worth it,
and makes that decision inspectable.
```
68+
69+
That means the next step is to strengthen session-aware decision quality and abstraction boundaries before optimizing for surface-area expansion.
70+
71+
## Milestones
72+
73+
### Milestone 1: Router Contracts
74+
75+
Status: complete (2026-05-10)
76+
77+
Decision record: see `DEC-2026-05-10-milestone-1-router-contracts-closeout` in `docs/decision-log.md`.
78+
79+
Create the missing canonical contract document at:
80+
81+
```text
docs/contracts/router-contracts.md
```
84+
85+
The first contract release should be explicitly provisional, versioned as experimental, and intentionally minimal so it stays revision-friendly as non-Claude evidence grows.
86+
87+
For alignment details between this phase plan and the contract document, see the `Phase Alignment Notes` section in `docs/contracts/router-contracts.md`.
88+
89+
Use a two-part structure:
90+
91+
1. normative minimal core (versioned experimental)
92+
2. vendor-specific mapping appendix (starting with Claude)
93+
94+
The minimal core should define the normative shapes for:
95+
96+
* `SessionState`
97+
* `TaskClassification`
98+
* `ExecutionTargetMetadata`
99+
* `RoutingDecision`
100+
* `ContextPackage`
101+
* `RoutingLogEvent`
102+
* `RouterConfig`
103+
104+
The vendor appendix should document how current Claude workflow evidence maps into the core contracts without implying universal behavior across all clients.
105+
106+
The minimal core must also define a handoff/context-transfer contract (initially lightweight) that future surfaces can reuse without adopting Claude-specific semantics.
107+
108+
Minimum requirements:
109+
110+
* distinguish durable session mode from latest-turn task type
111+
* represent hard constraints separately from soft preferences
112+
* represent continuity cost explicitly
113+
* represent manual overrides without allowing them to bypass hard constraints
114+
* define persisted fields required for explanation and replay
115+
* define schema versioning expectations for stored artifacts
116+
* define minimal handoff/context-transfer fields that are stable enough for explain and replay
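
The minimum requirements above can be illustrated with a small TypeScript sketch. Only the type names `SessionState`, `TaskClassification`, and `RoutingDecision` come from this plan. Every field name, the mode and task-type values, and the `experimental-0` version string are assumptions made for illustration, not the contents of `docs/contracts/router-contracts.md`.

```typescript
// Hypothetical shapes: field names and enum values are illustrative assumptions.
const SCHEMA_VERSION = "experimental-0"; // assumed version label

type SessionMode = "explore" | "implement" | "review"; // assumed mode set
type TaskType = "question" | "edit" | "debug"; // assumed task types
type ContinuityCost = "none" | "low" | "high"; // categorical, not a numeric score

interface SessionState {
  schemaVersion: string;
  sessionId: string;
  mode: SessionMode; // durable across turns
  currentTargetId: string | null;
}

interface TaskClassification {
  taskType: TaskType; // describes the latest turn only
  confidence: "low" | "medium" | "high";
}

interface RoutingDecision {
  schemaVersion: string;
  sessionId: string;
  sessionMode: SessionMode;
  taskType: TaskType;
  chosenTargetId: string | null; // null means the router refused
  hardConstraintsApplied: string[]; // kept separate from soft preferences
  softPreferencesApplied: string[];
  continuityCost: ContinuityCost;
  // An override is recorded even when a hard constraint prevents honoring it.
  manualOverride: { requestedTargetId: string; honored: boolean } | null;
  explanation: string; // persisted for explain and replay
}

const exampleSession: SessionState = {
  schemaVersion: SCHEMA_VERSION,
  sessionId: "session-1",
  mode: "implement",
  currentTargetId: "target-a",
};

const exampleTask: TaskClassification = { taskType: "question", confidence: "high" };

const exampleDecision: RoutingDecision = {
  schemaVersion: SCHEMA_VERSION,
  sessionId: exampleSession.sessionId,
  sessionMode: exampleSession.mode,
  taskType: exampleTask.taskType,
  chosenTargetId: "target-a",
  hardConstraintsApplied: ["privacy"],
  softPreferencesApplied: ["user-preference"],
  continuityCost: "high",
  manualOverride: { requestedTargetId: "target-b", honored: false },
  explanation: "Override to target-b rejected by privacy constraint; kept target-a.",
};
```

In this sketch the session stays in `implement` mode while the latest turn is classified as a `question`, which is the distinction the first requirement asks for.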
117+
118+
Acceptance criteria:
119+
120+
* the contracts document exists and is specific enough to implement against
121+
* the contract version is labeled experimental and provisional
122+
* the minimal normative core and Claude mapping appendix are both present
123+
* current Claude workflow evidence can be mapped into the contract shapes
124+
* target metadata is described as execution targets, not abstract models
125+
* handoff/context-transfer is represented in the core contracts, not only in vendor appendix notes
126+
127+
### Milestone 2: Session Controller And Policy Upgrade
128+
129+
Move from mostly prompt-local routing to session-aware routing.
130+
131+
Required work:
132+
133+
* add a session controller that owns mode transitions
134+
* separate task typing from resolved session mode
135+
* derive capabilities from resolved mode plus task-specific needs
136+
* add continuity-cost-aware switch decisions
137+
* add explicit escalation rules for low confidence, user corrections, repeated failures, and high-risk implementation
138+
* include privacy constraints, target availability, and client compatibility as named hard-constraint policy inputs
139+
* include user preferences and project overrides as named soft-constraint policy inputs
140+
* preserve clear refusal behavior when no target satisfies hard constraints
141+
142+
Implementation depth for new policy inputs can be staged, but each input must be represented explicitly in policy contracts and decision explanation.
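
As a minimal sketch of how the named policy inputs could interact, the TypeScript below filters targets by hard constraints first, refuses when nothing is eligible, and only then considers overrides, continuity cost, and soft preferences. The `Target` and `PolicyInput` shapes, the `decide` function, and the ordering of the soft rules are all assumptions for illustration. The plan itself does not prescribe them.

```typescript
// Hypothetical policy sketch: all names and the rule ordering are assumptions.
type ContinuityCost = "none" | "low" | "high";

interface Target {
  id: string;
  available: boolean;
  allowsPrivateData: boolean;
  compatibleClients: string[];
}

interface PolicyInput {
  client: string;
  requiresPrivateData: boolean;
  currentTargetId: string | null;
  preferredTargetId: string | null; // soft preference
  manualOverrideTargetId: string | null;
  continuityCost: ContinuityCost;
}

interface Decision {
  targetId: string | null; // null means refusal
  reason: string;
}

// Hard constraints: availability, client compatibility, privacy.
function satisfiesHardConstraints(t: Target, input: PolicyInput): boolean {
  const clientOk = t.compatibleClients.some((c) => c === input.client);
  const privacyOk = !input.requiresPrivateData || t.allowsPrivateData;
  return t.available && clientOk && privacyOk;
}

function decide(targets: Target[], input: PolicyInput): Decision {
  const eligible = targets.filter((t) => satisfiesHardConstraints(t, input));
  const fallback = eligible[0];
  if (fallback === undefined) {
    return { targetId: null, reason: "refused: no target satisfies hard constraints" };
  }
  const isEligible = (id: string | null): boolean =>
    id !== null && eligible.some((t) => t.id === id);

  // A manual override is honored only inside the eligible set.
  if (isEligible(input.manualOverrideTargetId)) {
    return { targetId: input.manualOverrideTargetId, reason: "manual override" };
  }
  // Continuity: stay put when switching would be costly.
  if (isEligible(input.currentTargetId) && input.continuityCost === "high") {
    return { targetId: input.currentTargetId, reason: "kept current target: high continuity cost" };
  }
  if (isEligible(input.preferredTargetId)) {
    return { targetId: input.preferredTargetId, reason: "soft preference" };
  }
  return { targetId: fallback.id, reason: "first eligible target" };
}
```

Because overrides and preferences are only ever checked against the already-filtered `eligible` list, neither can bypass a hard constraint in this sketch.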
143+
144+
Acceptance criteria:
145+
146+
* mode transition logic is deterministic and testable
147+
* the router does not switch targets when the current target still satisfies the turn's requirements
148+
* policy decisions are explainable in terms of session state, task type, hard constraints, soft constraints, and continuity cost
149+
* existing route labels such as `quick`, `balanced`, and `best coder` still map cleanly onto the upgraded policy
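
One way to make "deterministic and testable" concrete is a pure transition function over explicit signals. The sketch below treats the existing route labels as an escalation ladder, which is an assumption, as are the `TurnSignals` shape, the `nextRoute` name, and the failure threshold of 2.

```typescript
// Hypothetical escalation sketch: the ladder ordering and threshold are assumptions.
type RouteLabel = "quick" | "balanced" | "best coder";

interface TurnSignals {
  lowConfidence: boolean;
  userCorrection: boolean;
  consecutiveFailures: number;
  highRiskImplementation: boolean;
}

const FAILURE_THRESHOLD = 2; // assumed value

function escalateOneStep(route: RouteLabel): RouteLabel {
  switch (route) {
    case "quick":
      return "balanced";
    case "balanced":
      return "best coder";
    case "best coder":
      return "best coder"; // already at the top of the ladder
  }
}

// Pure function: the same route and signals always produce the same result.
function nextRoute(current: RouteLabel, signals: TurnSignals): RouteLabel {
  if (signals.highRiskImplementation) {
    return "best coder";
  }
  const shouldEscalate =
    signals.lowConfidence ||
    signals.userCorrection ||
    signals.consecutiveFailures >= FAILURE_THRESHOLD;
  return shouldEscalate ? escalateOneStep(current) : current;
}
```

With no escalation signal set, `nextRoute` returns the current route unchanged, so a good-enough target is never abandoned by default.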
150+
151+
### Milestone 3: Claude Workflow On Router Boundary
152+
153+
Refit the existing Claude workflow so it consumes the router as a client integration rather than embedding router assumptions directly.
154+
155+
Required work:
156+
157+
* make the Claude launch and resume path consume router contract outputs
158+
* ensure route context persistence uses canonical routing-decision and session shapes
159+
* ensure Claude-side context injection and route context writes map to the core handoff/context-transfer contract
160+
* align `switchboard explain` with the contract-backed evidence model
161+
* keep current continuity semantics and fail-closed behavior intact
162+
163+
Acceptance criteria:
164+
165+
* the Claude workflow remains fully functional
166+
* the router can be exercised without depending on Claude-specific launch details
167+
* logs distinguish router decision data from Claude workflow execution data
168+
* handoff/context-transfer data in Claude flow matches core contract fields
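
The log-separation criterion could be met with a discriminated union, sketched below under stated assumptions. `RoutingLogEvent` is named in this plan, but the `source` discriminator, the two member interfaces, and all their fields are hypothetical.

```typescript
// Hypothetical log shapes: the discriminator and fields are assumptions.
interface RouterDecisionEvent {
  source: "router";
  sessionId: string;
  chosenTargetId: string | null;
  continuityCost: "none" | "low" | "high";
}

interface ClaudeExecutionEvent {
  source: "claude-workflow";
  sessionId: string;
  phase: "launch" | "resume";
  succeeded: boolean;
}

type RoutingLogEvent = RouterDecisionEvent | ClaudeExecutionEvent;

// Selects router decision data without touching Claude execution fields.
function routerEventsOnly(events: RoutingLogEvent[]): RouterDecisionEvent[] {
  const out: RouterDecisionEvent[] = [];
  events.forEach((e) => {
    if (e.source === "router") {
      out.push(e); // narrowed to RouterDecisionEvent by the discriminator
    }
  });
  return out;
}

const sampleLog: RoutingLogEvent[] = [
  { source: "router", sessionId: "session-1", chosenTargetId: "target-a", continuityCost: "low" },
  { source: "claude-workflow", sessionId: "session-1", phase: "resume", succeeded: true },
];
```

A second surface would add its own execution event type to the union while reusing `RouterDecisionEvent` unchanged.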
169+
170+
### Milestone 4: Explainability And Outcome Attribution Foundation
171+
172+
Prioritize explainability and outcome attribution first, then support policy tuning on top of that foundation.
173+
174+
Required work:
175+
176+
* normalize routing log events around contract types
177+
* ensure explain output can reconstruct why a route was chosen
178+
* make outcome attribution fields explicit and consistent across routed turns
179+
* keep handoff/context-transfer evidence visible in explain and logs
180+
* preserve enough information to replay a decision offline against fixtures or captured sessions
181+
* define a minimal outcome taxonomy for future evaluation
182+
183+
Acceptance criteria:
184+
185+
* a routed turn can be inspected after the fact without reading raw implementation details
186+
* outcome attribution is queryable from contract-backed log fields
187+
* fixture or recorded-session evaluation can compare expected and actual decisions
188+
* policy changes can be tested against stored evidence without live Claude runs after explain and attribution foundations are stable
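
Offline replay against fixtures can be sketched as a comparison of expected and actual decisions. The `Fixture`, `Mismatch`, and `Policy` names, the fixture fields, and the toy `stickyPolicy` are all illustrative assumptions, not the project's real evaluation harness.

```typescript
// Hypothetical replay sketch: fixture shape and policy signature are assumptions.
interface ReplayInput {
  taskType: string;
  currentTargetId: string | null;
  continuityCost: "none" | "low" | "high";
}

interface Fixture {
  name: string;
  input: ReplayInput;
  expectedTargetId: string | null;
}

interface Mismatch {
  name: string;
  expected: string | null;
  actual: string | null;
}

type Policy = (input: ReplayInput) => string | null;

// Runs a candidate policy over stored fixtures and reports disagreements.
function replay(fixtures: Fixture[], policy: Policy): Mismatch[] {
  const mismatches: Mismatch[] = [];
  fixtures.forEach((f) => {
    const actual = policy(f.input);
    if (actual !== f.expectedTargetId) {
      mismatches.push({ name: f.name, expected: f.expectedTargetId, actual });
    }
  });
  return mismatches;
}

// Toy candidate policy: keep the current target only when continuity cost is high.
const stickyPolicy: Policy = (input) =>
  input.continuityCost === "high" && input.currentTargetId !== null
    ? input.currentTargetId
    : "default-target";

const fixtures: Fixture[] = [
  {
    name: "keep-on-high-cost",
    input: { taskType: "edit", currentTargetId: "target-a", continuityCost: "high" },
    expectedTargetId: "target-a",
  },
  {
    name: "switch-on-low-cost",
    input: { taskType: "debug", currentTargetId: "target-a", continuityCost: "low" },
    expectedTargetId: "target-b",
  },
];

const mismatches = replay(fixtures, stickyPolicy);
```

Here the second fixture disagrees with the candidate policy, so `mismatches` holds one entry that names it. No live Claude run is involved.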
189+
190+
### Milestone 5: Second Surface Proof
191+
192+
Conditionally validate router-boundary reuse by exercising one additional integration path only after Milestones 1 through 4 are stable.
193+
194+
Preferred options:
195+
196+
1. an advisory integration that returns a recommended target class and explanation without direct execution control
197+
2. a narrow adapter-backed execution path using the existing adapter layer
198+
199+
Selection rule:
200+
201+
Choose the smallest integration that proves the router boundary without creating a second full workflow product.
202+
203+
Decision gate:
204+
205+
Milestone 5 proceeds only if earlier milestones show stable contracts, stable explainability/attribution, and no unresolved regressions in the Claude workflow.
206+
207+
Acceptance criteria:
208+
209+
* the second surface consumes router contracts rather than Claude-specific shapes
210+
* the router can make a decision for that surface without special-casing Claude semantics
211+
* the result demonstrates that the boundary is reusable, not just abstractly documented
212+
213+
If the gate is not met, Milestone 5 is deferred without blocking phase completion.
214+
215+
## Suggested Build Order
216+
217+
1. Write `docs/contracts/router-contracts.md`.
218+
2. Keep the first version experimental and revision-friendly while evidence broadens.
219+
3. Refactor router internals around session state, task type, and routing decision contracts.
220+
4. Upgrade router tests to cover mode transitions, continuity cost, escalation, refusal, and override behavior.
221+
5. Adapt the Claude workflow and explain path to the new contracts.
222+
6. Add replayable evidence fixtures.
223+
7. If the decision gate is met, prove the second integration surface.
224+
225+
This order matters.
226+
227+
If the second integration is attempted before the contracts and session-policy work, the repo will likely accumulate adapter-specific duplication rather than a real control plane.
228+
229+
## Engineering Principles
230+
231+
* Keep the current Claude workflow working throughout the phase.
232+
* Prefer deterministic policy over heuristic sprawl.
233+
* Treat continuity as a first-class cost, not a side effect.
234+
* Preserve fail-closed behavior where routing trust is required.
235+
* Avoid widening the target set until the contract boundary is stable.
236+
* Build logs and explainability for replay, not only for human debugging.
237+
238+
## Early Commitments To Delay Pending Verification
239+
240+
The following commitments should remain intentionally deferred until assumptions are verified with evidence beyond the current Claude-first workflow.
241+
242+
1. Freeze of task taxonomy and mode set.
243+
Reason: current evidence is strong for a subset of software-delivery tasks but not broad enough to lock a universal taxonomy.
244+
Verification signal: misclassification and override rates stay low across at least one non-Claude validation surface.
245+
Interim default: keep current mode and task model deterministic, but allow taxonomy evolution in experimental versions.
246+
247+
2. Numeric continuity-cost scoring model.
248+
Reason: weighted scoring can imply precision that current evidence does not support.
249+
Verification signal: replay evidence shows categorical continuity decisions are insufficient for stable switching behavior.
250+
Interim default: keep continuity cost categorical with transparent explanation fields.
251+
252+
3. Stable risk model ownership boundary.
253+
Reason: it is not yet proven whether risk should be router-owned, adapter-provided, or hybrid.
254+
Verification signal: at least one additional surface demonstrates consistent risk signal quality and portability.
255+
Interim default: treat risk as optional and explicitly sourced in logs.
256+
257+
4. Hard enforcement semantics for privacy, availability, and compatibility.
258+
Reason: strict enforcement may over-refuse before policy inputs are reliable across clients.
259+
Verification signal: policy replay shows low false-refusal rates with named hard-constraint inputs.
260+
Interim default: represent these constraints in contracts now and stage enforcement depth by milestone.
261+
262+
5. Stable outcome attribution taxonomy.
263+
Reason: early taxonomies often calcify around one workflow and become hard to migrate.
264+
Verification signal: attribution labels remain actionable and low-ambiguity across fixture replay and live evidence.
265+
Interim default: keep outcome attribution minimal and revision-friendly.
266+
267+
6. Mandatory second-surface implementation.
268+
Reason: breadth can dilute focus if core explainability and attribution are not stable first.
269+
Verification signal: Milestones 1 through 4 meet stability gates with no unresolved Claude regressions.
270+
Interim default: keep Milestone 5 conditional and deferrable with explicit rationale.
271+
272+
7. Router package extraction timeline.
273+
Reason: extraction before boundary hardening can lock unstable APIs and increase migration cost.
274+
Verification signal: contract-backed boundary survives real usage with limited churn and clear adapter seams.
275+
Interim default: keep extraction as a post-phase decision, not a phase commitment.
276+
277+
Review cadence:
278+
279+
* Re-evaluate this list at the end of each milestone.
280+
* Promote a deferred item to a committed design decision only when its verification signal is met and documented.
281+
282+
Decision log template:
283+
284+
Use this template whenever a deferred commitment is reviewed, promoted, or explicitly kept deferred.
285+
286+
```text
Decision ID: DEC-YYYY-MM-DD-<slug>
Related deferred item: <item number and title>
Status: proposed | committed | deferred | rejected
Date: <YYYY-MM-DD>
Owners: <names>

Context:
- What was uncertain and why this decision is being reviewed now.

Options considered:
- Option A: <summary>
- Option B: <summary>
- Option C: <summary>

Tradeoffs:
- Option A: <key pros/cons>
- Option B: <key pros/cons>
- Option C: <key pros/cons>

Verification signal:
- Expected signal from this phase plan:
- Evidence observed:

Decision:
- Chosen option:
- Scope of commitment:
- What remains intentionally deferred:

Consequences:
- Near-term implementation impact:
- Test and replay impact:
- Migration impact:

Follow-up:
- Next review milestone:
- Linked artifacts (logs, fixtures, docs, PRs):
```
324+
325+
Storage guidance:
326+
327+
* Keep entries in a simple decision log file under docs (for example, `docs/decision-log.md`).
328+
* Link each entry back to the affected milestone and deferred-item number in this plan.
329+
* Do not treat a design as committed unless a completed decision entry exists.
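
The storage guidance could be enforced mechanically. The sketch below checks the `DEC-YYYY-MM-DD-<slug>` ID format and applies the rule that an item counts as committed only when a completed entry exists. The `DecisionEntry` fields, the lowercase kebab-case slug convention, and the reading of "completed" as "has observed evidence recorded" are assumptions.

```typescript
// Hypothetical decision-log check: entry fields and slug convention are assumptions.
type DecisionStatus = "proposed" | "committed" | "deferred" | "rejected";

interface DecisionEntry {
  id: string;
  relatedItem: number; // deferred-item number in this plan
  status: DecisionStatus;
  evidenceObserved: string;
}

// DEC-YYYY-MM-DD-<slug>, with the slug assumed to be lowercase kebab-case.
const DECISION_ID_PATTERN = /^DEC-\d{4}-\d{2}-\d{2}-[a-z0-9]+(?:-[a-z0-9]+)*$/;

function isValidDecisionId(id: string): boolean {
  return DECISION_ID_PATTERN.test(id);
}

// An item is committed only if a well-formed entry with evidence says so.
function isCommitted(itemNumber: number, log: DecisionEntry[]): boolean {
  return log.some(
    (e) =>
      e.relatedItem === itemNumber &&
      e.status === "committed" &&
      isValidDecisionId(e.id) &&
      e.evidenceObserved.trim() !== ""
  );
}

const sampleDecisionLog: DecisionEntry[] = [
  {
    id: "DEC-2026-05-10-milestone-1-router-contracts-closeout",
    relatedItem: 2,
    status: "deferred",
    evidenceObserved: "Categorical continuity cost was sufficient in replay.",
  },
];
```

The sample reuses the decision ID recorded for Milestone 1 in this plan. Its status, item number, and evidence text are invented for the example.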
330+
331+
## Exit Criteria
332+
333+
This phase is complete when all of the following are true:
334+
335+
1. The router contract document exists and matches implementation reality.
336+
2. Session-aware routing decisions are controlled by a real session controller rather than only prompt keywords.
337+
3. The Claude workflow is cleanly downstream of router outputs.
338+
4. Explain and log artifacts use stable contract-backed shapes.
339+
5. Handoff/context-transfer is represented and validated across contracts, Claude workflow mapping, and explain/log evidence.
340+
6. If the Milestone 5 gate is met and the milestone is executed, a second integration path demonstrates boundary reuse; otherwise, the deferral is explicitly documented with rationale.
341+
7. The repo is in a credible position to decide whether to extract `@model-switchboard/router`.
342+
343+
## What Success Enables Next
344+
345+
If this phase succeeds, the project can make a high-confidence choice among the next strategic options:
346+
347+
1. extract and publish `@model-switchboard/router`
348+
2. add a second workflow product on top of the same router boundary
349+
3. build advisory integrations for surfaces where direct execution control is unavailable
350+
4. invest in richer policy evaluation and eventually learned routing
351+
352+
Without this phase, expansion to more surfaces would likely increase complexity faster than product value.
