Ilchul is a portable, objective-driven, learning E2E execution harness. It treats agent work as policy-driven experiments, runs selected strategies through a safe TaskGraph runtime, evaluates results against explicit objectives, and records prediction-vs-actual outcomes so future execution policies improve.
Core Decisions
Unified Run lifecycle. Deep Interview, Ralph, Autoresearch, and Integrate are compatibility bundles over one Run lifecycle, not separate top-level runtime models.
Every Run has a RunObjective. Intake may start with draft RunObjective, but Execute cannot start until RunObjective is approved.
Objective taxonomy is split. RunObjective is this run's success/failure/repair criteria. LearningObjective is long-term policy improvement. PolicyUtility is pre-dispatch strategy utility.
Full-lifecycle Strong PhasePreset. Intake through Close all have phase presets. Presets define protocol; RunObjective may strengthen but not weaken them.
Learn follows Integrate. Reward/calibration must observe integration cost, conflicts, cleanup/retention, and final evidence before learning.
TaskGraph is a runtime primitive. Single-agent, sequential, DAG-parallel, and team-parallel runs all execute through TaskGraph. Team/parallel is policy, not a different runtime model.
PolicySelection chooses strategy. It may create graph sketches for simulation, but concrete TaskGraph creation belongs to graph-execution.
Graph-execution owns concrete TaskGraph creation. It uses approved RunObjective plus selected PolicySelection, then applies readiness, claim, lease, worker, and evidence gates.
GateEngine and Verifier are separate. Verifier validates evidence. GateEngine decides whether transitions are allowed.
Three-layer gates. Transitions require HardInvariantGate, PhasePresetGate, and RunObjectiveGate to pass.
Evidence-required completion. Agent claims are not authority. Task, phase, and run completion require evidence.
Fail closed. Gate failure denies without mutation unless a later explicit recovery transition records the repair path.
Policy hints are advisory only. Reward/calibration may produce hints, but actual strategy changes must be recorded by PolicySelection events.
Run closes by sealing. A run is not complete until evidence, artifacts, RewardRecord, cleanup/retention state, and replay/audit events are sealed.
RL-shaped roadmap, MVP-shaped implementation. The conceptual structure should point toward reinforcement-learning-style policy improvement, but MVP implementation should remain a minimal calibration-ready graph runtime.
Thin uniform phase engines. The eight lifecycle phases may have one engine each, but every PhaseEngine must follow the same thin contract and produce phase outputs only.
Authority stays in shared runtime. Phase engines do not own RunState mutation, transition authority, or evidence authority. RunOrchestrator, GateEngine, Verifier, EventStore, and RunStateStore own those responsibilities.
Side effects are runtime-owned. Phase engines return SideEffectRequest values; SideEffectRunner executes allowlisted actions only after durable transition intent is committed.
Index-first Runtime State. RunState, PhaseStatus, TaskGraph, RuntimeTask, WorkerState, ClaimLease, and SideEffectRecord are operational indexes, not payload stores.
Do not create an engine for every noun. Separate runtime components only when behavior, state, or failure modes truly differ. Phase engines are acceptable when they share one contract and stay thin.
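The three-layer gate and fail-closed decisions above can be sketched in TypeScript as follows. This is a minimal illustration, not the project's schema: `RunSnapshot`, `GateDecision`, `GateCheck`, `evaluateGates`, `tryTransition`, and the example checks are hypothetical names, and the phase names "Intake" and "Execute" are used only as sample values.

```typescript
// Hypothetical shapes for illustration; real schemas live in the child issues.
type GateLayer = "HardInvariantGate" | "PhasePresetGate" | "RunObjectiveGate";

interface RunSnapshot {
  phase: string;
  version: number;
  objectiveApproved: boolean;
  evidenceRefs: string[];
}

interface GateDecision {
  allowed: boolean;
  deniedBy: GateLayer[]; // every layer that rejected the transition
}

type GateCheck = (state: RunSnapshot, nextPhase: string) => boolean;
type GateChecks = Partial<Record<GateLayer, GateCheck>>;

const GATE_LAYERS: GateLayer[] = ["HardInvariantGate", "PhasePresetGate", "RunObjectiveGate"];

// All three layers must pass. A missing or throwing check counts as a denial.
function evaluateGates(checks: GateChecks, state: RunSnapshot, nextPhase: string): GateDecision {
  const deniedBy: GateLayer[] = [];
  for (const layer of GATE_LAYERS) {
    const check = checks[layer];
    let passed = false;
    try {
      passed = check !== undefined && check(state, nextPhase);
    } catch {
      passed = false; // fail closed on errors
    }
    if (!passed) deniedBy.push(layer);
  }
  return { allowed: deniedBy.length === 0, deniedBy };
}

// On denial the caller gets the same snapshot back: no mutation happened.
function tryTransition(checks: GateChecks, state: RunSnapshot, nextPhase: string) {
  const decision = evaluateGates(checks, state, nextPhase);
  if (!decision.allowed) return { state, decision };
  const next: RunSnapshot = { ...state, phase: nextPhase, version: state.version + 1 };
  return { state: next, decision };
}

// Example checks (assumed rules, chosen to mirror the decisions above).
const exampleChecks: GateChecks = {
  HardInvariantGate: (s) => s.version >= 0,
  // Leaving Execute requires at least one evidence ref.
  PhasePresetGate: (s) => s.phase !== "Execute" || s.evidenceRefs.length > 0,
  // Entering Execute requires an approved RunObjective.
  RunObjectiveGate: (s, next) => next !== "Execute" || s.objectiveApproved,
};
```

In this sketch a draft objective blocks the move into Execute and the caller receives the untouched snapshot, while an empty check table denies on all three layers rather than allowing by default.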
Canonical Runtime Terms
Use these exact terms in child issues, docs, schemas, tests, and implementation:
| Canonical term | Meaning |
| --- | --- |
| RunObjective | Approved run success/failure/repair criteria. |
| TaskGraph | Execute-phase graph primitive for single-agent, sequential, DAG-parallel, and team-parallel runs. |
| RuntimeTask.dependsOn | RuntimeTask dependency field. |
| WorkerState | Worker operational state object. |
| ClaimLease | Task ownership and lease record. |
| EvidenceRef | Evidence reference required for task, phase, and run completion. |
| EvaluationResult | Objective evaluation output. |
| IntegrationCandidate | Ref-backed candidate for gated integration. |
| RewardRecord | Individual reward/prediction/calibration record. |
| PolicyHint | Advisory policy signal. Policy changes must still be recorded by PolicySelection. |
| SideEffectRequest | Requested external action returned by a PhaseEngine. |
| SideEffectRecord | Runtime-owned operational state for side-effect execution and result. |
Closed design issues define the language.
MVP-critical open issues define the first runtime.
Shallow-first issues record signals without autonomous behavior.
Post-MVP issues harden convergence after #196 exists.
dispatch of claimed ready tasks through the adapter/substrate boundary;
heartbeat tracking and stale/unhealthy projection;
structured worker report capture;
EvidenceRef extraction from reports, logs, test output, diffs, or artifacts;
evidence-required task completion;
operator-visible task/worker status.
Exit gate: smoke run executes at least two independent tasks in parallel through fake workers or a documented local substitute and completes only with evidence refs.
Ilchul should follow OMX-style discipline in Ilchul terms:
RunState is source of truth.
Agents produce evidence, not authority.
Every phase transition is gated.
Gate failure denies without mutation.
Durable state is written before side effects.
Parallel work requires claim and lease.
Readiness is an explicit snapshot.
Completion requires evidence.
Strong PhasePreset defines phase protocol.
RunObjective can strengthen but not weaken.
Symbolic plans remap to concrete runtime ids.
Recovery is explicit and inspectable.
MVP Minimality Rule
The roadmap defines the conceptual contract. MVP implementation should collapse concepts into the smallest runtime surface that preserves:
RunObjective approval before execution;
TaskGraph as the execute primitive;
gate/evidence-required transitions;
claim/lease for parallel work;
RewardRecord records;
sealed close.
MVP learning is calibration-ready, not fully self-optimizing. It records predictions, actual outcomes, deltas, and advisory PolicyHint values. It does not automatically mutate policy or objective weights.
Phase count does not imply heavyweight components. The full lifecycle may have eight thin PhaseEngines if they all share one contract and delegate common concerns to shared runtime services.
Minimality rule for phase engines:
PhaseEngine produces phase outputs, evidence refs, blockers, and proposed patches.
PhaseEngine does not mutate RunState directly.
PhaseEngine does not decide transition authority.
PhaseEngine does not verify its own completion.
RunOrchestrator records events and advances phases.
Verifier validates evidence.
GateEngine decides transitions.
RunState snapshot is the MVP operational source of truth; EventStore is append-only audit/replay support.
commitTransition(patch, event) records durable transition intent before external side effects.
SideEffectRunner executes only allowlisted, idempotency-keyed side effects.
Operational state stores status/version/refs/blockers/timestamps; payloads live behind ArtifactRef, EvidenceRef, EventStore, or external refs.
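The side-effect rules in this list (phase engines only return requests; the runner executes allowlisted, idempotency-keyed actions) might look roughly like the sketch below. The field names, the `PhaseOutput` shape, and `applyPhaseOutput` are assumptions made for illustration; the roadmap fixes only the type names `SideEffectRequest`, `SideEffectRecord`, and `SideEffectRunner`.

```typescript
// Hypothetical field names; the source only fixes the type names.
interface SideEffectRequest {
  action: string;         // must appear on the runner's allowlist
  idempotencyKey: string; // the same key is never executed twice
  payloadRef: string;     // payload stays behind a ref, not inline
}

interface SideEffectRecord {
  idempotencyKey: string;
  action: string;
  outcome: "executed" | "rejected";
}

// A thin phase engine only returns outputs; it performs no external action.
interface PhaseOutput {
  evidenceRefs: string[];
  blockers: string[];
  sideEffects: SideEffectRequest[];
}

class SideEffectRunner {
  private readonly allowlist: Set<string>;
  private readonly execute: (req: SideEffectRequest) => void;
  private readonly records = new Map<string, SideEffectRecord>();

  constructor(allowlist: string[], execute: (req: SideEffectRequest) => void) {
    this.allowlist = new Set(allowlist);
    this.execute = execute;
  }

  // Intended to be called only after the transition intent is durably committed.
  run(req: SideEffectRequest): SideEffectRecord {
    const seen = this.records.get(req.idempotencyKey);
    if (seen !== undefined) return seen; // replayed request: do not execute again

    const allowed = this.allowlist.has(req.action);
    if (allowed) this.execute(req);
    const record: SideEffectRecord = {
      idempotencyKey: req.idempotencyKey,
      action: req.action,
      outcome: allowed ? "executed" : "rejected",
    };
    this.records.set(req.idempotencyKey, record);
    return record;
  }
}

// The runtime, not the phase engine, turns requests into records.
function applyPhaseOutput(runner: SideEffectRunner, output: PhaseOutput): SideEffectRecord[] {
  return output.sideEffects.map((req) => runner.run(req));
}
```

Keying the record table by `idempotencyKey` means a replayed transition returns the earlier record instead of repeating the external action, which is one plausible reading of "idempotency-keyed".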
Acceptance Gates
Before runtime implementation becomes authoritative:
RunContract core remains free of GitHub/PR/Ragna/Discord/kapi-agent semantics.
Runtime state schemas and event replay behavior are defined.
Verification matrix exists before implementation PRs claim runtime readiness.
MVP implementation demonstrates the minimality rule: phase engines are thin and uniform, while shared runtime services own state, gate, evidence, event, and persistence authority.
External actions are not hidden inside PhaseEngines; they are represented as SideEffectRequest records and executed through SideEffectRunner after transition commit.
Open Questions
These should become issues only if the need remains after active design tracks mature.
2026-05-18 — Added MVP Minimality Rule: keep the roadmap RL-shaped but implement the MVP as a compact calibration-ready graph runtime.
2026-05-18 — Added Issue Tiers to separate MVP-critical open work, closed design records, shallow-first learning/policy work, and post-MVP integration/repair hardening.
2026-05-18 — Refined minimality rule: keep eight thin uniform PhaseEngines if useful, but centralize state mutation, transition authority, evidence validation, events, and persistence in shared runtime services.
2026-05-18 — Added Index-first Runtime State rule: operational state is compact status/version/ref/blocker/timestamp index; payloads live in artifacts, evidence, events, or external refs.
Roadmap: objective-driven learning parallel runtime harness
Purpose
This roadmap is the index and dependency map for Ilchul's next runtime direction: a portable, objective-driven, learning E2E execution harness.
Ilchul should decompose work, select an execution policy, run work through a safe graph-based runtime, evaluate outcomes against explicit objectives, integrate only gated results, record reward/prediction deltas, and improve future policy selection through advisory calibration.
#167 owns structure, boundaries, dependency order, MVP scope, and child-issue routing. Detailed schemas and phase semantics live in sub-issues.
North Star
Target identity:
Core Decisions
RunObjective is this run's success/failure/repair criteria. LearningObjective is long-term policy improvement. PolicyUtility is pre-dispatch strategy utility.
Canonical Runtime Terms
Use these exact terms in child issues, docs, schemas, tests, and implementation:
RunObjective, TaskGraph, RuntimeTask.dependsOn, WorkerState, ClaimLease, EvidenceRef, EvaluationResult, IntegrationCandidate, RewardRecord, PolicyHint, SideEffectRequest, SideEffectRecord
Canonical worker status strings:
Ownership Map
.ilchul storage, config, worker retention
Track Status
.ilchul is forward storage; avoid unsafe .kapi mutation; define portable adapters.
Issue Tiers
The number of child issues should not imply that all of them must be implemented at full strength for MVP. Tiers define implementation pressure.
MVP-critical open
These must be resolved enough to build the compact runtime MVP:
Design records / already closed
These are decision records or seed designs. They guide implementation but should not be read as additional full-strength MVP work:
.ilchul storage direction
Shallow-first
These should start as record-only, rule-based, or advisory behavior. They should not become heavy engines before #196 produces real runtime evidence:
Post-MVP hardening
These matter, but should not block the runtime MVP:
Tiering rule:
Dependency Order
Recommended order:
Development Order
This is the implementation order for PR planning. It is more binding than the conceptual track list above when deciding what to build next.
Step 0 — Keep closed design issues as language, not new MVP scope
Do not reopen broad implementation from closed design records unless a later open issue requires it.
.ilchul direction, but should not expand MVP scope by themselves.
Step 1 — Build the thin runtime spine (#185 + #186)
Before adding more behavior, define the smallest shared runtime authority surface:
RunState snapshot as compact operational index;
RuntimeEvent taxonomy for run, phase, task, worker, evidence, gate, and side-effect transitions;
commitTransition(patch, event) semantics with version check and event append;
Exit gate: runtime state and event examples exist for one successful run and one repair/stale-worker path.
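A minimal in-memory sketch of `commitTransition(patch, event)` with a version check and event append is shown here. It assumes the event carries an `expectedVersion` field for optimistic concurrency; that field, `RunStateIndex`, `RunStatePatch`, and `InMemoryRunStore` are hypothetical and not taken from #185 or #186.

```typescript
// Hypothetical shapes: a compact index (status/version/refs), no payloads.
interface RunStateIndex {
  runId: string;
  version: number;
  phase: string;
  blockers: string[];
}

interface RuntimeEvent {
  type: string;
  runId: string;
  expectedVersion: number; // assumption: the event names the version it was built against
}

type RunStatePatch = Partial<Pick<RunStateIndex, "phase" | "blockers">>;

class InMemoryRunStore {
  private state: RunStateIndex;
  private readonly events: RuntimeEvent[] = [];

  constructor(initial: RunStateIndex) {
    this.state = { ...initial, blockers: [...initial.blockers] };
  }

  // Rejects stale writers without mutating anything; otherwise appends the
  // event and bumps the version in one step.
  commitTransition(patch: RunStatePatch, event: RuntimeEvent): { ok: boolean; state: RunStateIndex } {
    if (event.expectedVersion !== this.state.version) {
      return { ok: false, state: this.snapshot() };
    }
    this.events.push(event);
    this.state = { ...this.state, ...patch, version: this.state.version + 1 };
    return { ok: true, state: this.snapshot() };
  }

  // Callers get copies, so they cannot mutate the store from outside.
  snapshot(): RunStateIndex {
    return { ...this.state, blockers: [...this.state.blockers] };
  }

  eventLog(): RuntimeEvent[] {
    return [...this.events];
  }
}
```

A real store would persist the snapshot and the event together before any side effect runs; this sketch only shows the ordering and the stale-version rejection.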
Step 2 — Lock adapter/substrate boundary (#188)
Define the minimum portable worker contract before worker execution:
AgentAdapter for launch/send/capture/interrupt/report parsing;
ExecutionSubstrate for tmux, process, native-subagent, and future substrates;
Exit gate: fake adapter/substrate can satisfy the contract without depending on tmux or a real agent.
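To make the exit gate concrete, the sketch below shows one possible shape for the two contracts and an in-memory fake for each. The method names follow the launch/send/capture/interrupt/report-parsing list, but every signature, the `WorkerReport` shape, and the `REPORT <taskId> <refs>` line format are assumptions for illustration, not the #188 contract.

```typescript
// Hypothetical contracts; the signatures are assumptions.
interface ExecutionSubstrate {
  kind: string;
  start(command: string): string; // returns a session id
  write(sessionId: string, input: string): void;
  read(sessionId: string): string;
  stop(sessionId: string): void;
}

interface WorkerReport {
  taskId: string;
  evidenceRefs: string[];
}

interface AgentAdapter {
  launch(taskId: string): string;
  send(sessionId: string, message: string): void;
  capture(sessionId: string): string;
  interrupt(sessionId: string): void;
  parseReport(raw: string): WorkerReport | null;
}

// In-memory substrate: no tmux, no child process.
class FakeSubstrate implements ExecutionSubstrate {
  kind = "fake";
  private readonly buffers = new Map<string, string[]>();
  private nextId = 1;

  start(_command: string): string {
    const id = `fake-${this.nextId++}`;
    this.buffers.set(id, []);
    return id;
  }
  write(sessionId: string, input: string): void {
    const buffer = this.buffers.get(sessionId);
    if (buffer === undefined) throw new Error(`unknown session ${sessionId}`);
    buffer.push(input);
  }
  read(sessionId: string): string {
    return (this.buffers.get(sessionId) ?? []).join("\n");
  }
  stop(sessionId: string): void {
    this.buffers.delete(sessionId);
  }
}

// Fake agent: answers every message with a canned report line.
class FakeAgentAdapter implements AgentAdapter {
  private readonly substrate: ExecutionSubstrate;
  constructor(substrate: ExecutionSubstrate) {
    this.substrate = substrate;
  }
  launch(taskId: string): string {
    return this.substrate.start(`fake-agent ${taskId}`);
  }
  send(sessionId: string, message: string): void {
    this.substrate.write(sessionId, `REPORT ${message} evidence:${message}-log`);
  }
  capture(sessionId: string): string {
    return this.substrate.read(sessionId);
  }
  interrupt(sessionId: string): void {
    this.substrate.stop(sessionId);
  }
  parseReport(raw: string): WorkerReport | null {
    const match = /^REPORT (\S+) (\S+)$/m.exec(raw);
    if (match === null) return null;
    const [, taskId = "", refs = ""] = match;
    return { taskId, evidenceRefs: refs.split(",") };
  }
}
```

Because the adapter depends only on the `ExecutionSubstrate` interface, a tmux-backed or process-backed substrate could be swapped in later without changing the adapter code.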
Step 3 — Finish DAG phase 1: task graph and readiness (#194)
Implement or tighten:
TaskGraph and RuntimeTask schemas with graph id/version and inspectable task metadata;
Exit gate: 5-task fixture with two parallel branches passes validation and readiness tests.
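The validation and readiness checks named in the exit gate can be sketched as below, using a 5-task fixture with two parallel branches. The task `state` values, `validateGraph`, `readyTaskIds`, and the fixture's task ids are hypothetical; only `TaskGraph`, `RuntimeTask`, and `dependsOn` come from the roadmap.

```typescript
// Hypothetical shapes: only TaskGraph, RuntimeTask, and dependsOn are roadmap terms.
interface RuntimeTask {
  id: string;
  dependsOn: string[];
  state: "pending" | "completed"; // assumed task states for this sketch
}

interface TaskGraph {
  graphId: string;
  version: number;
  tasks: RuntimeTask[];
}

// Returns a list of validation errors; an empty list means the graph is usable.
function validateGraph(graph: TaskGraph): string[] {
  const errors: string[] = [];
  const ids = new Set<string>();
  for (const task of graph.tasks) {
    if (ids.has(task.id)) errors.push(`duplicate task id: ${task.id}`);
    ids.add(task.id);
  }
  for (const task of graph.tasks) {
    for (const dep of task.dependsOn) {
      if (!ids.has(dep)) errors.push(`${task.id} depends on unknown task ${dep}`);
    }
  }
  if (errors.length > 0) return errors;

  // Kahn's algorithm: tasks never reached from the roots sit on a cycle.
  const waitingOn = new Map<string, Set<string>>();
  for (const task of graph.tasks) waitingOn.set(task.id, new Set(task.dependsOn));
  const queue = graph.tasks.filter((t) => t.dependsOn.length === 0).map((t) => t.id);
  let visited = 0;
  while (queue.length > 0) {
    const current = queue.shift() as string;
    visited += 1;
    waitingOn.forEach((deps, taskId) => {
      if (deps.delete(current) && deps.size === 0) queue.push(taskId);
    });
  }
  if (visited !== graph.tasks.length) errors.push("dependency cycle detected");
  return errors;
}

// Readiness snapshot: pending tasks whose dependencies are all completed.
function readyTaskIds(graph: TaskGraph): string[] {
  const done = new Set(graph.tasks.filter((t) => t.state === "completed").map((t) => t.id));
  return graph.tasks
    .filter((t) => t.state === "pending" && t.dependsOn.every((dep) => done.has(dep)))
    .map((t) => t.id);
}

// Fixture: setup -> (left, right) -> merge -> report.
const fixture: TaskGraph = {
  graphId: "fixture-5",
  version: 1,
  tasks: [
    { id: "setup", dependsOn: [], state: "pending" },
    { id: "left", dependsOn: ["setup"], state: "pending" },
    { id: "right", dependsOn: ["setup"], state: "pending" },
    { id: "merge", dependsOn: ["left", "right"], state: "pending" },
    { id: "report", dependsOn: ["merge"], state: "pending" },
  ],
};
```

With this fixture only `setup` is ready at first; once `setup` is completed, `left` and `right` become ready together, which is the parallel step the later claim/lease work has to protect.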
Step 4 — Finish DAG phase 2: claim, lease, and stale ownership (#197)
Implement or tighten:
ClaimLease as an operational index or explicitly justified task-local representation;
Exit gate: duplicate claim race, expired lease, and stale recovery tests pass.
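The three exit-gate cases (duplicate claim, expired lease, stale recovery) can be exercised against a small lease table like the one below. This is a sketch under stated assumptions: the `ClaimLease` fields, `LeaseTable`, and the choice that an expired lease keeps blocking new claims until an explicit recovery step releases it are all hypothetical, picked to match the "recovery is explicit" discipline rather than taken from #197.

```typescript
// Hypothetical fields; times are epoch milliseconds passed in by the caller.
interface ClaimLease {
  taskId: string;
  workerId: string;
  leaseExpiresAt: number;
}

class LeaseTable {
  private readonly leases = new Map<string, ClaimLease>();

  // Returns the lease on success, or null when the claim is denied.
  // Assumption: an expired lease still blocks claims until recoverStale runs.
  claim(taskId: string, workerId: string, nowMs: number, ttlMs: number): ClaimLease | null {
    const current = this.leases.get(taskId);
    if (current !== undefined) {
      const live = current.leaseExpiresAt > nowMs;
      const sameWorker = current.workerId === workerId;
      if (!(live && sameWorker)) return null; // held by another worker, or awaiting recovery
    }
    const lease: ClaimLease = { taskId, workerId, leaseExpiresAt: nowMs + ttlMs };
    this.leases.set(taskId, lease);
    return lease;
  }

  // Explicit, inspectable recovery: releases expired leases and reports them.
  recoverStale(nowMs: number): ClaimLease[] {
    const stale: ClaimLease[] = [];
    this.leases.forEach((lease) => {
      if (lease.leaseExpiresAt <= nowMs) stale.push(lease);
    });
    for (const lease of stale) this.leases.delete(lease.taskId);
    return stale;
  }

  holder(taskId: string): string | undefined {
    const lease = this.leases.get(taskId);
    return lease === undefined ? undefined : lease.workerId;
  }
}
```

Passing `nowMs` in explicitly keeps the table deterministic in tests; a durable implementation would also need an atomic compare-and-set on the stored lease to settle a true race between two workers.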
Step 5 — Finish DAG phase 3: worker execution MVP (#196)
This is the MVP completion boundary.
Implement:
Exit gate: smoke run executes at least two independent tasks in parallel through fake workers or a documented local substitute and completes only with evidence refs.
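The "completes only with evidence refs" condition could be checked with something like the following sketch. `WorkerReportSummary`, `CompletionDecision`, `decideCompletion`, and `runCompleted` are hypothetical names, and requiring the reporter to hold the task's lease is an assumption layered on top of the roadmap's claim/lease rule.

```typescript
// Hypothetical shapes; only EvidenceRef is a canonical roadmap term.
type EvidenceRef = string;

interface WorkerReportSummary {
  taskId: string;
  workerId: string;
  evidenceRefs: EvidenceRef[];
}

interface CompletionDecision {
  taskId: string;
  completed: boolean;
  reason: string;
}

// A report completes a task only if it comes from the current lease holder
// and carries at least one evidence ref. The agent's claim alone is not enough.
function decideCompletion(
  report: WorkerReportSummary,
  leaseHolder: string | undefined
): CompletionDecision {
  if (leaseHolder === undefined || leaseHolder !== report.workerId) {
    return { taskId: report.taskId, completed: false, reason: "reporter does not hold the lease" };
  }
  if (report.evidenceRefs.length === 0) {
    return { taskId: report.taskId, completed: false, reason: "no evidence refs" };
  }
  return { taskId: report.taskId, completed: true, reason: "evidence attached" };
}

// Run-level check: every task needs a positive decision.
function runCompleted(taskIds: string[], decisions: CompletionDecision[]): boolean {
  return taskIds.every((id) => decisions.some((d) => d.taskId === id && d.completed));
}
```

A smoke run with two fake workers would feed each structured report through `decideCompletion` and treat the run as finished only when `runCompleted` holds for both task ids.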
Step 6 — Define verification matrix before runtime readiness claims (#191)
After Steps 1-5 are implemented enough to be concrete, define the matrix that must pass before changing defaults or claiming runtime readiness:
.ilchul.
Exit gate: every MVP invariant in #167 maps to at least one unit, fixture, integration, or smoke test category.
Step 7 — Add shallow-first learning and policy records (#189 + #187)
Only after #196 produces real execution evidence:
Exit gate: prediction-vs-actual records can be emitted from MVP smoke evidence without changing future behavior automatically.
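A record-only version of this step might look like the sketch below: it computes prediction-vs-actual deltas and summarizes them as advisory hints, and nothing in it selects or changes a policy. The `RewardRecord` and `PolicyHint` fields, the scalar `predicted`/`actual` score, and the policy ids are assumptions for illustration.

```typescript
// Hypothetical fields; the roadmap fixes only the names RewardRecord and PolicyHint.
interface RewardRecord {
  runId: string;
  policyId: string;
  predicted: number; // assumed scalar score predicted before dispatch
  actual: number;    // the same score measured after the run closed
  delta: number;     // actual - predicted
}

interface PolicyHint {
  policyId: string;
  advisory: true;    // hints never change policy by themselves
  meanDelta: number;
  sampleCount: number;
}

function recordReward(runId: string, policyId: string, predicted: number, actual: number): RewardRecord {
  return { runId, policyId, predicted, actual, delta: actual - predicted };
}

// Groups records by policy and reports the mean delta per policy.
function deriveHints(records: RewardRecord[]): PolicyHint[] {
  const totals = new Map<string, { sum: number; count: number }>();
  for (const record of records) {
    const current = totals.get(record.policyId) ?? { sum: 0, count: 0 };
    totals.set(record.policyId, { sum: current.sum + record.delta, count: current.count + 1 });
  }
  const hints: PolicyHint[] = [];
  totals.forEach((total, policyId) => {
    hints.push({ policyId, advisory: true, meanDelta: total.sum / total.count, sampleCount: total.count });
  });
  return hints;
}
```

Any actual strategy change would still have to be recorded as a PolicySelection event; the hint is an input to that decision, not a replacement for it.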
Step 8 — Post-MVP integration and repair convergence (#190 + #195)
Do not block the MVP on this step.
Implement after the worker/evidence MVP is proven:
Exit gate: clean integration and conflict repair fixtures both produce sealed evidence.
Development rule:
Conceptual dependency:
Runtime Discipline
Ilchul should follow OMX-style discipline in Ilchul terms:
MVP Minimality Rule
The roadmap defines the conceptual contract. MVP implementation should collapse concepts into the smallest runtime surface that preserves:
MVP learning is calibration-ready, not fully self-optimizing. It records predictions, actual outcomes, deltas, and advisory PolicyHint values. It does not automatically mutate policy or objective weights.
Phase count does not imply heavyweight components. The full lifecycle may have eight thin PhaseEngines if they all share one contract and delegate common concerns to shared runtime services.
Minimality rule for phase engines:
commitTransition(patch, event) records durable transition intent before external side effects.
Acceptance Gates
Before runtime implementation becomes authoritative:
.ilchul storage behavior is explicit and does not silently mutate existing .kapi state.
Open Questions
These should become issues only if the need remains after active design tracks mature.
Current Next Actions
Verification
.ilchul storage direction.
Changelog