Kapi is a Pi-native thin harness that exposes a small number of explicit, durable workflow modes.
Kapi does not wrap ordinary Pi work. Ordinary Pi remains ordinary. Kapi activates structured state, artifacts, approvals, workers, verification, and worktree ownership only when the user explicitly enters a Kapi mode.
Core rule:
Kapi exposes durable modes, not workflow phases.
Kapi should be thin by default and structured on demand.
- When no Kapi mode is attached to the current Pi session, Kapi should stay nearly transparent.
- Kapi should not force ordinary Pi work into a workflow.
- Kapi should activate state, artifacts, workers, verification, skill guidance, and visualization only through explicit Kapi modes.
- Kapi should avoid becoming a general session manager, task manager, external orchestration runtime, or replacement Agent OS.
A Kapi feature is only thin if it improves explicit-mode reliability without changing the cost or shape of ordinary Pi work.
Durable mode commands:
/kapi-deep-interview
/kapi-ralph
/kapi-autoresearch
/kapi-integrate
Global support commands:
/kapi-status
/kapi-clear
Mode subcommands:
/kapi-<mode> status
/kapi-<mode> resume <index|slug>
/kapi-<mode> approve <index|slug>
There are no separate dashboard, report, pause, stop, or global resume commands.
Removed commands must be fully removed from the registry. Kapi must not keep aliases, redirects, compatibility shims, or fallback handlers for obsolete workflows.
Removed or absorbed commands:
/kapi-clarify
/kapi-plan
/kapi-execute
/kapi-review
/kapi-tdd
/kapi-ultrawork
/kapi-autopilot
/kapi-ralplan
/kapi-autoresearch-plan
/kapi-autoresearch-loop
/kapi-resume
Deep Interview is the context creation mode.
It gathers:
- user intent;
- constraints;
- preferences;
- success criteria;
- methods;
- decision boundaries;
- open questions.
Deep Interview is not merely embedded inside Ralph or Autoresearch. It is a first-class mode whose approved artifacts may later link into Ralph or Autoresearch.
Flow:
deep-interview
→ approved interview/spec
→ branch decision
→ ralph | autoresearch
Deep Interview behavior should use references/ouroboros as the primary reference for eliciting human values, intent, context, constraints, and success criteria through interview-driven clarification.
Ralph is the governed build mode.
Flow:
deep-interview-linked context
→ ralplan-style consensus planning
→ user approval
→ execute/review loop
→ closeout
Ralplan is no longer a separate command. It becomes Ralph's planning phase.
Planning behavior:
- use thesis/antithesis/synthesis consensus;
- run at most five attempts;
- stop early when consensus is reached;
- present the plan to the user;
- proceed only after approval.
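The planning rules above can be sketched as a bounded loop. This is a minimal illustration, not Kapi's implementation: `Verdict`, `PlanAttempt`, `PlanningResult`, and `runRound` are hypothetical names, and what happens when five attempts end without consensus is not specified here, so the sketch only reports that outcome to the caller.

```typescript
// Hypothetical shapes: the spec names thesis/antithesis/synthesis but not their data model.
type Verdict = "consensus" | "revise";

interface PlanAttempt {
  attempt: number;
  verdict: Verdict;
}

interface PlanningResult {
  attempts: PlanAttempt[];
  consensusReached: boolean;
}

const MAX_PLANNING_ATTEMPTS = 5;

// `runRound` stands in for one full thesis/antithesis/synthesis round.
function runConsensusPlanning(runRound: (attempt: number) => Verdict): PlanningResult {
  const attempts: PlanAttempt[] = [];
  for (let attempt = 1; attempt <= MAX_PLANNING_ATTEMPTS; attempt++) {
    const verdict = runRound(attempt);
    attempts.push({ attempt, verdict });
    if (verdict === "consensus") {
      // Stop early: later rounds are skipped once consensus is reached.
      return { attempts, consensusReached: true };
    }
  }
  // Attempt budget spent without consensus; the caller decides what to show the user.
  return { attempts, consensusReached: false };
}
```

The loop only produces a plan candidate. Presenting it and waiting for user approval remain separate steps, so nothing in this function starts execution.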
Parallel execution is not a separate workflow. Former Ultrawork behavior becomes a Ralph execution strategy when the approved plan contains independent tasks with explicit dependency edges and isolation needs.
Autoresearch is the governed metric optimization mode.
Kapi owns the autoresearch engine internally. It does not delegate core loop ownership to pi-autoresearch, but pi-autoresearch remains the behavioral reference.
Flow:
deep-interview-linked context
→ experiment contract
→ user approval
→ autonomous experiment loop
→ finalize/closeout
Kapi Autoresearch owns:
- experiment contract creation;
- benchmark execution;
- checks;
- keep/discard/crash/checks_failed decisions;
- commits;
- reverts;
- resume;
- summaries;
- finalize;
- ledger and artifact management.
Autonomy boundary:
Kapi Autoresearch may run autonomously only inside an explicitly approved experiment boundary. It must not become a general autonomous supervisor for ordinary Pi work.
Keep/discard policy follows pi-autoresearch:
primary metric improved + checks pass → keep/commit
worse or equal → discard/revert
crash → discard/revert
checks_failed → discard/revert
confidence score → advisory only
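The keep/discard policy above reduces to a small pure function. The sketch below assumes a hypothetical `ExperimentRun` shape and a `Direction` flag (the spec does not say whether the primary metric is minimized or maximized), so treat the names as illustrative.

```typescript
// Hypothetical types for illustration; only the decision table comes from the spec.
type Direction = "lower_is_better" | "higher_is_better";
type Decision = "keep" | "discard";

interface ExperimentRun {
  status: "completed" | "crash" | "checks_failed";
  metric: number;      // primary metric; only meaningful when status is "completed"
  confidence?: number; // advisory only: never read by decide()
}

function decide(run: ExperimentRun, baseline: number, direction: Direction): Decision {
  // crash and checks_failed always discard/revert, whatever the metric says.
  if (run.status !== "completed") {
    return "discard";
  }
  const improved =
    direction === "lower_is_better" ? run.metric < baseline : run.metric > baseline;
  // Equal counts as "not improved", so it is discarded.
  return improved ? "keep" : "discard";
}
```

Because `confidence` never enters the decision, a low-confidence improvement is still kept and a high-confidence tie is still discarded, matching the "advisory only" rule.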
Kapi artifact names are Kapi-owned, with explicit reference mapping:
| Kapi Autoresearch | pi-autoresearch reference |
| --- | --- |
| contract.md | autoresearch.md |
| benchmark.sh | autoresearch.sh |
| ledger.jsonl | autoresearch.jsonl |
| ideas.md | autoresearch.ideas.md |
| checks.sh | autoresearch.checks.sh |
Artifact layout:
.ilchul/workflows/autoresearch/<index-slug>/
  contract.md
  benchmark.sh
  ledger.jsonl
  ideas.md
  checks.sh
  verify.md
  decision-report.md
Integrate is the governed branch integration mode.
It exists because branch integration is a cross-workflow safety boundary, not merely a Ralph closeout phase.
Purpose:
Kapi-created workflow branches
→ merge plan
→ artifact/evidence comparison
→ integration worktree
→ verification
→ consensus gate
→ merge to dev
/kapi-integrate may directly perform branch changes, commits, cherry-picks, rebases, and merges under defined conditions.
Constraints:
- only Kapi-owned branches and worktrees;
- target branch is always dev;
- never merge directly to main;
- approved merge plan required before mutation;
- verification must pass;
- source workflow artifacts and evidence must be compared against the merge result;
- dev merge allowed only after consensus approval.
Integration must check source artifacts not merely for presence, but for intent coverage:
- Ralph plan artifacts;
- Autoresearch contract artifacts;
- completion evidence;
- verification evidence;
- excluded scope;
- missed work;
- wrong merges;
- duplicate or conflicting changes.
Required integrate artifacts:
.ilchul/workflows/integrate/<index-slug>/
  state.json
  events.jsonl
  snapshot.json
  merge-plan.md
  integration-report.md
  decision-report.md
  conflict-matrix.md
  verify.md
Before merging to dev, Integrate must run a consensus gate:
thesis = defend merge correctness
antithesis = find missed merges, wrong merges, artifact gaps, conflict risks
synthesis = approve | revise | block
Maximum attempts: 3
If synthesis returns revise, Kapi returns to the narrowest relevant phase:
scope/order issue → merge planning
conflict resolution bug → conflict resolution
verification gap → verification
unclear source intent → artifact inspection / user clarification
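The gate and its revise routing can be sketched as a lookup plus one decision function. All type and field names below are hypothetical. The spec also does not say what happens once the three attempts are used up, so `attempts_exhausted` is an assumption of this sketch.

```typescript
// Hypothetical names; only the routing table and the limit of 3 come from the spec.
type RevisionIssue =
  | "scope_or_order"
  | "conflict_resolution_bug"
  | "verification_gap"
  | "unclear_source_intent";

type ReturnPhase =
  | "merge_planning"
  | "conflict_resolution"
  | "verification"
  | "artifact_inspection_or_user_clarification";

type Synthesis =
  | { verdict: "approve" }
  | { verdict: "block" }
  | { verdict: "revise"; issue: RevisionIssue };

type GateOutcome =
  | { kind: "merge_allowed" }
  | { kind: "blocked" }
  | { kind: "return_to"; phase: ReturnPhase }
  | { kind: "attempts_exhausted" }; // assumption: not defined by the spec

const MAX_GATE_ATTEMPTS = 3;

// Narrowest relevant phase for each kind of revision request.
const REVISION_TARGET: Record<RevisionIssue, ReturnPhase> = {
  scope_or_order: "merge_planning",
  conflict_resolution_bug: "conflict_resolution",
  verification_gap: "verification",
  unclear_source_intent: "artifact_inspection_or_user_clarification",
};

function gateOutcome(attempt: number, synthesis: Synthesis): GateOutcome {
  if (attempt > MAX_GATE_ATTEMPTS) {
    return { kind: "attempts_exhausted" };
  }
  if (synthesis.verdict === "approve") {
    return { kind: "merge_allowed" };
  }
  if (synthesis.verdict === "block") {
    return { kind: "blocked" };
  }
  return { kind: "return_to", phase: REVISION_TARGET[synthesis.issue] };
}
```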
After successful dev merge:
integrate workflow → completed
source workflows → integrated
worktrees → preserved
Worktrees are never deleted automatically. Status/health may list cleanup candidates.
Kapi follows an OMX-inspired command/event/snapshot model.
semantic command
→ validated transition
→ append-only event
→ latest snapshot
Principles:
- commands express semantic requests;
- events record accepted transitions and decisions;
- snapshots are the current truth for status, resume, UI, hooks, and skill injection;
- denied transitions must not append events or mutate snapshots;
- terminal states must not be accidentally reactivated;
- readiness is derived from durable state, not UI inference.
Kapi borrows OMX state-machine semantics, not OMX's full runtime weight.
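A minimal sketch of the command/event/snapshot model follows. The phases, commands, and field names are hypothetical and chosen only to illustrate the principles above: an accepted command appends one event and yields a new snapshot, while a denied command returns the state untouched.

```typescript
// Hypothetical phases and commands, used only to illustrate the model.
type Phase = "interview" | "awaiting_approval" | "completed";
type Command = "submit_spec" | "approve";

interface WorkflowEvent { seq: number; command: Command; from: Phase; to: Phase; }
interface Snapshot { phase: Phase; lastSeq: number; }
interface WorkflowState { events: readonly WorkflowEvent[]; snapshot: Snapshot; }

type DispatchResult =
  | { accepted: true; state: WorkflowState }
  | { accepted: false; state: WorkflowState; reason: string };

// Validated transitions. The terminal phase has no outgoing edges,
// so it cannot be reactivated by accident.
const TRANSITIONS: Record<Phase, Partial<Record<Command, Phase>>> = {
  interview: { submit_spec: "awaiting_approval" },
  awaiting_approval: { approve: "completed" },
  completed: {},
};

function dispatch(state: WorkflowState, command: Command): DispatchResult {
  const from = state.snapshot.phase;
  const to = TRANSITIONS[from][command];
  if (to === undefined) {
    // Denied: same state object back, no event appended, snapshot unchanged.
    return { accepted: false, state, reason: `${command} is not valid in phase ${from}` };
  }
  const event: WorkflowEvent = { seq: state.snapshot.lastSeq + 1, command, from, to };
  return {
    accepted: true,
    state: {
      events: [...state.events, event],            // append-only
      snapshot: { phase: to, lastSeq: event.seq }, // latest snapshot
    },
  };
}
```

Keeping `dispatch` pure means status, resume, and skill injection can all read the returned snapshot without consulting UI state.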
A project may have multiple active or resumable Kapi workflows at the same time. Kapi should track them through a project-scoped active inventory rather than a single active pointer.
Each workflow owns its own artifact directory and, when commit-capable, its own Kapi-created worktree branch. Concurrent workflows are normal; integration correctness is handled by /kapi-integrate rather than by forbidding parallel work.
The project .ilchul state should include a durable active inventory of current workflow attachments, active workflows, resumable workflows, and linked workflow relationships. Status, resume, skill injection, and health checks should read from this inventory and the workflow snapshots, not from ad-hoc UI state.
Workflow state must preserve cross-mode linkage. A Deep Interview that branches into Ralph or Autoresearch, and source workflows that later feed Integrate, should remain connected through durable state fields and events rather than through conversational memory. This cross-mode state design should follow the OMX active-inventory and state-linking references where useful.
/kapi-status is the global inspection surface for current snapshots, event logs, health, active inventory, and cleanup candidates.
Mode-specific status commands show sorted workflow lists for that mode:
/kapi-deep-interview status
/kapi-ralph status
/kapi-autoresearch status
/kapi-integrate status
Status listings sort by the canonical workflow directory name, including the stable zero-padded index prefix.
Mode-specific resume commands may resolve either a workflow index/slug or a linked interview index/slug:
/kapi-ralph resume 002
/kapi-ralph resume auth-session-recovery
/kapi-ralph resume 001-kapi-unified-state-machine
When a linked interview target is used, the canonical resumed target is always the linked Ralph or Autoresearch workflow. Already linked interviews should guide the user to resume the linked workflow instead of creating a new branch decision.
Commands change state. State determines skill injection.
Skill injection must be derived deterministically from the current snapshot:
mode + phase + subphase → skill guidance
Transition commands must not independently choose hidden skill behavior.
Examples:
mode=deep-interview, phase=interview
→ inject deep-interview guidance
mode=ralph, phase=plan
→ inject ralplan guidance
mode=ralph, phase=run
→ inject ralph guidance
mode=ralph, phase=review
→ inject code-review + code-simplifier guidance
mode=autoresearch, phase=contract
→ inject autoresearch contract guidance
mode=autoresearch, phase=run
→ inject autoresearch loop guidance
mode=integrate, phase=consensus
→ inject integration review/consensus guidance
mode=integrate, phase=review
→ inject integration review + code-review + code-simplifier guidance
Review phases should inject both code-review guidance and code-simplifier guidance. Review must first check correctness, regressions, missing tests, artifact alignment, and verification gaps; then it may apply or recommend behavior-preserving simplification when the diff is unnecessarily complex.
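Deterministic skill injection can be sketched as a pure lookup keyed by snapshot fields. The table mirrors the examples above. The skill identifiers, the `SnapshotView` shape, and the key format are assumptions of this sketch, not Kapi's actual names.

```typescript
type Mode = "deep-interview" | "ralph" | "autoresearch" | "integrate";

// Hypothetical read-only view of the fields the snapshot exposes for routing.
interface SnapshotView { mode: Mode; phase: string; subphase?: string; }

// Skill identifiers are illustrative labels for the guidance named in the examples.
const SKILL_TABLE = new Map<string, readonly string[]>([
  ["deep-interview/interview", ["deep-interview"]],
  ["ralph/plan", ["ralplan"]],
  ["ralph/run", ["ralph"]],
  ["ralph/review", ["code-review", "code-simplifier"]],
  ["autoresearch/contract", ["autoresearch-contract"]],
  ["autoresearch/run", ["autoresearch-loop"]],
  ["integrate/consensus", ["integration-consensus"]],
  ["integrate/review", ["integration-review", "code-review", "code-simplifier"]],
]);

// Same snapshot in, same guidance out: no hidden state and no command-side overrides.
function skillsFor(snapshot: SnapshotView): readonly string[] {
  const base = `${snapshot.mode}/${snapshot.phase}`;
  const specific =
    snapshot.subphase !== undefined ? SKILL_TABLE.get(`${base}/${snapshot.subphase}`) : undefined;
  // Fall back from the subphase entry to the phase entry, then to no injection.
  return specific ?? SKILL_TABLE.get(base) ?? [];
}
```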
Kapi should not introduce fine-grained phases unless needed for:
- skill routing;
- recovery;
- validation;
- approval gates;
- evidence;
- user-visible progress.
approve has one command meaning:
approve the current pendingDecision recorded in the workflow snapshot.
The concrete transition is determined by mode, phase, and pendingDecision.
Examples:
pendingDecision=approve_interview_spec
→ /kapi-deep-interview approve 001
pendingDecision=approve_ralph_plan
→ /kapi-ralph approve 002
pendingDecision=approve_experiment_contract
→ /kapi-autoresearch approve 003
pendingDecision=approve_merge_plan
→ /kapi-integrate approve 004
Revision or requested changes happen through normal conversation and are recorded as events.
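The single meaning of approve can be sketched as a resolver that reads only the snapshot. The `pendingDecision` values come from the examples above. The `Snapshot` shape, the `ApproveResult` type, and the returned transition label are hypothetical.

```typescript
type Mode = "deep-interview" | "ralph" | "autoresearch" | "integrate";

type PendingDecision =
  | "approve_interview_spec"
  | "approve_ralph_plan"
  | "approve_experiment_contract"
  | "approve_merge_plan";

// Hypothetical snapshot fields relevant to approval.
interface Snapshot { mode: Mode; phase: string; pendingDecision: PendingDecision | null; }

type ApproveResult =
  | { ok: true; transition: string }
  | { ok: false; reason: string };

// Which mode owns each pending decision, per the examples.
const DECISION_MODE: Record<PendingDecision, Mode> = {
  approve_interview_spec: "deep-interview",
  approve_ralph_plan: "ralph",
  approve_experiment_contract: "autoresearch",
  approve_merge_plan: "integrate",
};

function resolveApprove(commandMode: Mode, snapshot: Snapshot): ApproveResult {
  if (snapshot.mode !== commandMode) {
    return { ok: false, reason: `workflow belongs to ${snapshot.mode}, not ${commandMode}` };
  }
  if (snapshot.pendingDecision === null) {
    return { ok: false, reason: "no pending decision recorded in the snapshot" };
  }
  if (DECISION_MODE[snapshot.pendingDecision] !== snapshot.mode) {
    return { ok: false, reason: "pending decision does not match the workflow mode" };
  }
  // The concrete transition is a function of mode, phase, and pendingDecision only.
  return { ok: true, transition: `${snapshot.mode}:${snapshot.phase}:${snapshot.pendingDecision}` };
}
```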
Kapi does not model pause or stop as server-style background controls. Pi operations stop through ordinary Pi interruption such as Esc / operation aborted.
/kapi-clear only detaches the current Pi session from Kapi mode. It does not cancel, delete, archive, or remove the underlying workflow.
Workflow artifact directories use stable zero-padded project-local indexes plus short English slugs.
Example:
001-kapi-unified-state-machine
002-auth-session-recovery
003-benchmark-engine-migration
Rules:
- index is project-local and stable;
- slug is one to five English keywords;
- slug uses kebab-case;
- slug is human-readable;
- slug is stable once assigned;
- status lists sort by canonical directory name;
- numeric inputs resolve to stable prefix indexes, not transient display rows.
Examples:
/kapi-ralph resume 002
/kapi-ralph resume auth-session-recovery
Collision handling should prefer a more specific meaningful slug before mechanical suffixes.
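The naming and resolution rules can be sketched as two helpers. The three-digit width is inferred from the examples, the slug pattern assumes one kebab-case segment per keyword, and the function names are hypothetical. Collision handling is left out because it requires choosing a more meaningful slug, which is not mechanical.

```typescript
// One to five lowercase kebab-case segments (assumption: one segment per keyword).
const SLUG_PATTERN = /^[a-z0-9]+(-[a-z0-9]+){0,4}$/;

// Zero-pad to three digits, as in the examples (001, 002, ...).
function formatIndex(index: number): string {
  let text = String(index);
  while (text.length < 3) {
    text = "0" + text;
  }
  return text;
}

function directoryName(index: number, slug: string): string {
  if (!Number.isInteger(index) || index < 1) {
    throw new Error(`invalid workflow index: ${index}`);
  }
  if (!SLUG_PATTERN.test(slug)) {
    throw new Error(`invalid workflow slug: ${slug}`);
  }
  return `${formatIndex(index)}-${slug}`;
}

// Resolve user input against canonical directory names, never against display rows.
function resolveTarget(input: string, directories: readonly string[]): string | undefined {
  const sorted = [...directories].sort(); // canonical order: zero-padded prefix sorts correctly
  if (/^\d+$/.test(input)) {
    const prefix = formatIndex(Number(input)) + "-";
    return sorted.find((dir) => dir.startsWith(prefix));
  }
  return sorted.find((dir) => dir === input || dir.slice(dir.indexOf("-") + 1) === input);
}
```

With this shape, `resume 002`, `resume auth-session-recovery`, and the full directory name all resolve to the same workflow.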
Approved Deep Interview artifacts may link to Ralph or Autoresearch.
When the user runs:
/kapi-ralph <purpose>
/kapi-autoresearch <purpose>
Kapi searches approved, unlinked interviews for similar candidates.
If multiple candidates match, Kapi presents a numbered selection list.
Already linked interviews must not appear as candidates for new branch decisions. Users should resume the linked workflow instead.
Linked interview state records:
- linked mode;
- linked workflow id or slug;
- branch decision event;
- linked status.
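Candidate selection can be sketched as a filter over interview records. The `InterviewRecord` shape and the keyword-overlap test are assumptions: the spec says "similar candidates" without defining similarity, so the matching rule here is only a placeholder.

```typescript
// Hypothetical record shape; `linked` is null until a branch decision is recorded.
interface InterviewRecord {
  id: string; // canonical directory name, e.g. "001-kapi-unified-state-machine"
  approved: boolean;
  linked: { mode: "ralph" | "autoresearch"; workflow: string } | null;
  keywords: readonly string[];
}

function candidateInterviews(
  purpose: string,
  interviews: readonly InterviewRecord[],
): InterviewRecord[] {
  const words = purpose.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0);
  return interviews
    // Only approved, unlinked interviews may feed a new branch decision.
    .filter((i) => i.approved && i.linked === null)
    // Placeholder similarity: at least one shared keyword.
    .filter((i) => i.keywords.some((k) => words.indexOf(k.toLowerCase()) !== -1))
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
}

// Numbered list shown when more than one candidate matches.
function selectionList(candidates: readonly InterviewRecord[]): string[] {
  return candidates.map((c, n) => `${n + 1}. ${c.id}`);
}
```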
Project-specific source of truth lives under the project:
<project>/.ilchul/workflows/
Example:
.ilchul/workflows/deep-interview/001-kapi-unified-state-machine/
.ilchul/workflows/ralph/002-auth-session-recovery/
.ilchul/workflows/autoresearch/003-benchmark-engine-migration/
.ilchul/workflows/integrate/004-auth-test-integration/
Every workflow directory should contain the shared state records unless a mode-specific contract explicitly says otherwise:
state.json
events.jsonl
snapshot.json
Mode-specific artifacts live alongside those shared records in the same workflow directory.
Global Kapi worktrees live under:
~/.ilchul/worktrees/
Kapi-created worktrees are execution surfaces only. Durable workflow state, artifacts, events, evidence, and snapshots remain anchored to the original project's .ilchul/workflows.
Any Kapi mode that may commit, revert, reset, or run autonomous mutation must operate inside a Kapi-created worktree branch.
Kapi must never commit, revert, discard, or reset user changes outside its owned worktree boundary.
Branch names follow Commitizen / Conventional Commit prefixes:
<type>/<mode>-<goal-slug>
Examples:
feat/ralph-unified-state-machine
fix/ralph-target-resolution
perf/autoresearch-quality-budget-loop
refactor/ralph-shared-engine
docs/ralph-goal-md-update
chore/integrate-auth-session-test-speed
Allowed types:
feat | fix | perf | refactor | test | docs | chore | build | ci
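The branch naming rule can be sketched as a builder plus a validator. The allowed types come from the list above. Restricting the mode segment to ralph, autoresearch, and integrate is an assumption based on the examples, and the function names are hypothetical.

```typescript
const ALLOWED_TYPES = [
  "feat", "fix", "perf", "refactor", "test", "docs", "chore", "build", "ci",
] as const;
type BranchType = (typeof ALLOWED_TYPES)[number];

// Assumption: only the modes that appear in the examples create branches.
const BRANCH_MODES = ["ralph", "autoresearch", "integrate"] as const;
type BranchMode = (typeof BRANCH_MODES)[number];

const GOAL_SLUG = /^[a-z0-9]+(-[a-z0-9]+)*$/;

// Matches <type>/<mode>-<goal-slug>.
const BRANCH_PATTERN = new RegExp(
  `^(${ALLOWED_TYPES.join("|")})/(${BRANCH_MODES.join("|")})-[a-z0-9]+(-[a-z0-9]+)*$`,
);

function branchName(type: BranchType, mode: BranchMode, goalSlug: string): string {
  if (!GOAL_SLUG.test(goalSlug)) {
    throw new Error(`goal slug must be kebab-case: ${goalSlug}`);
  }
  return `${type}/${mode}-${goalSlug}`;
}

// Branches such as main or dev, and any unknown prefix, are rejected.
function isKapiBranchName(name: string): boolean {
  return BRANCH_PATTERN.test(name);
}
```

A name check like this is only a first filter. Ownership still has to come from Kapi's own records of the branches and worktrees it created.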
Branch flow:
main → dev → feat/*|fix/*|perf/*|... → dev → human review/QA → main
Kapi may merge to dev through /kapi-integrate. Kapi must never complete dev → main without human review and QA.
Kapi-created workers are part of the shared Kapi substrate, not one-off workflow hacks.
A Kapi worker record should track:
- worker id;
- owning workflow;
- capability kind;
- workspace;
- terminal/session handle if any;
- isolated workspace handle if any;
- current status;
- recent output or notes;
- dispatch history where useful;
- closeout behavior.
Kapi should support worker operations when a mode needs them:
- inspect local capabilities;
- plan worker strategy before creating resources;
- prepare tmux and/or git worktree workers;
- dispatch work to a Kapi-created terminal worker;
- refresh worker status and recent output;
- attach worker output or result summaries to evidence;
- close or mark workers when the owning workflow reaches a terminal state.
Kapi must not manage arbitrary external tmux sessions, worktrees, processes, or subagents it did not create.
Important decisions produce:
decision-report.md
and structured decision events/evidence.
Decision reports should include:
scope
inputs/artifacts inspected
conditions checked
options considered
chosen decision
rationale
verification refs
risks
follow-up
For integration, reports must also include:
source workflow inventory
intent mapping
diff-to-intent coverage
unexpected changes
conflict resolution options
chosen conflict resolution
merge order and rationale
integration verification
final verdict
Core rule:
High-quality integration requires proving that intended work was included, unintended work was excluded or justified, conflicts were resolved with recorded rationale, and integration-level verification passed.
Kapi evidence must remain human-readable, but validation-critical modes should prefer structured evidence when possible.
Every evidence record should support:
- kind;
- summary;
- ref.
Evidence may additionally include:
- command;
- exit code;
- role;
- verdict;
- phase;
- artifact refs;
- worker id;
- timestamp;
- verifier identity or approval source.
Kapi should not treat narrative claims as equivalent to validation evidence when a mode's completion depends on proof.
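The evidence fields above can be sketched as a record type with two checks. The camelCase field names are illustrative renderings of the listed fields. The rule used in `isStructuredProof` (a recorded command with exit code 0) is an assumption about what separates proof from a narrative claim, not a rule stated in this document.

```typescript
interface EvidenceRecord {
  // Required on every record.
  kind: string;
  summary: string;
  ref: string;
  // Optional structured detail.
  command?: string;
  exitCode?: number;
  role?: string;
  verdict?: string;
  phase?: string;
  artifactRefs?: string[];
  workerId?: string;
  timestamp?: string;
  verifier?: string; // verifier identity or approval source
}

const REQUIRED_FIELDS = ["kind", "summary", "ref"] as const;

// Names of required fields that are absent or blank.
function missingRequiredFields(record: Partial<EvidenceRecord>): string[] {
  return REQUIRED_FIELDS.filter((field) => {
    const value = record[field];
    return typeof value !== "string" || value.trim() === "";
  });
}

// Assumed rule: proof means a recorded command that exited with code 0.
function isStructuredProof(record: EvidenceRecord): boolean {
  return record.command !== undefined && record.exitCode === 0;
}
```

Under this sketch a record with only `kind`, `summary`, and `ref` stays valid and human-readable, but it would not satisfy a gate that depends on proof.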
Human command surfaces should remain practical and forgiving.
Kapi commands should support realistic input where practical:
- quoted arguments;
- multi-word summaries;
- multiline artifact content;
- clear errors for malformed input;
- compact help for advanced subcommands.
Agent tools may use stricter structured schemas, but human commands should not require brittle syntax memorization.
Humans should be able to operate Kapi modes without switching from workflow thinking into parser debugging.
Kapi implementation must follow clean architecture.
Principles:
- Domain logic should not depend on Pi APIs, tmux, git, or the filesystem.
- Application use cases should coordinate domain behavior.
- Adapters should implement Pi extension integration, filesystem storage, tmux, git worktree, git branch operations, command execution, and other external capabilities.
- Presentation should own commands, tools, hooks, parameter schemas, parsing, formatting, and visualization.
- Mode implementations should be built on a shared core model rather than duplicating lifecycle logic.
Presentation should adapt Kapi to Pi; it should not become the place where Kapi's state machine, validation policy, or worker model lives.
Refactoring is required when the implementation becomes too difficult to verify as a thin harness, even if behavior still works.
Good refactoring should:
- keep domain behavior pure and application behavior port-driven;
- split by stable responsibilities before introducing abstractions;
- preserve command behavior, artifact formats, state compatibility within the new architecture, and thin-default semantics;
- reduce coupling and duplication without hiding simple Pi extension registration behind a framework;
- add or preserve tests that prove behavior.
Bad refactoring includes:
- centralizing Kapi into a broad runtime manager;
- adding plugin or command frameworks before repeated structure proves they are needed;
- moving workflow policy into Pi presentation modules;
- making ordinary Pi turns pass through new Kapi control layers;
- increasing indirection without improving verification, ownership boundaries, or maintainability.
Refactor to keep Kapi small, inspectable, and behaviorally stable; do not refactor Kapi into the heavy system it is meant to avoid.
Kapi should inspect reference implementations before inventing new internal structures.
References:
- references/oh-my-codex
  - command/event/snapshot state machine;
  - active inventory;
  - cross-mode state linkage;
  - denied transition without mutation;
  - status/snapshot reporting;
  - recovery contracts.
- references/ouroboros
  - deep interview behavior;
  - human values, intent, context, constraints, and success criteria extraction;
  - interview-driven clarification before planning or execution.
- references/*
  - repository structure;
  - command organization;
  - artifact layout;
  - workflow boundaries;
  - test strategy.
- pi-autoresearch
  - experiment loop behavior;
  - benchmark/checks semantics;
  - keep/discard/crash/checks_failed decisions;
  - ledger behavior;
  - confidence scoring;
  - finalize behavior.
Kapi may borrow implementation structure where it reduces ambiguity or duplication, but must not copy unrelated runtime weight or install machinery.
This redesign starts from the new architecture.
Old .kapi layouts and obsolete workflow commands are legacy-only. Existing old .kapi state must not be removed without explicit cleanup authorization.
Kapi must not preserve obsolete workflows as fallback, legacy, compatibility, or shadow paths.
Core rule:
Delete replaced architecture. Do not keep obsolete workflows as hidden fallback paths.
Kapi-owned autonomy is allowed only inside explicitly activated and approved governed modes.
Kapi commit/revert authority is limited to Kapi-created worktrees, Kapi-owned branches, and approved mutable scope.
Kapi workers and worktrees are mode-scoped, inspectable, and evidence-friendly.
Kapi must not manage arbitrary external workers, tmux sessions, processes, branches, or worktrees it did not create.
Core rule:
Kapi keeps ordinary Pi ordinary. Structure appears only when the user explicitly enters Deep Interview, Ralph, Autoresearch, or Integrate.
Kapi validation should prove not only that behavior works, but also that the code remains testable, understandable, low-duplication, and loosely coupled.
The executable local verification source of truth remains:
npm run verify
When an autoresearch-style scoring script is needed for architecture work, bash autoresearch.sh should route to the same verification boundary instead of inventing a divergent gate.
Recommended checks:
- inactive transparency;
- explicit activation only;
- command/event/snapshot transition behavior;
- denied transition without mutation;
- deterministic skill injection;
- project-anchored artifacts;
- Kapi-owned worktree boundaries;
- evidence-backed approval gates;
- integration diff-to-intent verification;
- npm run verify.
Kapi quality reporting should continue to track these maintainability and semantic indicators:
- Code coverage;
- Cyclomatic complexity;
- Duplicated code;
- Code smells;
- Dependency and coupling, including export ... from facade fanout, total module edges, and multi-edge facade counts so support barrels cannot make broad dependencies look artificially thin;
- Semantic Autoresearch consistency, including bridge-term misuse, root autoresearch.* dependency count (autoresearch.md, autoresearch.sh, autoresearch.checks.sh, autoresearch.jsonl, autoresearch.ideas.md, autoresearch.config.json), durable artifact mismatch count, and source-of-truth conflicts;
- pi-autoresearch reference-role coverage, including contract, benchmark, checks, ledger, ideas, keep/discard/crash/checks_failed, metric parsing, and resume/reconstruction semantics;
- Runtime Autoresearch start diagnostics that prove /kapi-autoresearch can start from a clean workspace and create the Kapi-owned durable artifact contract under .ilchul/workflows/autoresearch/;
- Cross-mode runtime readiness probes for Deep Interview, Ralph, and Integrate start contracts;
- Event/snapshot semantic probes that parse events.jsonl, snapshot.json, and state.json instead of accepting filename presence;
- Human command-surface diagnostics for exact durable commands and required mode subcommands (status, resume, approve);
- Separate readiness/blocker diagnostics so runtime, event/snapshot, command-surface, semantic ownership, artifact, and source-of-truth blockers remain visible even when the architecture/maintainability score is high.
Completeness and redesign work should preserve these validation evidence classes while the implementation catches up to the new durable-mode architecture:
- P0 trust: local verification commands pass and are documented;
- P1 governed safety: governed modes cannot proceed past approval or closeout gates without required artifacts and evidence;
- P2 phase fidelity: mode phases enforce the artifact, skill-injection, and approval rules they claim;
- P3 thinness: no-mode behavior stays transparent and Kapi does not create state, artifacts, workers, or blocking hooks without explicit activation;
- P4 operator usability: human commands and agent tools can inspect, approve, resume, detach, and clear with clear feedback;
- P5 maintainability: code quality metrics stay within budget and architecture boundaries remain test-covered.
Kapi is a thin Pi-native harness that exposes four durable modes:
- Deep Interview for context creation;
- Ralph for governed build execution;
- Autoresearch for governed metric optimization;
- Integrate for governed branch integration.
These modes share an OMX-inspired command/event/snapshot state model, deterministic skill injection, project-anchored artifacts, Kapi-owned worktree boundaries, and evidence-backed approvals.