Harness module registry and retirement signals

This document defines a metadata-only registry for RunContract Harness modules and their retirement signals. It is an observational governance ledger, not a runtime plugin framework.

Purpose

The registry helps maintainers see which harness modules exist, what each module owns, how its quality is evaluated, and when it may be replaced or retired. It stays documentation-oriented until a separate implementation issue explicitly grants stronger authority.

Non-authority boundary

The registry must not load modules at runtime, select implementations dynamically, block workflow completion, override workflow validation or human approval, mutate .kapi / .ilchul / GitHub / worker state, or become a plugin manifest by accident.

Runtime behavior remains owned by source modules, workflow definitions, evidence records, validation rules, and adapter surfaces. The registry only records what a supervisor should inspect.

Registry record format

| Field | Meaning |
| --- | --- |
| id | Stable semantic identifier such as intent-parsing, state-tracking, evidence-extraction, quality-scoring, replanning, verifier-assistance, or report-formatting. |
| owner_surface | Owning layer: domain, application, adapter, presentation, docs, or external-adapter. |
| purpose | Short harness responsibility. |
| current_sources | Files, docs, tests, or issue references that embody the module. |
| quality_signals | Signals from docs/runcontract-harness-evaluator.md that indicate whether the module is helping. |
| retirement_signals | Evidence that the module should be merged, replaced, narrowed, or removed in a future scoped change. |
| replacement_notes | Candidate successor or consolidation path. |
| regression_evidence | Checks or review evidence required before changing the module. |
| authority_level | Initial value must be metadata-only; stronger authority needs an approved design issue. |
| last_reviewed | Optional date, issue, or PR reference for the latest registry review. |
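
As a reading aid, the record format can be sketched as a TypeScript type. This is a minimal illustration, not an interface that exists in the repository: the names `OwnerSurface`, `HarnessModuleRecord`, and `intentParsingRecord` are hypothetical, and modelling `owner_surface` as an array is an assumption made so that a module spanning two layers can be recorded. The example values are taken from the intent-parsing row of the initial module map.

```typescript
// Illustrative only: type names are hypothetical; field names follow the record format.
type OwnerSurface =
  | "domain"
  | "application"
  | "adapter"
  | "presentation"
  | "docs"
  | "external-adapter";

interface HarnessModuleRecord {
  id: string;
  // Assumption: an array, because one module may span layers (application / adapter).
  owner_surface: OwnerSurface[];
  purpose: string;
  current_sources: string[];
  quality_signals: string[];
  retirement_signals: string[];
  replacement_notes: string;
  regression_evidence: string[];
  // The only permitted value until a design issue approves stronger authority.
  authority_level: "metadata-only";
  // Optional date, issue, or PR reference.
  last_reviewed?: string;
}

const intentParsingRecord: HarnessModuleRecord = {
  id: "intent-parsing",
  owner_surface: ["presentation"],
  purpose:
    "Parses human commands and agent tool inputs into explicit workflow or support-command requests.",
  current_sources: [
    "src/presentation/parsers.ts",
    "src/presentation/commands.ts",
    "src/presentation/tools.ts",
    "parser/CLI tests",
  ],
  quality_signals: [
    "Objective clarity",
    "Regression protection for /kapi-*, kapi, and support commands",
  ],
  retirement_signals: [
    "Duplicate parsing rules drift across CLI/slash/tools",
    "Parser wording starts carrying adapter authority such as GitHub merge decisions",
  ],
  // Not stated in the module map; left empty rather than guessed.
  replacement_notes: "",
  regression_evidence: [
    "Targeted parser/CLI tests",
    "README command examples aligned with implementation",
  ],
  authority_level: "metadata-only",
};
```

Typing `authority_level` as a single string literal means a record claiming stronger authority would fail type checking, which matches the rule that such a change needs an approved design issue first.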

Initial module map

| id | owner_surface | purpose | quality signals | retirement signals | regression evidence | authority |
| --- | --- | --- | --- | --- | --- | --- |
| intent-parsing | presentation | Parses human commands and agent tool inputs into explicit workflow or support-command requests. Sources: src/presentation/parsers.ts, src/presentation/commands.ts, src/presentation/tools.ts, parser/CLI tests. | Objective clarity; regression protection for /kapi-*, kapi, and support commands. | Duplicate parsing rules drift across CLI/slash/tools; parser wording starts carrying adapter authority such as GitHub merge decisions. | Targeted parser/CLI tests; README command examples aligned with implementation. | metadata-only |
| state-tracking | domain | Maintains lifecycle, phase, artifact-root, active workflow, and worker-boundary state. Sources: src/domain/state-machine.ts, src/domain/workflows.ts, src/domain/workflow-validation.ts. | Evidence integrity; context hygiene so generic state does not absorb GitHub/Discord/review-bot semantics. | A state field becomes presentation-only or duplicates artifact/evidence truth; storage roots change without migration review. | Lifecycle/validation tests; storage migration design review for .kapi/.ilchul behavior. | metadata-only |
| evidence-extraction | domain | Projects artifacts, command records, verifier reviews, and workflow results into evidence expectations. Sources: src/domain/run-contract.ts, src/domain/workflow-validation.ts, evidence tests. | Evidence integrity; verifier independence. | Evidence is accepted by filename, unchecked prose, or stale refs alone; evidence helpers make workflow-specific completion decisions that belong to validation. | Evidence shape and closeout validation tests; negative stale/insufficient evidence cases when behavior changes. | metadata-only |
| quality-scoring | domain | Exposes advisory RunContract quality hints and dimensions without becoming completion authority. Sources: src/domain/run-contract.ts, docs/runcontract-harness-evaluator.md, RunContract tests. | Anti-Goodhart resilience; human override. | Scores become hidden hard blocks; metrics reward green-looking output while artifacts or evidence become less useful. | RunContract pass/warn/fail quality tests; PR statement that advisory quality remains separate from completion authority. | metadata-only |
| replanning | application | Helps workflows decide next steps after validation, blockers, stale state, or quality warnings. Sources: src/application/workflow-service.ts, src/domain/workflows.ts, skills/kapi-workflow/SKILL.md. | Objective clarity; human override for design-sensitive decisions. | Replanning text repeats validation logic without supervisor value; next-step guidance mutates state when it should only advise. | Status/worker-plan tests; design review before advisory text becomes mutation. | metadata-only |
| verifier-assistance | application / adapter | Supports independent verifier or reviewer review without producer self-approval. Sources: src/workers/deep-interview-readiness-worker.ts, src/application/github-run-contract-adapter.ts, docs/kapi-agent-approval-gate.md. | Verifier independence; current-run/head/artifact freshness. | Verifier assistance is treated as merge/completion authority outside its owning adapter; review freshness is summarized away from audit evidence. | Readiness and GitHub adapter tests; current-head freshness checks for PR adapter changes. | metadata-only |
| report-formatting | presentation | Renders compact status, report, and RunContract summaries from already-computed state. Sources: src/presentation/run-contract-format.ts, src/presentation/runctl-formatters.ts, src/presentation/messages.ts. | Artifact usefulness; context hygiene. | Formatting helpers construct JSON payload truth instead of presenting existing data; output becomes too verbose for supervisor scan use. | Direct formatter regression tests; README/help output aligned when wording changes. | metadata-only |
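
A reviewer could check the map for internal consistency with a small helper like the one below. It is a sketch under assumptions: `RegistryEntry` and `lintRegistry` are hypothetical names, the entries are assumed to have been transcribed from the map by hand, and the three rules (unique ids, metadata-only authority, non-empty retirement signals and regression evidence) are one reading of this document rather than an existing check. The helper only returns findings for a human to read; it does not gate anything.

```typescript
// Hypothetical review aid: reports findings, never blocks a workflow.
interface RegistryEntry {
  id: string;
  authority_level: string;
  retirement_signals: string[];
  regression_evidence: string[];
}

function lintRegistry(entries: RegistryEntry[]): string[] {
  const problems: string[] = [];
  const seenIds: string[] = [];
  for (const entry of entries) {
    // Ids are meant to be stable semantic identifiers, so duplicates are suspicious.
    if (seenIds.indexOf(entry.id) !== -1) {
      problems.push("duplicate id: " + entry.id);
    }
    seenIds.push(entry.id);
    // Every initial record must stay metadata-only.
    if (entry.authority_level !== "metadata-only") {
      problems.push(
        entry.id + ": authority_level is " + entry.authority_level + ", expected metadata-only"
      );
    }
    if (entry.retirement_signals.length === 0) {
      problems.push(entry.id + ": no retirement signals recorded");
    }
    if (entry.regression_evidence.length === 0) {
      problems.push(entry.id + ": no regression evidence recorded");
    }
  }
  return problems;
}

const sampleFindings = lintRegistry([
  {
    id: "quality-scoring",
    authority_level: "metadata-only",
    retirement_signals: ["Scores become hidden hard blocks"],
    regression_evidence: ["RunContract pass/warn/fail quality tests"],
  },
]);
console.log(sampleFindings.length === 0 ? "no findings" : sampleFindings.join("\n"));
```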

Retirement review process

A module retirement or replacement proposal should include:

  1. module id and owner surface;
  2. observed retirement signal;
  3. affected evaluator checklist dimensions;
  4. proposed replacement or consolidation path;
  5. compatibility and rollback notes;
  6. exact regression evidence required before merge;
  7. explicit authority classification: metadata-only, advisory, or behavior-changing.

Changes that the evaluator checklist classifies as Level 2 or Level 3 need a separate design issue and human approval before implementation. Runtime loading, automatic retirement, storage-root flips, and hard gating are Level 3 by default.
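
The proposal contents and the level rule can be sketched as a small data model. Everything named here is hypothetical (`RetirementProposal`, `ChangeKind`, `effectiveLevel`, `needsDesignIssueAndApproval`). The sketch assumes checklist levels are numbered 1 to 3 and that the proposer declares a level; the level definitions themselves live in the evaluator checklist and are not reproduced here.

```typescript
type AuthorityClassification = "metadata-only" | "advisory" | "behavior-changing";

// Assumption: a coarse tag for what the change does; "documentation" covers everything else here.
type ChangeKind =
  | "runtime-loading"
  | "automatic-retirement"
  | "storage-root-flip"
  | "hard-gating"
  | "documentation";

interface RetirementProposal {
  moduleId: string;
  ownerSurface: string;
  retirementSignal: string;
  evaluatorDimensions: string[];
  replacementPath: string;
  compatibilityAndRollback: string;
  regressionEvidence: string[];
  authority: AuthorityClassification;
  changeKinds: ChangeKind[];
  declaredLevel: 1 | 2 | 3; // level claimed by the proposer (assumed numbering)
}

// Change kinds this document treats as Level 3 by default.
const LEVEL_3_BY_DEFAULT: ChangeKind[] = [
  "runtime-loading",
  "automatic-retirement",
  "storage-root-flip",
  "hard-gating",
];

function effectiveLevel(proposal: RetirementProposal): 1 | 2 | 3 {
  const forcesLevel3 = proposal.changeKinds.some(
    (kind) => LEVEL_3_BY_DEFAULT.indexOf(kind) !== -1
  );
  return forcesLevel3 ? 3 : proposal.declaredLevel;
}

// Level 2 and Level 3 both need a separate design issue and human approval.
function needsDesignIssueAndApproval(proposal: RetirementProposal): boolean {
  return effectiveLevel(proposal) >= 2;
}
```

Deriving the effective level from the change kinds, rather than trusting the declared level alone, keeps a storage-root flip from being filed as a low-level change by mistake.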

Verification checklist

For metadata-only registry updates:

  • The update changes documentation only.
  • The record references the RunContract evaluator checklist where quality or retirement is discussed.
  • The record does not introduce runtime plugin/loading behavior.
  • The record does not make advisory signals completion authority.
  • README layout links remain current.
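
The first checklist item lends itself to a mechanical pre-check. The sketch below assumes the reviewer already has the list of changed paths, and it assumes that "documentation" means files under `docs/` or any file ending in `.md`. Both the rule and the function names are hypothetical, and the rule is deliberately loose: a Markdown file such as skills/kapi-workflow/SKILL.md would pass it even though a reviewer might treat it differently.

```typescript
// Assumption: documentation is anything under docs/ or any Markdown file.
function isDocumentationPath(path: string): boolean {
  return /^docs\//.test(path) || /\.md$/i.test(path);
}

// Returns the changed paths that fall outside the documentation rule above.
function nonDocumentationPaths(changedPaths: string[]): string[] {
  return changedPaths.filter((p) => !isDocumentationPath(p));
}

const offendingPaths = nonDocumentationPaths([
  "docs/runcontract-harness-evaluator.md",
  "README.md",
  "src/domain/run-contract.ts",
]);
console.log(offendingPaths); // lists src/domain/run-contract.ts
```

An empty result supports the "documentation only" item; the remaining items still need human review.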

For future behavior-changing module work, add targeted tests for the owning source surface and preserve backward compatibility unless the linked migration issue explicitly approves a break.