
feat: add task claim leases #233

Merged: devkade merged 1 commit into dev from feat/issue-197-claim-lease on May 18, 2026

devkade (Owner) commented May 18, 2026

Summary

  • Adds phase-2 claim/lease domain primitives for TaskGraph execution.
  • Adds deterministic lease token generation, active-claim validation, renew/release recovery paths, and claim/lease runtime event records.
  • Extends task-graph regression coverage for non-ready claim rejection, duplicate ownership, expiry, explicit recovery, and README runtime-boundary alignment.

Linked issue

Closes #197

Problem

Issue #197 asks for DAG runtime phase 2: ready tasks must be claimable by exactly one worker, active leases must gate normal completion, expired leases must become recoverable only through an explicit path, and claim/lease transitions need event records.

Before this PR, src/domain/task-graph.ts had a basic TaskClaim shape and a completion token check, but the lease lifecycle was incomplete. Lease tokens could only be supplied by the caller, renew/release/recovery operations were absent, expired claims could not be represented as first-class recovery transitions, and claim/lease events were not exposed.

Options considered

  1. Keep only the existing claimTask() / completeTask() checks and document recovery as future work.
    • Pros: smallest possible diff.
    • Cons: leaves most phase-2 acceptance criteria unmet.
  2. Add a focused domain-only claim/lease lifecycle to TaskGraph.
    • Pros: satisfies phase-2 semantics without coupling the domain layer to workers, tmux, persistence, or CLI presentation.
    • Cons: durable event-envelope persistence still remains for later adapter slices.
  3. Wire claims into worker dispatch immediately.
    • Pros: closer to live runtime behavior.
    • Cons: violates the issue’s explicit non-goals: no automatic worker launch and no tmux/worktree dispatch yet.

Selected approach

Selected option: 2 — focused domain-only claim/lease lifecycle.

Why this one: phase 2 is about the execution contract, not live worker dispatch. Keeping it in the domain layer gives later worker/runtime phases a stable surface for ownership semantics while preserving the phase boundary.

Risks/trade-offs: the token generator is deterministic and domain-local, not a cryptographic secret generator. Callers that need stronger entropy can still provide an explicit token. Event records are lightweight domain records, not persisted RuntimeEventEnvelopes.
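The trade-off above can be illustrated with a minimal sketch. The PR does not show the real createClaimLeaseToken() signature, so the input fields, the FNV-1a hash, the "lease-" prefix, and the resolveLeaseToken() helper are all assumptions made for illustration.

```typescript
// Hypothetical input shape; the real createClaimLeaseToken() signature is not shown in this PR.
interface LeaseTokenInput {
  taskId: string;
  workerId: string;
  attempt: number;
  claimedAt: string; // ISO-8601 timestamp supplied by the caller, never read from the clock
}

// FNV-1a 32-bit hash: deterministic and dependency-free, but NOT cryptographically secure.
function fnv1a32(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// The same input always yields the same token, which keeps tests reproducible.
function createClaimLeaseToken(input: LeaseTokenInput): string {
  const material = [input.taskId, input.workerId, String(input.attempt), input.claimedAt].join("|");
  return `lease-${fnv1a32(material).toString(16).padStart(8, "0")}`;
}

// Hypothetical helper: callers that need stronger entropy pass their own opaque token.
function resolveLeaseToken(input: LeaseTokenInput, explicitToken?: string): string {
  return explicitToken ?? createClaimLeaseToken(input);
}
```

Because the token is derived only from its inputs, anyone who knows those inputs can reproduce it, which is why the PR treats it as an ownership marker rather than a secret.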

Implementation by file/surface

  • src/domain/task-graph.ts
    • Adds ClaimLease and optional claim metadata (claimedAt, recoveredFromToken).
    • Adds createClaimLease() and deterministic createClaimLeaseToken().
    • Updates claimTask() to validate graph shape, claim only ready tasks, and reject any existing claim unless the explicit recovery path is used.
    • Adds renewClaimLease(), releaseClaim(), and recoverExpiredClaim().
    • Keeps completeTask() gated by a matching unexpired claim token and evidence refs.
    • Adds claim/lease runtime event records for created, renewed, expired, released, and recovered transitions.
  • test/task-graph.test.ts
    • Covers token generation, claim success, non-ready claim rejection, duplicate active ownership, renew, release, expired completion blocking, explicit stale recovery, and recovery events.
  • README.md
    • Aligns the graph-execution boundary with claim/lease ownership and stale-claim recovery.

Why this fixes it

Ready tasks now move into claimed ownership with a lease token. Pending, blocked, and completed tasks reject claims; duplicate claims reject; completion without a matching unexpired token rejects; expired leases block normal completion; and stale work can be recovered only via recoverExpiredClaim(). The domain also exposes transition event records for the phase-2 claim lifecycle.
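The claim and completion gates described above can be sketched as follows. This is a simplified, single-task illustration, not the code in src/domain/task-graph.ts: the real functions operate on a TaskGraph, and the TaskNode shape, the "claimed" status name, and the numeric timestamps are assumptions.

```typescript
// Hypothetical status set; the PR names pending, blocked, ready, and completed tasks.
type TaskStatus = "pending" | "blocked" | "ready" | "claimed" | "completed";

interface TaskClaim {
  workerId: string;
  token: string;
  expiresAt: number; // epoch milliseconds (assumed representation)
}

interface TaskNode {
  id: string;
  status: TaskStatus;
  claim?: TaskClaim;
  evidenceRefs?: string[];
}

// Normal claim flow: only unclaimed, ready tasks can be claimed.
function claimTask(task: TaskNode, claim: TaskClaim, now: number): TaskNode {
  if (task.claim !== undefined) {
    throw new Error(`task ${task.id} already has a claim; use the explicit recovery path`);
  }
  if (task.status !== "ready") {
    throw new Error(`task ${task.id} is ${task.status}, not ready`);
  }
  if (claim.expiresAt <= now) {
    throw new Error("lease expiry must be in the future");
  }
  return { ...task, status: "claimed", claim };
}

// Completion requires a matching, unexpired token plus at least one evidence ref.
function completeTask(task: TaskNode, token: string, evidenceRefs: string[], now: number): TaskNode {
  const claim = task.claim;
  if (claim === undefined || claim.token !== token) {
    throw new Error("claim token mismatch");
  }
  if (claim.expiresAt <= now) {
    throw new Error("lease expired; recover it explicitly");
  }
  if (evidenceRefs.length === 0) {
    throw new Error("completion requires evidence refs");
  }
  return { ...task, status: "completed", evidenceRefs };
}
```

Both functions return a new task object instead of mutating the input, which keeps the sketch free of side effects; whether the real domain code is immutable in the same way is not stated in the PR.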

QA / Verification

  • npm ci — pass; installed local dependencies because tsx was absent in this fresh worktree.
  • npm test -- test/task-graph.test.ts — pass; package script ran the full test/*.test.ts suite plus the explicit file argument (532 tests, 521 pass, 11 skipped).
  • npm run check — pass.
  • npm run check:unused — pass.
  • npm run quality:budgets — pass with configured non-failing code_smells warning.
  • npm run verify — pass (532 tests, 521 pass, 11 skipped; then check, unused check, quality budgets).
  • git diff --check — pass.

Anomalies observed

  • npm test -- test/task-graph.test.ts does not run only the named file because package.json defines tsx --test test/*.test.ts; it executed the full test suite.
  • Node emitted existing [DEP0205] module.register() deprecation warnings during tests.
  • An intermediate test run failed because the quality-budget tuned-env tests saw code_smells=61; I reduced one source-level smell from the new code and reran the gates successfully with code_smells=60.
  • npm run quality:budgets retains the existing warning: code_smells=60 vs target <=20; the command exits 0 under the configured budget policy.

Risks / Follow-up

  • Durable event-envelope persistence and worker-dispatch integration are intentionally not included in this phase-2 domain slice.
  • Token generation is deterministic for testability and domain purity; production callers may pass externally generated opaque tokens if needed.

kapi-agent review expectations and current-head merge gate

  • Current head: e9026e46dc51420052629d974823363ff6e7237a (full SHA, as recorded in PR metadata).
  • Changed-line count: 179 additions+deletions against origin/dev (168 insertions, 11 deletions), under the review-size gate.
  • kapi-agent should verify claim lifecycle semantics, explicit stale recovery, event records, and README alignment before merge.

devkade (Owner, Author) commented May 18, 2026

@kapi-agent review

Revision explanation for current head e9026e46dc51420052629d974823363ff6e7237a:

What changed:

  • Added phase-2 TaskGraph claim/lease lifecycle primitives: lease creation, deterministic token generation, claim, renew, release, stale recovery, and matching runtime event records.
  • Kept worker dispatch and durable event-envelope persistence out of scope per issue non-goals.
  • Added regression coverage for ready-only claims, duplicate ownership rejection, completion token/expiry gates, explicit expired-lease recovery, and claim/lease event records.
  • Updated README graph-execution boundary wording for claim/lease ownership and stale-claim recovery.

Why this closes the issue:

  • Ready tasks can be claimed by exactly one worker.
  • Pending/blocked/completed tasks reject claims.
  • Completion requires a matching unexpired claim token and evidence refs.
  • Expired leases block normal completion and recover only through recoverExpiredClaim().
  • Claim-created, lease-renewed, lease-expired, claim-released, and claim-recovered records are exposed as phase-2 domain events.
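One way to picture the five transition records is a discriminated union with a small builder. Only the "claim.recovered" type string and the recoveredFromToken field appear in this PR's review; the other type strings, the remaining field names, and the recordTransitions() helper are assumptions for illustration.

```typescript
// Only "claim.recovered" appears verbatim in the review; the other type strings are assumed.
type ClaimLeaseEventType =
  | "claim.created"
  | "lease.renewed"
  | "lease.expired"
  | "claim.released"
  | "claim.recovered";

// Hypothetical lease snapshot; field names other than recoveredFromToken are illustrative.
interface LeaseSnapshot {
  taskId: string;
  workerId: string;
  token: string;
  expiresAt: string;
  recoveredFromToken?: string;
}

// Lightweight domain record, not a persisted RuntimeEventEnvelope.
interface ClaimLeaseEvent {
  type: ClaimLeaseEventType;
  taskId: string;
  workerId: string;
  token: string;
  expiresAt: string;
  occurredAt: string;
  recoveredFromToken?: string;
}

function toClaimLeaseEvent(
  type: ClaimLeaseEventType,
  lease: LeaseSnapshot,
  occurredAt: string,
): ClaimLeaseEvent {
  const event: ClaimLeaseEvent = {
    type,
    taskId: lease.taskId,
    workerId: lease.workerId,
    token: lease.token,
    expiresAt: lease.expiresAt,
    occurredAt,
  };
  // Lineage is only attached to recovery events.
  if (type === "claim.recovered" && lease.recoveredFromToken !== undefined) {
    event.recoveredFromToken = lease.recoveredFromToken;
  }
  return event;
}

// Nothing is persisted in this phase, so the caller decides where the records go.
function recordTransitions(
  lease: LeaseSnapshot,
  steps: Array<[ClaimLeaseEventType, string]>,
): ClaimLeaseEvent[] {
  return steps.map(([type, occurredAt]) => toClaimLeaseEvent(type, lease, occurredAt));
}
```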

Verification:

  • npm ci — pass.
  • npm test -- test/task-graph.test.ts — pass; package script ran full suite (532 tests, 521 pass, 11 skipped).
  • npm run check — pass.
  • npm run check:unused — pass.
  • npm run quality:budgets — pass with configured non-failing code_smells warning.
  • npm run verify — pass.
  • git diff --check — pass.

kapi-agent (Bot) commented May 18, 2026

kapi-agent review completed

The formal GitHub PR review and required check were submitted separately. This request comment is kept concise to avoid duplicating the full review body.

kapi-agent request lifecycle:

  • Status: COMPLETED
  • Head: e9026e46dc51
  • Formal review event: APPROVE
  • Review API status: 200
  • Required check API status: 201

kapi-agent (Bot) left a comment

kapi-agent review

Verdict: APPROVE

Final approval summary

Review journey

This PR adds phase-2 TaskGraph claim/lease lifecycle support. The semantic scope is bounded to the domain layer plus focused tests and README boundary wording: no worker dispatch, tmux/worktree runtime wiring, or durable event-envelope persistence is included.

There was no prior kapi-agent review on this head. The author provided a current-head revision explanation, and the change stays within the configured size threshold.

What changed

  • src/domain/task-graph.ts
    • Adds ClaimLease, deterministic lease token creation, claim metadata, renew/release/recover operations, active-claim gating, and claim/lease runtime event shapes.
  • test/task-graph.test.ts
    • Adds regression coverage for deterministic tokens, non-ready claim rejection, duplicate claim rejection, renew/release, expired completion blocking, explicit recovery, and event records.
  • README.md
    • Updates graph-execution boundary wording to include claim/lease ownership, stale-claim recovery, and claim events.

Why this is correct

The implementation enforces the key ownership contract: only ready tasks can be claimed, existing claims cannot be overwritten by normal claim flow, completion requires a matching unexpired claim token, and expired work is recoverable only through recoverExpiredClaim(). The recovery path preserves lineage through recoveredFromToken and increments attempts, which supports later worker/runtime phases without coupling this domain slice to dispatch or persistence.
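The lineage and attempt bookkeeping on the recovery path might look like the sketch below. The recoveredFromToken and claimedAt names come from the PR; the attempts field location, the numeric timestamps, and the NextLease parameter are assumptions.

```typescript
// Field names claimedAt and recoveredFromToken come from the PR; the rest is assumed.
interface ClaimLease {
  workerId: string;
  token: string;
  claimedAt: number; // epoch milliseconds (assumed representation)
  expiresAt: number;
  recoveredFromToken?: string;
}

interface ClaimedTask {
  id: string;
  attempts: number;
  claim: ClaimLease;
}

// Hypothetical parameter object describing the replacement lease.
interface NextLease {
  workerId: string;
  token: string;
  expiresAt: number;
}

// Explicit recovery: allowed only once the current lease has expired.
function recoverExpiredClaim(task: ClaimedTask, next: NextLease, now: number): ClaimedTask {
  if (task.claim.expiresAt > now) {
    throw new Error(`task ${task.id} still has an active lease`);
  }
  if (next.expiresAt <= now) {
    throw new Error("replacement lease expiry must be after now");
  }
  return {
    ...task,
    attempts: task.attempts + 1, // each recovery counts as a new attempt
    claim: {
      workerId: next.workerId,
      token: next.token,
      claimedAt: now,
      expiresAt: next.expiresAt,
      recoveredFromToken: task.claim.token, // preserves lineage to the stale claim
    },
  };
}
```

Keeping the stale token on the new claim lets a later phase trace which abandoned attempt a recovered task came from without needing a persisted event log.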

Evidence

  • Verifier gate: PASS — npm ci && npm run verify exited 0.
  • Size gate: PASS — 179 changed lines, below the 200-line semantic review threshold.
  • Revision-explanation status: not required, but found.
  • Ilchul harness gate: PASS — STANDARD risk, no blocking findings.
  • Inspected files: src/domain/task-graph.ts, test/task-graph.test.ts, README.md.

Remaining risks and approval rationale

Remaining risks are bounded and documented: durable event-envelope persistence and live worker dispatch are intentionally deferred, and deterministic tokens are not cryptographic secrets. These are acceptable for this domain-only phase because callers can provide explicit opaque tokens later, and the current contract is covered by focused tests plus the full verification suite.

Blocking issues

None.

Warnings / risks

  • src/domain/task-graph.ts: createClaimLeaseRuntimeEvent("claim.recovered", lease) can emit recoveredFromToken: "" if called with a non-recovery lease. This is not blocking for the current PR because recoverExpiredClaim() creates the expected metadata and tests cover the normal path, but a future hardening pass could fail closed for malformed recovered events.
  • Durable persistence and worker dispatch are still out of scope, so runtime consumers must not assume these domain events are automatically persisted or emitted by workers yet.
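A fail-closed variant of the recovered-event builder, as hinted at in the first warning, could be sketched like this. The createRecoveredEvent() name and both interfaces are hypothetical; the merged code reportedly defaults to an empty string instead of throwing.

```typescript
// Hypothetical minimal shapes; only recoveredFromToken is named in the PR.
interface LeaseLike {
  token: string;
  recoveredFromToken?: string;
}

interface RecoveredEvent {
  type: "claim.recovered";
  token: string;
  recoveredFromToken: string;
}

// Fail closed: refuse to build a recovery event without real lineage
// instead of emitting recoveredFromToken: "".
function createRecoveredEvent(lease: LeaseLike): RecoveredEvent {
  const previous = lease.recoveredFromToken;
  if (previous === undefined || previous.trim() === "") {
    throw new Error("claim.recovered requires a non-empty recoveredFromToken");
  }
  return { type: "claim.recovered", token: lease.token, recoveredFromToken: previous };
}
```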

Suggestions

  • Consider adding a small test that recoverExpiredClaim() rejects a new lease expiry that is not after now, since that behavior is currently delegated through createClaimLease().
  • Consider making createClaimLeaseRuntimeEvent("claim.recovered", ...) require recoveredFromToken at runtime instead of defaulting to an empty string.
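The expiry check suggested in the first bullet can be sketched against a stand-in createClaimLease(). The field names, ISO-8601 string timestamps, and the rejects() helper are assumptions; a real test would call recoverExpiredClaim() from src/domain/task-graph.ts and use the project's test runner.

```typescript
// Hypothetical shapes; the PR names ClaimLease and createClaimLease() but does not show their fields.
interface ClaimLease {
  workerId: string;
  token: string;
  claimedAt: string;
  expiresAt: string;
}

interface ClaimLeaseInput {
  workerId: string;
  token: string;
  now: string; // ISO-8601, supplied by the caller
  expiresAt: string; // ISO-8601, must be strictly after now
}

// Stand-in for the validation the review says recovery delegates to.
function createClaimLease(input: ClaimLeaseInput): ClaimLease {
  const nowMs = Date.parse(input.now);
  const expiresMs = Date.parse(input.expiresAt);
  if (Number.isNaN(nowMs) || Number.isNaN(expiresMs)) {
    throw new Error("lease timestamps must be valid ISO-8601 strings");
  }
  if (expiresMs <= nowMs) {
    throw new Error("lease expiry must be after now");
  }
  return {
    workerId: input.workerId,
    token: input.token,
    claimedAt: input.now,
    expiresAt: input.expiresAt,
  };
}

// Returns true when fn throws; a stand-in for assert.throws so the sketch has no imports.
function rejects(fn: () => unknown): boolean {
  try {
    fn();
    return false;
  } catch {
    return true;
  }
}

const leaseNow = "2026-05-18T12:00:00.000Z";
const leaseBase = { workerId: "worker-b", token: "lease-2", now: leaseNow };

// The three cases a regression test would pin down.
const expiryChecks = {
  equalToNow: rejects(() => createClaimLease({ ...leaseBase, expiresAt: leaseNow })),
  beforeNow: rejects(() => createClaimLease({ ...leaseBase, expiresAt: "2026-05-18T11:59:59.000Z" })),
  afterNow: rejects(() => createClaimLease({ ...leaseBase, expiresAt: "2026-05-18T12:05:00.000Z" })),
};
```

In this sketch the first two cases are rejected and the third is accepted, so an expiry equal to now counts as already expired.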

Looks good

  • The explicit recovery path cleanly separates stale-claim recovery from normal duplicate-claim rejection.
  • Completion is consistently gated through requireActiveClaim(), reducing duplicated token/expiry logic.
  • Tests cover the main lifecycle transitions and regression-sensitive acceptance criteria.
  • README boundary wording remains aligned with the actual implementation scope.
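The benefit of a single requireActiveClaim() guard can be seen in a short sketch. The review only confirms that completion goes through this guard; reusing it for renew and release, along with the HeldTask shape and numeric timestamps, is an assumption made for illustration.

```typescript
interface ActiveClaim {
  token: string;
  expiresAt: number; // epoch milliseconds (assumed representation)
}

// Hypothetical task shape holding at most one claim.
interface HeldTask {
  id: string;
  claim?: ActiveClaim;
}

// Single place where token and expiry rules live.
function requireActiveClaim(task: HeldTask, token: string, now: number): ActiveClaim {
  const claim = task.claim;
  if (claim === undefined) {
    throw new Error(`task ${task.id} has no claim`);
  }
  if (claim.token !== token) {
    throw new Error(`task ${task.id} is claimed with a different token`);
  }
  if (claim.expiresAt <= now) {
    throw new Error(`lease on task ${task.id} has expired`);
  }
  return claim;
}

// Renewal extends an active lease; an expired lease must go through recovery instead.
function renewClaimLease(task: HeldTask, token: string, newExpiresAt: number, now: number): HeldTask {
  const claim = requireActiveClaim(task, token, now);
  if (newExpiresAt <= now) {
    throw new Error("renewed expiry must be after now");
  }
  return { ...task, claim: { ...claim, expiresAt: newExpiresAt } };
}

// Release drops ownership so the task can be claimed again.
function releaseClaim(task: HeldTask, token: string, now: number): HeldTask {
  requireActiveClaim(task, token, now);
  return { id: task.id };
}
```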

Verification notes

Verifier gate status: PASS — npm ci && npm run verify exited 0.
Size gate status: PASS — 179 changed lines below threshold.
Revision-explanation status: not required, found.
Ilchul review harness: PASS — STANDARD risk profile, no blocking findings.


Engine: pi

devkade merged commit ed7ed13 into dev on May 18, 2026
2 checks passed
Development

Successfully merging this pull request may close these issues.

Sub-roadmap: DAG runtime phase 2 — claim, lease, and stale ownership
