Oversight Business Scenarios

Oversight is a delivery platform where documents hold intent, tasks hold execution, threads hold live discussion, workflows hold business progress, planning objects hold timebox context, and approvals / gates control automation.

The scenarios below are written against that product model so the examples stay aligned with the actual system behavior.

Modeling Rule: Task or Document?

Use this default:

  • task for execution that needs an owner, workflow state, approvals, agent runs, and completion tracking
  • document for durable context that should be read, reused, cited, and maintained over time
  • linked document + task when the context should survive beyond the current execution loop

For software delivery, the common mapping is:

  • requirement / PRD: document + linked task
  • small bug: task
  • complex bug / incident: linked document + task
  • ADR / SOP / runbook / test strategy: document
  • implementation / fix / review / verification / rollout: task
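The mapping above can be read as a lookup table. The following is a minimal sketch, not part of the Oversight API: the artifact keys (`small_bug`, `test_strategy`, and so on) and the fallback to a plain task are assumptions made for illustration.

```python
# Hypothetical lookup for the default modeling rule; keys are illustrative only.
MODELING = {
    "requirement": ("document", "task"),
    "prd": ("document", "task"),
    "small_bug": ("task",),
    "complex_bug": ("document", "task"),
    "incident": ("document", "task"),
    "adr": ("document",),
    "sop": ("document",),
    "runbook": ("document",),
    "test_strategy": ("document",),
    "implementation": ("task",),
    "fix": ("task",),
    "review": ("task",),
    "verification": ("task",),
    "rollout": ("task",),
}


def objects_for(artifact: str) -> tuple:
    """Return which objects to create; unknown kinds fall back to a task (assumption)."""
    return MODELING.get(artifact, ("task",))
```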

Part A: User Stories (by Role)

A1. Project Manager (PM)

| ID | Story | UI Component | API Endpoint |
| --- | --- | --- | --- |
| A1-01 | Create project with name and description | AppSidebar → ProjectSetup | POST /api/projects |
| A1-02 | Add users and agents as project members | ProjectSetup → Members | POST /api/projects/:id/members |
| A1-03 | Define custom workflow with states and transitions | Workflow Editor | POST /api/workflows |
| A1-04 | Configure workflow automation rules (on_enter/agent_done triggers) | Workflow Rules | POST /api/workflows/:id/rules |
| A1-05 | Create milestone with due date | Project Settings | POST /api/projects/:id/milestones |
| A1-06 | Create cycle (sprint) with start/end dates | Cycle Management | POST /api/cycles |
| A1-07 | Create task with title, priority, assignees, labels | TaskCreateDialog | POST /api/tasks |
| A1-08 | Assign task to agent with executor/reviewer role | TaskDetail → Assignees | POST /api/tasks/:id/assign |
| A1-09 | View task board grouped by workflow states | Issues → Board | GET /api/tasks?workflow_id=X |
| A1-10 | Switch between task list / board / tree layouts | Issues layout tabs | GET /api/tasks?workflow_id=X |
| A1-11 | Open collection documents in list or tree layout | Knowledge | GET /api/documents?project_id=X |
| A1-12 | Transition task between states (drag-drop or click) | TaskBoard / TaskDetail | POST /api/tasks/:id/transition |
| A1-13 | Inspect task decomposition in the task tree | TaskTreeView | GET /api/tasks?workflow_id=X |
| A1-14 | Build a collection-scoped document tree | Knowledge → Tree | POST /api/documents with collection_id + parent_id |
| A1-15 | Create collection for knowledge management | Collection Settings | POST /api/projects/:id/collections |
| A1-16 | Set collection access control per member | Collection Access | PUT /api/collections/:id/access-grants/:member |
| A1-17 | View inbox with agent activity feed | Inbox | GET /api/activities |
| A1-18 | Review and approve/deny tool permission requests | ToolApprovalsView | POST /api/tool-permissions/:id/decision |
| A1-19 | Monitor agent execution status on tasks | TaskDetail → Run Log | GET /api/tasks/:id/threads |
| A1-20 | Create release tied to milestone | Release Management | POST /api/projects/:id/releases |
| A1-21 | Filter tasks by status/priority/assignee/label | IssuesView | GET /api/tasks?status=X&assignee=Y |
| A1-22 | Create document templates for consistent artifacts | Template Management | POST /api/templates |

A2. Developer

| ID | Story | UI Component | API Endpoint |
| --- | --- | --- | --- |
| A2-01 | View assigned tasks in board or list | TaskBoard / IssuesView | GET /api/tasks?assignee=user:me |
| A2-02 | Add comments to tasks for discussion | TaskDetail → Comments | POST /api/tasks/:id/comments |
| A2-03 | Create task relations (blocks, implements, analyzes) | TaskDetail → Relations | POST /api/tasks/:id/relations |
| A2-04 | Attach documents/evidence to task | TaskDetail → Attachments | POST /api/tasks/:id/attachments |
| A2-05 | Transition task status (in_progress→review→done) | TaskDetail | POST /api/tasks/:id/transition |
| A2-06 | Dispatch agent to work on task | TaskDetail → Run Agent | POST /api/tasks/:id/run-agent |
| A2-07 | Watch agent execution via SSE stream | TaskDetail → Run Log | GET /api/tasks/:id/agent-stream/:thread_id |
| A2-08 | Pause/cancel/redirect running agent | TaskDetail → Runtime Controls | POST /v1/ai-sdk/threads/:id/interrupt |
| A2-09 | Chat with agent directly | Chat | POST /v1/ai-sdk/threads/:id/runs |
| A2-10 | View agent activity history and cost | AgentConfig → Activity | GET /api/agent-definitions/:id/activities |
| A2-11 | Approve/deny agent tool calls | ToolApprovalsView | POST /api/tool-permissions/:id/decision |

A3. AI Agent

| ID | Story | UI Component | API Endpoint |
| --- | --- | --- | --- |
| A3-01 | Receive task assignment and execute | (background) | ProductPlugin → Phase::RunEnd |
| A3-02 | Use tools to read/write documents | (agent tools) | query_collection, write_collection |
| A3-03 | Create comments on tasks with findings | (agent tools) | POST /api/tasks/:id/comments |
| A3-04 | Transition task status on completion | (ProductPlugin) | POST /api/tasks/:id/transition |
| A3-05 | Request tool approval when gated | (runtime suspension) | ToolPermissionRequest created |
| A3-06 | Auto-create documents from state templates | (workflow rule) | POST /api/documents via template |
| A3-07 | Participate in ensemble (parallel/competition) | (ensemble engine) | POST /api/tasks/:id/ensemble |
| A3-08 | Vote on ensemble results | (ensemble engine) | POST /api/ensembles/:id/vote |
| A3-09 | Read/write from collections with access control | (agent tools) | list_collections, query_collection, write_collection |
| A3-10 | Persist memory across runs | (agent memory) | GET /api/agent-definitions/:id/memory |

A4. System Administrator

| ID | Story | UI Component | API Endpoint |
| --- | --- | --- | --- |
| A4-01 | Create/configure agent definitions | AgentEditorForm | POST /api/agent-definitions |
| A4-02 | Set agent tool permissions (always/gated) | AgentEditorForm | PUT /api/agent-definitions/:id (tool_exec_mode) |
| A4-03 | Create agent team with ensemble config | AgentConfig → Teams | POST /api/teams |
| A4-04 | Add/remove agents from teams | AgentConfig → Teams | POST /api/teams/:id/members |
| A4-05 | Configure agent workspace (git, worktree) | AgentConfig → Workspaces | POST /api/workspaces |
| A4-06 | Create scheduled agent runs (cron) | AgentConfig → Schedules | POST /api/schedules |
| A4-07 | Trigger schedule immediately | AgentConfig → Schedules | POST /api/schedules/:id/run-now |
| A4-08 | Configure resources (DB, API credentials) | AgentConfig → Resources | POST /api/resources |
| A4-09 | Test remote agent connection (A2A) | AgentConfig | POST /api/agent-definitions/:id/test-connection |
| A4-10 | Export/import/reset agent memory | AgentConfig → Memory | POST /api/agent-definitions/:id/export-memory |
| A4-11 | Create/manage users | User Management | POST /api/users |
| A4-12 | Monitor system health | Health Check | GET /health |
| A4-13 | View all agent stats (runs, cost, tokens) | AgentConfig → Stats | GET /api/agent-definitions/:id/stats |

Part B: End-to-End Flows

B1. Requirement to Delivery (Full Lifecycle)

PM creates requirement document → Creates linked task → Decomposes into subtasks →
Assigns users and agents → Agents execute → Review → Release

Steps:

  1. PM creates project and adds team members (users + agents)
  2. PM creates a planning or requirements knowledge base
  3. PM creates workflow with states: backlog → triage → design → in_progress → review → done
  4. PM writes a requirement document and creates a linked requirement task from it
  5. PM configures workflow rules:
    • on_enter:triage → dispatch triage agent
    • agent_done:triage → auto-transition to design
    • on_enter:design → auto-create design doc from template
    • on_enter:in_progress → dispatch coder agent
    • agent_done:in_progress → auto-transition to review
    • on_enter:review → dispatch reviewer agent
  6. PM transitions the task to triage state
  7. Triage agent auto-starts, analyzes requirement, adds comments
  8. Task auto-transitions to design, design doc auto-created from template
  9. Designer fills design doc, transitions to in_progress
  10. Coder agent starts, implements feature, creates artifacts
  11. Task auto-transitions to review
  12. Reviewer agent validates implementation, votes pass/fail
  13. PM transitions to done, creates release

What stays aligned:

  • the requirement remains durable as a document
  • the task remains the operational center of execution
  • comments, threads, runs, and outputs remain attached to that task lineage

API Chain:

POST /api/projects → POST /api/projects/:id/members →
POST /api/projects/:id/collections →
POST /api/workflows → POST /api/workflows/:id/rules →
POST /api/documents → POST /api/tasks (doc_id=...) → POST /api/tasks/:id/transition →
(agent runs) → GET /api/tasks/:id/threads →
GET /api/tasks?workflow_id=X → POST /api/projects/:id/releases
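To see where the B1 flow runs unattended and where it waits for a person, the step-5 rules can be walked as data. This is a sketch under assumptions: the dictionary field names (`trigger`, `action`, `to`) are hypothetical, and `agent_done` is keyed by state as in B2. It is not the actual rule schema accepted by POST /api/workflows/:id/rules.

```python
# Hypothetical in-memory form of the step-5 rules; field names are illustrative.
RULES = [
    {"trigger": "on_enter:triage", "action": "dispatch", "agent": "triage"},
    {"trigger": "agent_done:triage", "action": "transition", "to": "design"},
    {"trigger": "on_enter:design", "action": "create_doc", "template": "design"},
    {"trigger": "on_enter:in_progress", "action": "dispatch", "agent": "coder"},
    {"trigger": "agent_done:in_progress", "action": "transition", "to": "review"},
    {"trigger": "on_enter:review", "action": "dispatch", "agent": "reviewer"},
]


def automated_path(start, rules=RULES, max_hops=10):
    """Follow dispatch -> agent_done transitions until a human has to act."""
    by_trigger = {rule["trigger"]: rule for rule in rules}
    path = [start]
    state = start
    for _ in range(max_hops):
        enter = by_trigger.get(f"on_enter:{state}")
        done = by_trigger.get(f"agent_done:{state}")
        # Stop when no agent is dispatched here or nothing fires on completion.
        if enter is None or enter["action"] != "dispatch" or done is None:
            break
        state = done["to"]
        path.append(state)
    return path
```

With these rules the walk from triage stops at design (the designer fills the doc by hand), and the walk from in_progress stops at review (the PM closes the task).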

B1a. Workflow, Planning, and Execution Layering

Business state changes → Dispatch rules evaluate → Gates decide whether execution can start →
Agent runs or waits → Human approvals intervene only where needed

Steps:

  1. A task enters a workflow state such as in_progress
  2. Workflow rules or auto_run decide whether the state should trigger agent execution
  3. Dispatch gates check whether execution is allowed, for example active-cycle requirements
  4. If allowed, the agent starts and execution activity accumulates on the task
  5. If blocked, the task remains in the same business state while execution waits in a deferred status such as waiting_cycle
  6. Human approval may still be required for tool use, resource access, subtask acceptance, or deployment actions

Key point: Business progress, planning context, execution state, and approval gates are related but intentionally not collapsed into one status machine.
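The separation between business state and execution status can be sketched as a function that returns both values side by side. `auto_run` and `waiting_cycle` come from the steps above; `requires_active_cycle`, `idle`, and `running` are hypothetical names chosen for this example.

```python
def evaluate_dispatch(task, state_cfg, active_cycle_ids):
    """Return (business_state, execution_status) without merging the two."""
    if not state_cfg.get("auto_run"):
        # No automation configured for this state: nothing is dispatched.
        return task["state"], "idle"
    needs_cycle = state_cfg.get("requires_active_cycle", False)  # assumed gate flag
    if needs_cycle and task.get("cycle_id") not in active_cycle_ids:
        # Gate blocks execution; the business state is left untouched.
        return task["state"], "waiting_cycle"
    return task["state"], "running"
```

The business state is the same in every branch. Only the execution status changes, which is the point of keeping the two apart.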

B2. Workflow-Driven Agent Automation

Task enters state → Rule triggers → Agent dispatched → Agent completes →
Rule fires → Comments created → Auto-transition → Next agent

Steps:

  1. Workflow has rules: on_enter:in_progress dispatches coder, agent_done:in_progress transitions to review
  2. User transitions task to in_progress
  3. System detects on_enter rule, dispatches coder agent
  4. Agent executes (uses tools, creates documents, adds comments)
  5. ProductPlugin detects Phase::RunEnd
  6. ProductPlugin updates task.agent_status to "completed"
  7. ProductPlugin evaluates agent_done rules
  8. Rule matches: creates comment from agent response, transitions task to review
  9. on_enter:review rule fires, dispatches reviewer agent
  10. Chain continues until terminal state reached

Key APIs:

POST /api/tasks/:id/transition → (triggers rule engine) →
POST /api/tasks/:id/run-agent → GET /api/tasks/:id/agent-stream/:thread →
(ProductPlugin) → POST /api/tasks/:id/comments → POST /api/tasks/:id/transition
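Steps 3 to 9 can be sketched as two small handlers: one for entering a state and one for the end of an agent run. The `Task` class, the `RULES` table, and the handler names are assumptions for illustration. The real ProductPlugin hooks into Phase::RunEnd and is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    state: str
    agent_status: str = "idle"
    comments: list = field(default_factory=list)
    dispatched: list = field(default_factory=list)


# Hypothetical rule table: trigger -> ordered list of (action, argument).
RULES = {
    "on_enter:in_progress": [("dispatch", "coder")],
    "agent_done:in_progress": [("comment", None), ("transition", "review")],
    "on_enter:review": [("dispatch", "reviewer")],
}


def enter_state(task, state):
    """Move the task and fire on_enter rules for the new state."""
    task.state = state
    for action, arg in RULES.get(f"on_enter:{state}", []):
        if action == "dispatch":
            task.agent_status = "running"
            task.dispatched.append(arg)


def on_run_end(task, response):
    """Mark the run completed, then evaluate agent_done rules for the current state."""
    task.agent_status = "completed"
    for action, arg in RULES.get(f"agent_done:{task.state}", []):
        if action == "comment":
            task.comments.append(response)
        elif action == "transition":
            enter_state(task, arg)  # may dispatch the next agent
```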

B3. Multi-Agent Ensemble Execution

Team configured → Ensemble triggered → Agents run in parallel →
Agents vote → Best result selected → Merged back

Steps:

  1. Admin creates agent team with mode=competition, judge_strategy=vote
  2. Admin adds agents (coder-1, coder-2, coder-3) as team members
  3. User creates task and triggers ensemble run
  4. System dispatches all team agents in parallel on same task
  5. Each agent works independently (isolated worktrees)
  6. Agents complete, results collected
  7. Agents vote on each other's work (score + reasoning)
  8. System aggregates votes, selects best result
  9. Winning result merged into main branch

API Chain:

POST /api/teams → POST /api/teams/:id/members →
POST /api/tasks/:id/ensemble →
GET /api/ensembles/:id → POST /api/ensembles/:id/vote →
POST /api/ensembles/:id/select
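Steps 7 and 8 can be illustrated with a small aggregation function. This sketch makes three assumptions that the source does not confirm: self-votes are ignored, the winner is the candidate with the highest mean score, and ties go to the alphabetically first agent. The vote field names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def select_winner(votes):
    """Pick the candidate with the highest mean peer score, or None without votes."""
    scores = defaultdict(list)
    for vote in votes:
        if vote["voter"] == vote["candidate"]:
            continue  # assumption: agents only score each other's work
        scores[vote["candidate"]].append(vote["score"])
    if not scores:
        return None
    # sorted() makes tie-breaking deterministic: max() keeps the first maximum.
    return max(sorted(scores), key=lambda candidate: mean(scores[candidate]))
```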

B4. Knowledge Collection Lifecycle

Collection created → Access configured → Agent writes documents →
Agent queries knowledge → Knowledge accumulates over time

Steps:

  1. PM creates collection with when_to_read, when_to_write, doc_requirements
  2. PM sets default_access to "read", grants "read_write" to specific agents
  3. Agent runs from a task-bound thread and generates a design doc
  4. Agent calls list_collections to discover available collections
  5. Agent calls query_collection to check for duplicates
  6. Agent calls write_collection to store the document (system enforces doc_requirements)
  7. Later, another agent runs a new task
  8. Agent calls query_collection to find related prior work
  9. Agent uses prior knowledge to inform current task
  10. Knowledge compounds across iterations

API Chain:

POST /api/projects/:id/collections → PUT /api/collections/:id/access-grants/:member →
POST /api/collections/:id/documents → GET /api/collections/:id/documents →
(agent tools) list_collections → query_collection → write_collection
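The access setup in step 2 can be sketched as a level comparison. `default_access`, `read`, and `read_write` come from the steps above. The `none` level, the `grants` dictionary shape, and the tool-to-level mapping are assumptions for this example.

```python
LEVELS = {"none": 0, "read": 1, "read_write": 2}  # "none" is an assumed level

# Assumed minimum access level for each agent tool.
REQUIRED = {
    "list_collections": "read",
    "query_collection": "read",
    "write_collection": "read_write",
}


def tool_allowed(collection, member, tool):
    """Check a member's effective access (grant, else default) against the tool's need."""
    level = collection["grants"].get(member, collection["default_access"])
    return LEVELS[level] >= LEVELS[REQUIRED[tool]]
```

In this sketch an explicit grant replaces the default rather than adding to it, so a grant could also narrow access. Whether Oversight resolves grants that way is not stated in this document.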

B5. Tool Approval Gate Flow

Agent requests tool → Execution suspended → Human reviews →
Approve/Deny → Execution resumes → Result recorded

Steps:

  1. Agent configured with tool_exec_mode: "gated"
  2. Agent dispatched on task, starts execution
  3. Agent calls a tool (e.g., shell command, file write)
  4. Awaken runtime suspends execution, creates ToolPermissionRequest
  5. Task agent_status set to "waiting"
  6. Request appears in ToolApprovalsView (pending status, 1s polling)
  7. Operator reviews tool name, arguments, agent context
  8. Operator clicks Approve/Deny/Cancel
  9. Decision sent back to runtime via POST /v1/runs/:id/decision
  10. Agent resumes (if approved) or handles rejection (if denied)
  11. outcome_status updated: succeeded/blocked/cancelled

API Chain:

POST /api/tasks/:id/run-agent → (suspension) →
GET /api/tool-permissions?status=pending →
POST /api/tool-permissions/:id/decision →
(runtime resumes)
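The decision step can be sketched as a pure function over a pending request. The status and outcome pairs follow the expected results listed in test matrix C5. The request dictionary shape and the function name are assumptions.

```python
# Decision -> (status, outcome_status), as listed in test matrix C5.
DECISIONS = {
    "approve": ("approved", "succeeded"),
    "deny": ("denied", "blocked"),
    "cancel": ("cancelled", "cancelled"),
}


def apply_decision(request, decision):
    """Return an updated copy of a pending request; reject repeated decisions."""
    if request["status"] != "pending":
        raise ValueError("request has already been decided")
    status, outcome = DECISIONS[decision]
    return {**request, "status": status, "outcome_status": outcome}
```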

B6. Collection Document Trees

Collection selected → Parent document → Child documents → Links to tasks →
Track progress and durable knowledge

Steps:

  1. PM creates or selects a collection (for example Product Planning, Test Cases, or Runbooks)
  2. PM creates a top-level document inside that collection
  3. PM adds child documents with parent_id
  4. Documents can use any team-defined labels or attrs
  5. Documents link to tasks via edges when needed
  6. Knowledge → Tree shows the expandable hierarchy for that collection
  7. PM can add child documents directly from the tree
  8. Agent can decompose tasks into subtasks via POST /api/tasks/:id/decompose

API Chain:

POST /api/documents (label=vision) →
POST /api/documents (label=feature, parent_id=vision_id) →
POST /api/edges (from=doc, to=task) →
GET /api/documents?labels=feature&parent_id=X →
GET /api/documents/:id/trace
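The tree view in step 6 depends only on each document's parent_id. A minimal sketch of that grouping follows, assuming documents arrive as flat dictionaries with `id`, `title`, and `parent_id` keys (the exact response shape is not given in this document).

```python
def build_tree(docs):
    """Nest a flat document list into a tree using parent_id links."""
    children = {}
    for doc in docs:
        children.setdefault(doc.get("parent_id"), []).append(doc)

    def node(doc):
        return {
            "id": doc["id"],
            "title": doc["title"],
            "children": [node(child) for child in children.get(doc["id"], [])],
        }

    # Documents without a parent_id are the top-level entries of the collection.
    return [node(doc) for doc in children.get(None, [])]
```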

B6a. Document-First Delivery Loop

Planning knowledge base created → Plan / SOP / requirement document written →
Execution task linked to the document → Task decomposed →
Users and agents execute from the task

Steps:

  1. User explicitly creates or chooses a planning knowledge base such as Plans & SOPs
  2. User writes a plan, SOP, runbook, or requirement document in that knowledge base
  3. User creates an execution task linked to the document as its primary context
  4. User opens the task and assigns the right users and agents
  5. User or agent requests decomposition when the document should be broken into executable subtasks
  6. Users and agents discuss, execute, review, and update progress from the task and its subtasks
  7. Conversations started from the task are bound to that task / project / workflow scope, and the originating principal stays traceable through the thread binding ledger
  8. The linked document remains the durable source of intent while the task system tracks execution and thread-produced documents can be routed back into knowledge bases

API/UI Chain:

Project Setup → Planning →
POST /api/projects/:id/collections →
POST /api/documents →
POST /api/tasks (doc_id=...) →
POST /api/tasks/:id/decompose →
POST /api/tasks/:id/run-agent

Recommended agent choice:

  • When a team wants a model-backed CLI agent for planning, decomposition, implementation, or test verification, use a Codex-backed definition such as codex_delegate or a worker/agent with agent_type=codex.

B7. Scheduled Agent Operations

Schedule configured → Cron fires → Agent dispatched →
Runs in background → Results recorded

Steps:

  1. Admin creates schedule: agent=monitor, cron=0 9 * * *, prompt="Check for regressions"
  2. Admin optionally triggers immediate run for testing
  3. Cron scheduler fires at configured time
  4. System creates background task, dispatches agent
  5. Agent runs, generates report
  6. Activity recorded with cost/token metrics
  7. Results visible in Inbox

API Chain:

POST /api/schedules → POST /api/schedules/:id/run-now →
(cron) → GET /api/schedules → GET /api/activities
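The expression `0 9 * * *` in step 1 fires at 09:00 every day. The sketch below shows how such an expression is matched against a timestamp. It is deliberately limited to `*` and plain integers, and it does not describe how the Oversight scheduler parses cron.

```python
from datetime import datetime


def cron_matches(expr: str, when: datetime) -> bool:
    """Match a 5-field cron expression that uses only '*' or plain integers."""
    fields = expr.split()  # minute, hour, day of month, month, day of week
    actual = (
        when.minute,
        when.hour,
        when.day,
        when.month,
        when.isoweekday() % 7,  # cron convention: 0 = Sunday
    )
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))
```

Ranges, lists, and step values (`1-5`, `1,15`, `*/10`) are out of scope here. A production scheduler would use a full cron parser.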

B8. Deployment and Environment Control

Deployment task created → Environment resources linked → Approval gates checked →
Agent executes rollout → Evidence and follow-up notes written back

Steps:

  1. Admin links repos, secrets, environments, and runtime endpoints as project resources
  2. Team writes a deployment runbook document in a knowledge base
  3. Team creates a deployment task linked to that runbook
  4. Workflow automation or a human dispatches the deployment agent
  5. Resource policy checks whether sensitive environments require approval
  6. If approval is needed, the run pauses in the approval queue
  7. Once approved, the agent executes the rollout and records evidence
  8. Results, incident notes, and follow-up tasks remain attached to the deployment task

Part B-Extra: Development Practice Workflows

These are configurable workflow patterns that encode industry-standard development practices. Only two of them need new mechanisms: WIP limits (BX7) and ensemble→rule integration (BX4, Option B). All others work with the existing rule engine.

BX1. Trunk-Based Development + Feature Flags

Agent works in worktree → Review pass → Merge back to trunk → Cleanup

Configuration:

  • Agent isolation: "worktree" — each task gets its own git branch/worktree
  • Merge via dedicated agent (can resolve conflicts) OR built-in merge_worktree action (simpler, aborts on conflict)

Workflow:

States: backlog → coding → review → merging → done
Rules:
  on_enter:coding                                 → dispatch (coder, isolation=worktree)
  agent_done:coding                               → transition:review
  on_enter:review                                 → dispatch (reviewer)
  agent_done:review + output.review_decision=pass → transition:merging
  agent_done:review + output.review_decision=reject → comment:response;transition:coding
  on_enter:merging                                → dispatch (merge-agent)
  agent_done:merging                              → transition:done
  agent_error:merging                             → transition:review

Key: Short-lived branches (ard/{instance_id}), auto-merge on review pass, worktree cleanup on merge.
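The rule notation used in these workflows (`trigger + condition → action;action`) is regular enough to parse. The sketch below reads one line of that notation into a dictionary and checks it against an agent output. It is an illustration of the notation only. The parsed field names are hypothetical and the real rule payload format is not specified here.

```python
def parse_rule(line):
    """Parse 'event:state + output.key=value → action;action' into a dict."""
    left, right = (part.strip() for part in line.split("→", 1))
    trigger, *conds = [piece.strip() for piece in left.split("+")]
    event, state = trigger.split(":", 1)
    conditions = {}
    for cond in conds:
        key, value = cond.split("=", 1)
        conditions[key.replace("output.", "", 1)] = value
    actions = [action.strip() for action in right.split(";")]
    return {"event": event, "state": state, "conditions": conditions, "actions": actions}


def rule_applies(rule, event, state, output):
    """True when the event, state, and every output condition match."""
    return (
        rule["event"] == event
        and rule["state"] == state
        and all(str(output.get(k)) == v for k, v in rule["conditions"].items())
    )
```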

BX2. TDD / Test-First

Test-writer creates tests → Coder implements → Tests run → Pass/fail loop

Workflow (two-state approach, recommended):

States: backlog → test_writing → coding → test_run → review → done
Rules:
  on_enter:test_writing                         → dispatch (test-writer agent)
  agent_done:test_writing                       → transition:coding
  on_enter:coding                               → dispatch (coder agent)
  agent_done:coding                             → transition:test_run
  on_enter:test_run                             → dispatch (test-runner agent)
  agent_done:test_run + output.tests_passed=true  → transition:review
  agent_done:test_run + output.tests_passed=false → transition:coding

Key: Workflow rule ordering enforces test-first — coder cannot start until test-writer completes.
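The pass/fail loop can be traced by feeding a sequence of test outcomes through the rules above. This is a simulation sketch only: the function name and the idea of passing outcomes as a list of booleans are assumptions.

```python
def tdd_path(test_outcomes):
    """Return the states visited, given the result of each successive test run."""
    path = ["test_writing", "coding"]  # tests are always written before coding starts
    for passed in test_outcomes:
        path.append("test_run")
        if passed:
            path.append("review")
            return path
        path.append("coding")  # failed run sends the task back to the coder
    return path
```

One failed run followed by a passing run visits coding twice before reaching review. In practice a retry cap is worth considering so a task cannot cycle between coding and test_run indefinitely.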

BX3. Continuous Code Review

Agent completes → Auto-dispatch reviewer → Pass/reject drives transition

Workflow:

review state:
  default_assignee: "agent:reviewer"
  assignee_type: "role:reviewer"

Rules:
  on_enter:review                                   → dispatch
  agent_done:review + output.review_decision=pass   → comment:response;transition:done
  agent_done:review + output.review_decision=reject → comment:response;transition:coding

Multi-reviewer variant: Use ensemble with mode: parallel, judge_strategy: vote for team review.

BX4. Shift-Left Quality Gates

Quality agents run (lint/typecheck/security) → All pass → Proceed

Option A — Sequential (works today):

States: ... → lint → typecheck → security_scan → coding
Rules:
  on_enter:lint                                  → dispatch (linter)
  agent_done:lint + output.passed=true           → transition:typecheck
  agent_done:lint + output.passed=false          → comment:response;transition:backlog
  on_enter:typecheck                             → dispatch (type-checker)
  agent_done:typecheck + output.passed=true      → transition:security_scan
  agent_done:typecheck + output.passed=false     → comment:response;transition:backlog
  on_enter:security_scan                         → dispatch (security-scanner)
  agent_done:security_scan + output.passed=true  → transition:coding
  agent_done:security_scan + output.passed=false → comment:response;transition:backlog

Option B — Parallel (requires ensemble→rule integration):

Team: quality-gate-team (linter, type-checker, security-scanner)
  mode: parallel, judge_strategy: consensus

Rules:
  on_enter:quality_gate → ensemble:quality-gate-team
  ensemble_done + all pass → transition:coding
  ensemble_done + any fail → transition:backlog
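Option B's consensus rule ("all pass" versus "any fail") can be sketched with a thread pool standing in for the ensemble. The check callables and the function name are hypothetical. Real quality agents would be dispatched through the ensemble engine, which this sketch does not model.

```python
from concurrent.futures import ThreadPoolExecutor


def run_quality_gate(checks):
    """Run all checks in parallel; return (target_state, failed_check_names)."""
    if not checks:
        raise ValueError("at least one check is required")
    with ThreadPoolExecutor(max_workers=len(checks)) as pool:
        futures = {name: pool.submit(fn) for name, fn in checks.items()}
        results = {name: future.result() for name, future in futures.items()}
    failed = sorted(name for name, ok in results.items() if not ok)
    # Consensus: every check must pass to proceed, any failure sends the task back.
    return ("backlog" if failed else "coding"), failed
```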

BX5. Requirements Traceability

Vision → Epic → Story → Requirement → Task → Agent execution → Audit trail

Already fully supported:

  • Collection document tree: Document.collection_id + Document.parent_id
  • Task linkage: Edge(from=requirement_doc, to=task, kind="implements")
  • Test linkage: Edge(from=task, to=test_task, kind="tested_by")
  • Agent audit: AgentActivity records who ran what, TaskComment captures outputs
  • Thread persistence: full prompt/reply history per task/agent pair
  • Collections: accumulate process assets with access control
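The edge kinds listed above make a requirement traceable to its tasks and tests by following edges outward. The sketch below is a breadth-first traversal with a depth limit. The edge dictionary shape and the return format are assumptions, and the actual behavior of GET /api/documents/:id/trace may differ.

```python
from collections import deque


def trace(edges, start, max_depth=3):
    """Return (node, edge_kind, depth) for every node reachable within max_depth."""
    adjacency = {}
    for edge in edges:
        adjacency.setdefault(edge["from"], []).append(edge)
    seen = {start}
    found = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand past the depth limit
        for edge in adjacency.get(node, []):
            if edge["to"] not in seen:
                seen.add(edge["to"])
                found.append((edge["to"], edge["kind"], depth + 1))
                queue.append((edge["to"], depth + 1))
    return found
```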

BX6. Design Doc → Implementation (Google Style)

Enter design state → Auto-create doc from template → Designer fills → Human approves → Code

Workflow:

States: backlog → design → design_review → coding → review → done

design state:
  doc_template: "# Design: {{task.title}}\n\n## Problem\n\n## Approach\n\n## API Changes\n\n## Testing Strategy"
  doc_label: "design_doc"
  default_assignee: "agent:designer"

design_review state:
  assignee_type: "user"    # human approval gate

Rules:
  on_enter:design      → dispatch (designer agent fills template)
  agent_done:design    → transition:design_review
  # Human reviews and manually transitions to coding
  on_enter:coding      → dispatch (coder implements the design)
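The `{{task.title}}` placeholder in doc_template implies some form of variable substitution. A minimal sketch of one way to do that with dotted-path lookup follows. The renderer below is an assumption, and the template engine Oversight actually uses is not identified in this document.

```python
import re


def render_template(template, context):
    """Replace {{dotted.path}} placeholders with values from a nested dict."""

    def lookup(match):
        value = context
        for part in match.group(1).split("."):
            value = value[part]  # raises KeyError on unknown placeholders
        return str(value)

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", lookup, template)
```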

BX7. Kanban + WIP Limits

Board view with per-state WIP constraints → Block transitions when limit reached

Requires: WorkflowState.wip_limit field (new mechanism).

Configuration:

States:
  backlog:      wip_limit: null   # unlimited
  in_progress:  wip_limit: 3      # max 3 concurrent
  review:       wip_limit: 2      # max 2 in review
  done:         wip_limit: null   # unlimited

Behavior:

  • Manual transition returns 400 error when target state WIP limit reached
  • Rule-driven transition skipped (logged) when WIP limit reached
  • Board UI shows count vs. limit: In Progress (2/3)
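The behavior above can be sketched as a single check that distinguishes manual from rule-driven transitions. `wip_limit` is the proposed field. The function name, the `tasks` mapping of task id to state, and the response dictionaries are hypothetical.

```python
def try_transition(tasks, task_id, target, wip_limits, source="manual"):
    """Move a task unless the target state's WIP limit is already reached."""
    limit = wip_limits.get(target)  # None means unlimited
    in_target = sum(1 for state in tasks.values() if state == target)
    if limit is not None and in_target >= limit:
        if source == "manual":
            message = f"WIP limit reached for {target} ({in_target}/{limit})"
            return {"status": 400, "error": message}
        # Rule-driven transitions are skipped and logged instead of failing.
        return {"status": "skipped", "logged": True}
    tasks[task_id] = target
    return {"status": 200}
```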

Part C: Test Matrix

C1. Task Management

| ID | Scenario | Precondition | Steps | Expected Result | API | E2E Exists |
| --- | --- | --- | --- | --- | --- | --- |
| C1-01 | Create task via UI | Project selected | Click "New task", fill title, click "Create Task" | Task appears in list | POST /api/tasks | Yes |
| C1-02 | Create task with priority/description | Project selected | Open dialog, fill all fields, select priority | Task created with all fields | POST /api/tasks | Yes |
| C1-03 | Create Task disabled without title | Dialog open | Leave title empty | Create button disabled | - | Yes |
| C1-04 | Cancel task creation | Dialog open | Fill title, click Cancel | Dialog closes, no task created | - | Yes |
| C1-05 | Task CRUD via API | None | POST create, GET read, PATCH update, DELETE | Full lifecycle works | /api/tasks | Yes |
| C1-06 | Task status transition | Task exists | POST transition to "in_progress" | Status changes | POST /api/tasks/:id/transition | Yes |
| C1-07 | Task comments CRUD | Task exists | POST comment, GET list, verify content | Comment persisted | /api/tasks/:id/comments | Yes |
| C1-08 | Task relations (blocks) | Two tasks exist | POST relation from→to, GET relations | Relation created | /api/tasks/:id/relations | Yes |
| C1-09 | Task assignment with roles | Task + agent exist | POST assign with executor/reviewer role | Assignees updated | POST /api/tasks/:id/assign | No |
| C1-10 | Task attachments CRUD | Task + document exist | POST attach, GET list, DELETE detach | Attachment lifecycle | /api/tasks/:id/attachments | No |
| C1-11 | Task filtering by status | Tasks in various states | GET tasks?status=in_progress | Only matching tasks | GET /api/tasks?status=X | No |
| C1-12 | Task filtering by assignee | Tasks with assignees | GET tasks?assignee=agent:coder | Correct filter | GET /api/tasks?assignee=X | No |
| C1-13 | Task filtering by label | Tasks with labels | GET tasks?label=urgent | Label filter works | GET /api/tasks?label=X | No |
| C1-14 | Delete task | Task exists | DELETE task, GET returns 404 | Task removed | DELETE /api/tasks/:id | Yes |
| C1-15 | Task with cycle assignment | Cycle exists | Create task with cycle_id | Task linked to cycle | POST /api/tasks | No |
| C1-16 | Task with milestone | Milestone exists | Create task with milestone_id | Task linked to milestone | POST /api/tasks | No |
| C1-17 | Task with recurring schedule | None | Create task with recurrence cron | recurrence field set | POST /api/tasks | No |

C2. Workflow Engine

| ID | Scenario | Precondition | Steps | Expected Result | API | E2E Exists |
| --- | --- | --- | --- | --- | --- | --- |
| C2-01 | List seeded workflows | Server running | GET workflows | bug, feature, requirement workflows | GET /api/workflows | Yes |
| C2-02 | Workflow has states and transitions | Workflow exists | GET workflow detail | States and transitions populated | GET /api/workflows/:id | Yes |
| C2-03 | Board shows workflow columns | Workflow exists | Navigate to board view | Columns match workflow states | GET /api/tasks/board | Yes |
| C2-04 | Switch workflow changes columns | 2+ workflows | Select different workflow | Board columns update | UI test | Yes |
| C2-05 | Task appears in correct column | Task created | GET board, find task in expected column | Task in initial state column | GET /api/tasks/board | Yes |
| C2-06 | Workflow rules are retrievable | Rules seeded | GET workflow rules | Rules with triggers returned | GET /api/workflows/:id/rules | Yes |
| C2-07 | Create custom workflow | None | POST workflow with states/transitions | Workflow created | POST /api/workflows | No |
| C2-08 | Create workflow rule | Workflow exists | POST rule with trigger/action | Rule created | POST /api/workflows/:id/rules | No |
| C2-09 | Delete workflow rule | Rule exists | DELETE rule | Rule removed | DELETE /api/workflows/:wf/rules/:rule | No |
| C2-10 | Workflow view CRUD | Workflow exists | POST/GET/PUT/DELETE view | Full lifecycle | /api/workflows/:id/views | Partial |
| C2-11 | Workflow with doc_template on state | Template exists | Configure state with template_id | Template linked | PUT /api/workflows/:id | No |
| C2-12 | Workflow transition validation | Task in backlog | Attempt invalid transition | Error returned | POST /api/tasks/:id/transition | No |

C3. Agent Operations

| ID | Scenario | Precondition | Steps | Expected Result | API | E2E Exists |
| --- | --- | --- | --- | --- | --- | --- |
| C3-01 | List seeded agents | Server running | GET agent-definitions | 7+ agents returned | GET /api/agent-definitions | Yes |
| C3-02 | Agent has required fields | Agents seeded | Check agent fields | system_prompt, model, role, team | GET /api/agent-definitions | Yes |
| C3-03 | Agent roles include developer/analyst | Agents seeded | Collect unique roles | Expected roles present | GET /api/agent-definitions | Yes |
| C3-04 | Create agent definition | None | POST with agent_id, display_name, type | Agent created | POST /api/agent-definitions | No |
| C3-05 | Update agent definition | Agent exists | PUT updated fields | Fields updated | PUT /api/agent-definitions/:id | No |
| C3-06 | Delete agent definition | Agent exists | DELETE agent | Agent removed | DELETE /api/agent-definitions/:id | No |
| C3-07 | Dispatch agent on task | Task + agent exist | POST run-agent | Dispatches with thread_id | POST /api/tasks/:id/run-agent | No |
| C3-08 | Agent execution SSE stream | Agent running | GET agent-stream | Events received | GET /api/tasks/:id/agent-stream/:thread | No |
| C3-09 | Agent activity recorded | Agent completed | GET activities | Activity with status/cost | GET /api/agent-definitions/:id/activities | No |
| C3-10 | Agent stats aggregation | Activities exist | GET stats | Totals calculated | GET /api/agent-definitions/:id/stats | No |
| C3-11 | Agent memory CRUD | Agent exists | GET/export/import/reset memory | Memory operations work | /api/agent-definitions/:id/memory | No |
| C3-12 | Test remote agent connection | A2A agent exists | POST test-connection | Connection status | POST /api/agent-definitions/:id/test-connection | No |
| C3-13 | Runtime control: pause | Agent running (mocked) | Click Pause button | Pause request sent | POST /v1/ai-sdk/threads/:id/interrupt | Yes |
| C3-14 | Runtime control: cancel | Agent running (mocked) | Click Cancel button | Cancel request sent | POST /v1/ai-sdk/threads/:id/cancel | Yes |
| C3-15 | Runtime control: redirect | Agent running (mocked) | Fill message, click Redirect | Redirect sent | POST /v1/ai-sdk/threads/:id/runs | Yes |

C4. Agent Teams & Ensemble

| ID | Scenario | Precondition | Steps | Expected Result | API | E2E Exists |
| --- | --- | --- | --- | --- | --- | --- |
| C4-01 | Create agent team | Project exists | POST team with name/config | Team created | POST /api/teams | Yes |
| C4-02 | Add/remove team member | Team exists | POST member, DELETE member | Members managed | /api/teams/:id/members | Yes |
| C4-03 | List teams | Teams exist | GET teams | All teams returned | GET /api/teams | Yes |
| C4-04 | List ensembles | None | GET ensembles | Array returned | GET /api/ensembles | Yes |
| C4-05 | Trigger ensemble on task | Team + task exist | POST ensemble | Ensemble created with dispatches | POST /api/tasks/:id/ensemble | No |
| C4-06 | Ensemble vote submission | Ensemble running | POST vote with score/reasoning | Vote recorded | POST /api/ensembles/:id/vote | No |
| C4-07 | Ensemble result selection | Votes complete | POST select winner | Winner selected | POST /api/ensembles/:id/select | No |
| C4-08 | Cancel ensemble | Ensemble running | POST cancel | Ensemble cancelled | POST /api/ensembles/:id/cancel | No |

C5. Tool Approvals

| ID | Scenario | Precondition | Steps | Expected Result | API | E2E Exists |
| --- | --- | --- | --- | --- | --- | --- |
| C5-01 | List pending approvals | Harness agent dispatched | Navigate to /approvals, find row | Pending request visible | GET /api/tool-permissions?status=pending | Yes |
| C5-02 | Approve tool request | Pending request | Click Approve | Status→approved, outcome→succeeded | POST /api/tool-permissions/:id/decision | Yes |
| C5-03 | Deny tool request | Pending request | Click Deny | Status→denied, outcome→blocked | POST /api/tool-permissions/:id/decision | Yes |
| C5-04 | Cancel tool request | Pending request | Click Cancel | Status→cancelled, outcome→cancelled | POST /api/tool-permissions/:id/cancel | Yes |

C6. Document & Knowledge Management

| ID | Scenario | Precondition | Steps | Expected Result | API | E2E Exists |
| --- | --- | --- | --- | --- | --- | --- |
| C6-01 | Document CRUD | None | POST create, GET read, DELETE | Lifecycle works | /api/documents | Yes |
| C6-02 | Document labels and attrs | Document exists | POST label, POST attr, GET verify | Metadata persisted | /api/documents/:id/labels, /attrs | Yes |
| C6-03 | Document edges (relationships) | 2 documents exist | POST edge, GET edges, trace | Graph operations work | /api/edges, /api/documents/:id/edges | Yes |
| C6-04 | Document with RE labels | None | POST with labels=["feature"] | Label assigned | POST /api/documents | Yes |
| C6-05 | Document parent-child hierarchy | Parent exists | POST with parent_id | Hierarchy formed | POST /api/documents | Yes |
| C6-06 | Document trace traversal | Edge chain exists | GET trace with max_depth | Traversal returns nodes | GET /api/documents/:id/trace | Yes |
| C6-07 | Document search by keyword | Documents exist | GET documents?keyword=X | Matching docs returned | GET /api/documents?keyword=X | No |
| C6-08 | Document search by labels | Documents exist | GET documents?labels=feature | Filtered results | GET /api/documents?labels=X | No |
| C6-09 | Document tree UI shows hierarchy | Collection has hierarchical docs | Navigate to Knowledge → Tree | Tree rendered | UI test | Partial |

C7. Collections

| ID | Scenario | Precondition | Steps | Expected Result | API | E2E Exists |
| --- | --- | --- | --- | --- | --- | --- |
| C7-01 | Create collection | Project exists | POST collection with guidance fields | Collection created | POST /api/projects/:id/collections | No |
| C7-02 | List collections | Collections exist | GET collections | All returned | GET /api/projects/:id/collections | No |
| C7-03 | Update collection | Collection exists | PUT updated fields | Fields updated | PUT /api/collections/:id | No |
| C7-04 | Delete collection | Collection exists | DELETE | Collection removed | DELETE /api/collections/:id | No |
| C7-05 | Set access grant | Collection exists | PUT access for member | Grant created | PUT /api/collections/:id/access-grants/:m | No |
| C7-06 | Remove access grant | Grant exists | DELETE grant | Grant removed | DELETE /api/collections/:id/access-grants/:m | No |
| C7-07 | Create document in collection | Collection exists | POST document to collection | Document linked | POST /api/collections/:id/documents | No |
| C7-08 | List collection documents | Docs in collection | GET collection docs | Documents returned | GET /api/collections/:id/documents | No |

C8. Project Management

| ID | Scenario | Precondition | Steps | Expected Result | API | E2E Exists |
| --- | --- | --- | --- | --- | --- | --- |
| C8-01 | Project CRUD | None | POST/GET/DELETE project | Lifecycle works | /api/projects | Yes |
| C8-02 | Project members CRUD | Project exists | POST add, GET list, DELETE remove | Members managed | /api/projects/:id/members | Yes |
| C8-03 | Milestone CRUD | Project exists | POST/GET/PUT/DELETE milestone | Lifecycle works | /api/projects/:id/milestones | No |
| C8-04 | Release CRUD | Project exists | POST/GET/PUT/DELETE release | Lifecycle works | /api/projects/:id/releases | No |
| C8-05 | Cycle CRUD | None | POST/GET/PUT/DELETE cycle | Lifecycle works | /api/cycles | Yes |
| C8-06 | Template CRUD | None | POST/GET/PUT/DELETE template | Lifecycle works | /api/templates | No |
| C8-07 | User CRUD | None | POST/GET/PUT/DELETE user | Lifecycle works | /api/users | No |
| C8-08 | User activity history | User exists | GET user activity | Activities returned | GET /api/users/:id/activity | No |

C9. Workspace & Resource Management

| ID | Scenario | Precondition | Steps | Expected Result | API | E2E Exists |
| --- | --- | --- | --- | --- | --- | --- |
| C9-01 | Workspace CRUD | Agent exists | POST/GET/PUT/DELETE workspace | Lifecycle works | /api/workspaces | No |
| C9-02 | Resource CRUD | None | POST/GET/PUT/DELETE resource | Lifecycle works | /api/resources | No |
| C9-03 | Schedule CRUD | Agent exists | POST/GET/PUT/DELETE schedule | Lifecycle works | /api/schedules | No |
| C9-04 | Trigger schedule immediately | Schedule exists | POST run-now | Execution started | POST /api/schedules/:id/run-now | No |

C10. System & Platform

| ID | Scenario | Precondition | Steps | Expected Result | API | E2E Exists |
| --- | --- | --- | --- | --- | --- | --- |
| C10-01 | Health endpoint | Server running | GET /health | status=healthy, components ok | GET /health | Yes |
| C10-02 | Liveness probe | Server running | GET /health/live | 200 OK | GET /health/live | Yes |
| C10-03 | Thread create and retrieve | None | POST/GET thread | Thread lifecycle | /v1/threads | Yes |
| C10-04 | Runs list | None | GET runs | Structure returned | GET /v1/runs | Yes |
| C10-05 | Stats endpoint | None | GET stats | Counts returned | GET /api/stats | Yes |
| C10-06 | Meta: list tools | None | GET meta/tools | Tool list | GET /api/meta/tools | Yes |
| C10-07 | Meta: roles and teams | None | GET meta/roles | Roles/teams | GET /api/meta/roles | Yes |
| C10-08 | 404 for nonexistent task | None | GET invalid ID | 404 | GET /api/tasks/invalid | Yes |
| C10-09 | 400 for invalid input | None | POST task without title | 400+ error | POST /api/tasks | Yes |

C11. UI Navigation & Interaction

| ID | Scenario | Precondition | Steps | Expected Result | API | E2E Exists |
| --- | --- | --- | --- | --- | --- | --- |
| C11-01 | Board nav button active state | None | Navigate to board | Board button highlighted | - | Yes |
| C11-02 | Docs nav loads collection documents | None | Click Docs | List layout rendered; tree available per collection | - | Yes |
| C11-03 | Project selector works | Projects exist | Select project | View updates | - | Yes |
| C11-04 | Workflow selector works | Workflows exist | Select workflow | Heading changes | - | Yes |
| C11-05 | Task detail opens on click | Task exists | Click task row | Detail panel opens | - | Yes |
| C11-06 | Close dialog with X button | Dialog open | Click X | Dialog closes | - | Yes |

Coverage Summary

| Domain | Total Scenarios | With E2E | Gap |
| --- | --- | --- | --- |
| C1. Task Management | 17 | 8 | 9 |
| C2. Workflow Engine | 12 | 6 | 6 |
| C3. Agent Operations | 15 | 5 | 10 |
| C4. Agent Teams & Ensemble | 8 | 4 | 4 |
| C5. Tool Approvals | 4 | 4 | 0 |
| C6. Document & Knowledge | 9 | 6 | 3 |
| C7. Collections | 8 | 0 | 8 |
| C8. Project Management | 8 | 3 | 5 |
| C9. Workspace & Resource | 4 | 0 | 4 |
| C10. System & Platform | 9 | 9 | 0 |
| C11. UI Navigation | 6 | 6 | 0 |
| Total | 100 | 51 | 49 |
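
The Gap column in the summary above is Total Scenarios minus With E2E, and the Total row is the column sum. The short sketch below recomputes both from the per-domain figures so the table can be re-checked whenever a scenario is added; the `COVERAGE` mapping and `summarize` helper are illustrative names, not part of the product.

```python
# (total scenarios, scenarios with E2E) per domain, copied from the summary table.
COVERAGE = {
    "C1. Task Management": (17, 8),
    "C2. Workflow Engine": (12, 6),
    "C3. Agent Operations": (15, 5),
    "C4. Agent Teams & Ensemble": (8, 4),
    "C5. Tool Approvals": (4, 4),
    "C6. Document & Knowledge": (9, 6),
    "C7. Collections": (8, 0),
    "C8. Project Management": (8, 3),
    "C9. Workspace & Resource": (4, 0),
    "C10. System & Platform": (9, 9),
    "C11. UI Navigation": (6, 6),
}


def summarize(coverage):
    """Return per-domain gaps and the overall (total, with_e2e, gap) triple."""
    gaps = {domain: total - covered for domain, (total, covered) in coverage.items()}
    total = sum(t for t, _ in coverage.values())
    covered = sum(c for _, c in coverage.values())
    return gaps, (total, covered, total - covered)


if __name__ == "__main__":
    gaps, totals = summarize(COVERAGE)
    print(totals)
```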

Part X: End-to-End Scenario — Workflow → Tasks → Knowledge

This scenario chains five core capabilities (workflow, task with dependency, collection, collection access, document) into a single six-step flow. X1 lists the UI path and REST endpoint for each step, and X2 lists the matching agent tool ID, so PMs, QA, and agents share one reference. Cross-references: A1-03, A1-07, A2-03, A1-15, A1-16, A1-14.

X1. Human-driven flow (web UI)

| Step | Action | UI | REST API |
| --- | --- | --- | --- |
| 1 | Create workflow with states & transitions | Settings → Workflows → "New Workflow" dialog | POST /api/workflows |
| 2 | Create first task in workflow | Issues → TaskCreateDialog | POST /api/tasks |
| 3 | Create second task and add blocks relation to first | TaskDetail → Relations panel → "Add relation" | POST /api/tasks/:id/relations |
| 4 | Create a collection to hold task artifacts | ProjectSetup → Collections tab → "New Collection" | POST /api/projects/:id/collections |
| 5 | Grant a member read_write access to the collection | ProjectSetup → Collections tab → Access grants row | PUT /api/collections/:id/access-grants/:member |
| 6 | Create a document inside the collection | Knowledge → Tree → "New document" | POST /api/documents (with collection_id) |
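
The six steps above can be expressed as an ordered request plan, which is a convenient starting point for an API-level E2E test. The sketch below only fills in the path templates from the table. Request bodies are omitted because their fields are not specified here. The function name `build_x1_plan` and the choice of which task ID goes into the relations path are assumptions.

```python
def build_x1_plan(project_id, task_id, collection_id, member_id):
    """Return the X1 flow as ordered (step, method, path) tuples.

    Assumption: `task_id` is the task whose Relations panel is used in step 3;
    the table does not say which of the two tasks owns the relation.
    """
    return [
        (1, "POST", "/api/workflows"),
        (2, "POST", "/api/tasks"),
        (3, "POST", f"/api/tasks/{task_id}/relations"),
        (4, "POST", f"/api/projects/{project_id}/collections"),
        (5, "PUT", f"/api/collections/{collection_id}/access-grants/{member_id}"),
        (6, "POST", "/api/documents"),  # body carries collection_id per the table
    ]


if __name__ == "__main__":
    for step, method, path in build_x1_plan("p1", "t2", "c1", "m1"):
        print(step, method, path)
```

In a real test the IDs would come from earlier responses (the workflow ID from step 1 feeds step 2, and so on), so the plan would be built incrementally rather than up front.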

X2. Agent-driven flow (same outcome via tools)

An agent with sufficient tool permissions can execute the entire scenario without human UI interaction:

| Step | Agent tool | Notes |
| --- | --- | --- |
| 1 | create_workflow | Requires project context; returns workflow ID used by step 2 |
| 2 | create_task | Binds task to the workflow from step 1; returns first task ID |
| 3 | add_task_relation | relation_type = blocks; source is the new task, target is the task from step 2 |
| 4 | create_collection | Creates the collection directly from the agent flow; the project must already exist |
| 5 | set_collection_access / remove_collection_access | New tools that close the gap so agents can manage collection members; backed by PUT/DELETE /api/collections/:id/access-grants/:member_id |
| 6 | write_collection | Creates document inside collection; requires prior query_collection call on the same collection (read-before-write gate enforced by CollectionReadTracker) |

X3. Coverage matrix

| Capability | REST | Frontend UI | Agent tool | E2E test |
| --- | --- | --- | --- | --- |
| Create workflow | Yes | Yes (Settings → Workflows) | Yes (create_workflow) | workflow-advanced.spec.ts |
| Create task + dependency | Yes | Yes (TaskCreateDialog / TaskDetail) | Yes (create_task, add_task_relation) | task-crud.spec.ts, task-advanced.spec.ts |
| Create collection | Yes | Yes (ProjectSetup) | Yes (create_collection) | collections.spec.ts, collection_tool_tests.rs |
| Define document users | Yes | Yes (ProjectSetup, limited) | Yes (set_collection_access, remove_collection_access; new) | collections.spec.ts |
| Create document in collection | Yes | Yes (Knowledge → Tree) | Yes (write_collection, gated by query_collection) | collections.spec.ts |

X4. Known gap

Collection creation is now exposed as create_collection, but collection writes still enforce a read-before-write discipline: agents must call query_collection or read_collection_doc on the target collection before write_collection / edit_collection_doc.
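
The read-before-write discipline described above can be modeled as a small tracker that records which collections have been read and refuses writes to any others. This is a toy sketch, not the actual CollectionReadTracker: the class name `ReadBeforeWriteGate`, the per-session scope, and the boolean return value are all assumptions made for illustration.

```python
READ_TOOLS = {"query_collection", "read_collection_doc"}
WRITE_TOOLS = {"write_collection", "edit_collection_doc"}


class ReadBeforeWriteGate:
    """Toy per-session tracker: a collection must be read before it is written."""

    def __init__(self):
        self._read_collections = set()

    def check(self, tool, collection_id):
        """Record reads; return True if the call is allowed, False if gated."""
        if tool in READ_TOOLS:
            self._read_collections.add(collection_id)
            return True
        if tool in WRITE_TOOLS:
            # A write passes only if this same collection was read earlier.
            return collection_id in self._read_collections
        return True  # tools outside the gate are not restricted by this sketch


if __name__ == "__main__":
    gate = ReadBeforeWriteGate()
    print(gate.check("write_collection", "c1"))  # gated: nothing read yet
    gate.check("query_collection", "c1")
    print(gate.check("write_collection", "c1"))  # allowed after the read
```

Tracking reads per collection rather than globally means a read on one collection does not unlock writes to another, which matches the "on the target collection" wording above.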