apflow v2 — Product Requirements Document

Positioning Update: The product has been repositioned from "AI Agent Production Middleware" to "AI-Perceivable Distributed Orchestration" (AP+Flow). apflow is a distributed task orchestration engine where every capability is AI-perceivable via the apcore module standard. This PRD retains original requirements and analysis. See README.md for current positioning.

Product: apflow v2 (AI-Perceivable Distributed Orchestration)
Version: 0.20.0 (MVP)
Author: apflow team
Date: 2026-03-28
Status: Draft

1. Executive Summary

One-line positioning: apflow v2 is framework-agnostic production middleware that makes AI agents reliable, cost-governed, and auditable — without replacing your agent framework.

The problem: 79% of enterprises are adopting AI agents, but only 11% of agents reach production. The gap is not in building agents (dozens of mature frameworks handle that) but in making them production-grade: durable execution, cost control, and governance.

The solution: apflow v2 sits between any agent framework and production infrastructure, providing durable execution (checkpoint/resume), cost governance (token budgets, model downgrade chains), and a unified module system via the apcore ecosystem. It does not compete with LangGraph, CrewAI, or any agent framework — it makes them production-ready.

Key differentiator: The apcore ecosystem (apcore, apcore-mcp, apcore-a2a, apcore-cli) — all stable, all available — gives apflow automatic protocol exposure (MCP, A2A) and CLI generation without building any of it from scratch. No competitor in the middleware space has this.

2. Problem Statement

Current Pain Points

The Demo-to-Production Gap. Building an AI agent demo takes days. Getting it to production takes months. The missing pieces are not intelligence but reliability: what happens when an agent fails mid-execution? What happens when token costs spike? What happens when you need an audit trail?
Framework Lock-in. LangGraph offers checkpointing — but only for LangGraph agents. CrewAI has its own execution model. OpenAI Agents SDK has its own. There is no framework-agnostic durable execution layer.
Cost Unpredictability. Token spend can run into the millions in enterprise settings. A single agentic loop with no cost guard can exhaust a monthly budget in hours. There is no mature, framework-agnostic cost governance solution.
Protocol Fragmentation. MCP (97M monthly SDK downloads) and A2A (100+ enterprise supporters) are both under the Linux Foundation. But integrating agents with these protocols requires significant custom wiring.

Why Existing Solutions Don't Solve This

Category	Examples	Gap
Agent Frameworks	LangGraph, CrewAI, OpenAI Agents SDK, Claude Agent SDK, Google ADK	Build agents, don't solve production reliability across frameworks
LLM Routing	LiteLLM, Portkey, Bifrost	Route API calls, don't manage agent execution lifecycle
Observability	Langfuse, LangSmith, AgentOps	Observe agents, don't control or recover them
Visual Orchestration	Dify, n8n, Flowise	Low-code builders, not middleware for existing code
Code Sandboxes	E2B, Modal, Daytona	Isolate execution, don't manage agent state or cost

What Happens If We Don't Build This

apflow v1 remains a task orchestration framework competing against massively funded platforms (Dify at 114k stars, n8n at 150k stars). It has no meaningful differentiation. The codebase carries ~17,500 lines of self-built protocol and CLI code that duplicates what apcore already does better.

3. Target Users

Primary Audience

Developers who are building AI agents with any framework and need to move them to production.

User Personas

Persona 1: The AI Application Developer ("Alex")

Building agentic features in a product (customer support bot, code assistant, data pipeline)
Uses LangGraph or CrewAI or OpenAI Agents SDK
Needs: reliable execution (retries, checkpoints), cost control, easy integration
Pain: agents fail silently, costs are unpredictable, no audit trail

Persona 2: The AI Platform Engineer ("Jordan")

Responsible for the internal AI platform at a mid-size company
Multiple teams use different agent frameworks
Needs: unified governance, cost policies across teams, protocol compliance (MCP/A2A)
Pain: each team reinvents reliability, no unified cost reporting

Persona 3: The Independent Developer ("Sam")

Solo or small team building AI-powered tools
Cannot afford to build production infrastructure from scratch
Needs: lightweight middleware that "just works," quick protocol exposure via MCP/A2A
Pain: wants to ship fast, not build infra

Jobs to Be Done

"I want my agent to recover from failures without losing progress."
"I want to set a token budget and know it will be enforced."
"I want my existing agent code exposed via MCP and A2A without rewriting anything."
"I want to see how much each agent run costs and who triggered it."

4. Product Vision & Positioning

Vision Statement

Make any AI agent production-ready — regardless of which framework built it — by providing the reliability, governance, and protocol layers that every agent needs but no framework provides.

What apflow IS and IS NOT

apflow IS	apflow IS NOT
Production middleware for AI agents	An agent framework (use LangGraph/CrewAI/etc.)
Framework-agnostic reliability layer	A replacement for your existing stack
Cost governance and budget enforcement	An LLM routing layer (use LiteLLM/Portkey)
A bridge to MCP and A2A protocols	An observability platform (use Langfuse)
Built on the apcore module standard	A visual workflow builder

Relationship to the apcore Ecosystem

+------------------------------------------------------------------+
|                        Protocol Layer                             |
|   apcore-mcp (MCP server)  |  apcore-a2a (A2A)  |  apcore-cli   |
+------------------------------------------------------------------+
|                        apcore (Module Standard)                   |
|   Registry  |  Executor  |  ACL  |  Middleware  |  Observability  |
+------------------------------------------------------------------+
|                        apflow v2 (This Product)                   |
|   Durable Execution  |  Cost Governance  |  Task Orchestration    |
+------------------------------------------------------------------+
|                        Agent Frameworks (Bring Your Own)          |
|   LangGraph  |  CrewAI  |  OpenAI SDK  |  Custom  |  Any         |
+------------------------------------------------------------------+

apflow v2 registers its capabilities as apcore Modules. The apcore ecosystem automatically exposes them via MCP, A2A, and CLI. apflow does not build or maintain any protocol servers.

Competitive Differentiation

Capability	apflow v2	LangGraph	Temporal + LLM	Custom build
Framework-agnostic	Yes	No (LangGraph only)	Partial	N/A
Durable execution	Yes	Yes (LangGraph only)	Yes (complex setup)	DIY
Cost governance	Yes (built-in)	No	No	DIY
MCP/A2A protocol exposure	Yes (via apcore)	No	No	DIY
CLI auto-generation	Yes (via apcore-cli)	No	No	DIY
Setup complexity	pip install	pip install	Significant infra	Months

5. Feature Requirements

P0 — MVP (v0.20.0): Must Have

F-001: Project Slimming

Description: Remove all self-built protocol layers, CLI, and extensions that are superseded by the apcore ecosystem or out of scope for v2.

What to delete (~19,938 lines, ~76 files):

Module	Lines	Files	Replacement
`api/a2a/`	~1,840	7	apcore-a2a
`api/mcp/`	~933	8	apcore-mcp
`api/graphql/`	~646	9	None (dropped)
`api/docs/`	~727	3	apcore-mcp Tool Explorer
`cli/`	~6,979	18	apcore-cli
`extensions/crewai/`	~1,747	4	Future: thin agent adapter
`extensions/llm/`	~251	2	Future: agent adapter
`extensions/llm_key_config/`	~158	2	Dropped (dead — no importers)
`extensions/generate/`	~3,651	8	Dropped (code generation)
`extensions/grpc/`	~314	2	Dropped
`extensions/tools/`	~473	3	Dropped
`extensions/ssh/`	~342	2	Dropped (out of core scope)
`extensions/docker/`	~394	2	Dropped (out of core scope)
`extensions/scrape/`	~117	2	Dropped (out of core scope)
`extensions/websocket/`	~277	2	Dropped (out of core scope)
`extensions/mcp/`	~523	2	Dropped (apcore-mcp replaces client usage)
`extensions/stdio/`	~566	2	Dropped (out of core scope)

What to update:

pyproject.toml: remove deleted extras (a2a, cli, graphql, crewai, llm, grpc, tools, standard, all); add apcore>=0.14.0, apcore-mcp>=0.10.1, apcore-a2a, apcore-cli>=0.3.0 as dependencies or extras
__init__.py: update description to "AI Agent Production Middleware", version to 0.20.0
Remove project.scripts entries (apflow, apflow-server)

Acceptance criteria:

All listed directories are deleted from the source tree.
pyproject.toml has no references to deleted modules.
pip install apflow succeeds with only core + apcore dependencies.
Remaining test suite passes (tests for deleted modules are also removed).
Total source line count drops from ~48.5k to ~25.2k.

Priority: P0
Estimated effort: 3-5 days

F-002: apcore Module Bridge

Description: Register apflow's capabilities as apcore Modules so that apcore-mcp, apcore-a2a, and apcore-cli can automatically expose them.

Requirements:

Create bridge/ module with apcore registration logic.
Adapt ExecutableTask to implement the apcore Module interface (schema-enforced input/output, metadata).
Register built-in executors as apcore Modules:
- REST executor -> apflow.rest module
- Email executor -> apflow.email module
- AggregateResults executor -> apflow.aggregate_results module
- ApFlowApi executor -> apflow.apflow_api_executor module
Register TaskRoutes handlers (create, execute, list, get, delete) as apcore Modules.
Initialize apcore Registry on apflow startup.

Acceptance criteria:

apcore-mcp can start and expose all registered modules as MCP tools.
apcore-a2a can start and expose registered modules as A2A skills.
apcore-cli can generate CLI commands for registered modules.
Each module has proper JSON Schema for inputs and outputs.
Module registration is declarative and requires no code changes in executors.

Priority: P0
Estimated effort: 5-8 days

F-003: Durable Agent Execution

Description: Enable checkpoint/resume for long-running agent tasks so that failures do not lose progress.

Requirements:

TaskModel extensions:

checkpoint_at: Optional[datetime] — when the last checkpoint was saved
resume_from: Optional[str] — checkpoint ID to resume from
attempt_count: int — number of execution attempts (default 0)
max_attempts: int — maximum retry attempts (default 3)
backoff_strategy: Optional[str] — "fixed", "exponential", "linear" (default "exponential")
backoff_base_seconds: float — base delay between retries (default 1.0)

ExecutableTask interface extensions:

supports_checkpoint() -> bool — whether this task type supports checkpointing
get_checkpoint() -> Optional[dict] — serialize current state
resume_from_checkpoint(checkpoint: dict) -> None — restore state from checkpoint
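
A minimal sketch of how a task might implement the three methods above. The `CountingTask` class, its `step()` method, and its fields are hypothetical and exist only for illustration; the real ExecutableTask base class is not shown here.

```python
from typing import Optional


class CountingTask:
    """Hypothetical task that processes items and can checkpoint its position."""

    def __init__(self, items: list[str]) -> None:
        self.items = items
        self.position = 0  # index of the next item to process
        self.results: list[str] = []

    def supports_checkpoint(self) -> bool:
        return True

    def get_checkpoint(self) -> Optional[dict]:
        # Only JSON-serializable values go into the checkpoint.
        return {"position": self.position, "results": list(self.results)}

    def resume_from_checkpoint(self, checkpoint: dict) -> None:
        self.position = checkpoint["position"]
        self.results = list(checkpoint["results"])

    def step(self) -> bool:
        """Process one item; return False when nothing is left."""
        if self.position >= len(self.items):
            return False
        self.results.append(self.items[self.position].upper())
        self.position += 1
        return True


# Simulate a failure after two steps, then resume in a fresh instance.
first = CountingTask(["a", "b", "c"])
first.step()
first.step()
saved = first.get_checkpoint()

second = CountingTask(["a", "b", "c"])
second.resume_from_checkpoint(saved)
while second.step():
    pass
```

The resumed instance continues from item three rather than repeating the first two, which is the behavior the acceptance criteria describe.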

Checkpoint storage:

Checkpoints stored in a separate task_checkpoints table (not on TaskModel) — allows large checkpoints, independent lifecycle, and full checkpoint history
Database migration 004 for new fields and table
Checkpoint data is JSON-serializable; binary data stored as base64
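
One possible way to keep checkpoint data JSON-serializable while still carrying binary values is sketched below. The `__b64__` wrapper key and both function names are assumptions made for this example, not a format defined by apflow.

```python
import base64
import json


def encode_checkpoint(state: dict) -> str:
    """Serialize a checkpoint to JSON, wrapping bytes values as base64 text."""

    def convert(value):
        if isinstance(value, bytes):
            return {"__b64__": base64.b64encode(value).decode("ascii")}
        if isinstance(value, dict):
            return {key: convert(item) for key, item in value.items()}
        if isinstance(value, list):
            return [convert(item) for item in value]
        return value

    return json.dumps(convert(state))


def decode_checkpoint(payload: str) -> dict:
    """Inverse of encode_checkpoint: restore wrapped base64 values to bytes."""

    def restore(value):
        if isinstance(value, dict):
            if set(value) == {"__b64__"}:
                return base64.b64decode(value["__b64__"])
            return {key: restore(item) for key, item in value.items()}
        if isinstance(value, list):
            return [restore(item) for item in value]
        return value

    return restore(json.loads(payload))
```

The encoded string can be stored in a text or JSON column of the task_checkpoints table on either SQLite or PostgreSQL.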

Retry logic:

On task failure, check attempt_count < max_attempts
Apply backoff strategy to calculate delay
Save checkpoint before retry if task supports it
Resume from checkpoint on retry if available
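
The retry decision and delay calculation could look roughly like this. The PRD names the three strategies but does not define their formulas, so the linear (base times attempt) and exponential (base doubling per attempt) definitions here are assumptions, as are the function names.

```python
def should_retry(attempt_count: int, max_attempts: int) -> bool:
    """Retry only while attempts remain."""
    return attempt_count < max_attempts


def backoff_delay(strategy: str, base_seconds: float, attempt: int) -> float:
    """Delay in seconds before retry number `attempt` (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    if strategy == "fixed":
        return base_seconds
    if strategy == "linear":
        return base_seconds * attempt
    if strategy == "exponential":
        return base_seconds * 2 ** (attempt - 1)
    raise ValueError(f"unknown backoff strategy: {strategy!r}")
```

With the defaults (`max_attempts=3`, `backoff_base_seconds=1.0`, exponential), this sketch would wait 1 s and then 2 s between attempts. A production version would likely add jitter and an upper bound on the delay.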

Circuit breaker:

Strengthen existing timeout capabilities
Add circuit_breaker_threshold: int — consecutive failures before circuit opens (default 5)
Add circuit_breaker_reset_seconds: float — time before circuit half-opens (default 60.0)
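
A minimal circuit breaker sketch using the two settings above. The class name, the state names, and the injectable `clock` parameter are assumptions for illustration; the existing timeout handling in apflow is not modeled.

```python
import time


class CircuitBreaker:
    """Opens after `threshold` consecutive failures; half-opens after a cooldown."""

    def __init__(self, threshold: int = 5, reset_seconds: float = 60.0,
                 clock=time.monotonic) -> None:
        self.threshold = threshold
        self.reset_seconds = reset_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # time the circuit opened, or None if closed

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_seconds:
            return "half_open"
        return "open"

    def allow(self) -> bool:
        """Closed and half-open circuits let an execution through."""
        return self.state != "open"

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        if self.state == "half_open":
            # The trial execution failed: reopen and restart the cooldown.
            self._opened_at = self._clock()
            return
        self._failures += 1
        if self._failures >= self.threshold:
            self._opened_at = self._clock()
```

Passing a fake `clock` makes the half-open transition testable without sleeping.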

Acceptance criteria:

A task that fails mid-execution can be retried and resumes from its last checkpoint.
Retry backoff follows the configured strategy (fixed, exponential, linear).
Circuit breaker prevents repeated execution of consistently failing tasks.
Checkpoint data persists across process restarts.
Database migration 004 applies cleanly on both SQLite and PostgreSQL.
Tasks that do not support checkpointing still benefit from retry logic (they restart from the beginning).

Priority: P0
Estimated effort: 8-12 days

F-004: Cost Governance

Description: Provide token budget management and cost policy enforcement for AI agent tasks.

Requirements:

TaskModel extensions:

token_usage: Optional[dict] — {"input": int, "output": int, "total": int}
token_budget: Optional[int] — maximum tokens allowed for this task
estimated_cost_usd: Optional[float] — pre-execution cost estimate
actual_cost_usd: Optional[float] — post-execution actual cost
cost_policy: Optional[str] — policy name to apply

New module: governance/

governance/budget.py:

TokenBudget dataclass: limit, used, remaining, scope (task/user/global)
BudgetManager: track usage, check limits, aggregate across scopes
Per-task budgets checked before and during execution
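
The budget types might be sketched as follows. Field names follow the list above; the method names (`set_budget`, `record_usage`, `can_spend`) and the choice to derive `remaining` from `limit` and `used` instead of storing it are assumptions for this example.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    limit: int
    used: int = 0
    scope: str = "task"  # "task", "user", or "global"

    @property
    def remaining(self) -> int:
        # Derived value, so it can never drift out of sync with `used`.
        return max(self.limit - self.used, 0)


class BudgetManager:
    """Tracks budgets keyed by (scope, identifier)."""

    def __init__(self) -> None:
        self._budgets: dict[tuple[str, str], TokenBudget] = {}

    def set_budget(self, scope: str, key: str, limit: int) -> None:
        self._budgets[(scope, key)] = TokenBudget(limit=limit, scope=scope)

    def record_usage(self, scope: str, key: str, tokens: int) -> TokenBudget:
        budget = self._budgets[(scope, key)]
        budget.used += tokens
        return budget

    def can_spend(self, scope: str, key: str, tokens: int) -> bool:
        """True if no budget is set or the request fits in what remains."""
        budget = self._budgets.get((scope, key))
        return budget is None or tokens <= budget.remaining
```

Calling `can_spend` before dispatch and `record_usage` after each LLM call is one way to cover both the "before" and "during" checks.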

governance/policy.py:

CostPolicy dataclass: name, action, threshold, downgrade_chain
Actions: block (reject task), downgrade (use cheaper model), notify (log warning, continue), continue (no action)
PolicyEngine: evaluate policies against current usage
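
A sketch of policy evaluation under one assumption the PRD leaves open: here `threshold` is read as a fraction of the budget (0.8 means 80% used). The `evaluate` signature is also hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CostPolicy:
    name: str
    action: str       # "block", "downgrade", "notify", or "continue"
    threshold: float  # assumed: fraction of the budget that triggers the action
    downgrade_chain: list[str] = field(default_factory=list)


class PolicyEngine:
    def __init__(self, policies: list[CostPolicy]) -> None:
        self._policies = {policy.name: policy for policy in policies}

    def evaluate(self, policy_name: str, used: int, limit: int) -> str:
        """Return the action to take for the current usage."""
        policy = self._policies[policy_name]
        # A non-positive limit is treated as fully consumed.
        ratio = used / limit if limit > 0 else 1.0
        if ratio < policy.threshold:
            return "continue"
        return policy.action
```

Looking policies up by name matches the `cost_policy: Optional[str]` field on TaskModel.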

governance/provider_router.py:

Model downgrade chains: configurable sequences (e.g., claude-opus-4-20250514 -> claude-sonnet-4-20250514 -> claude-3-5-haiku-20241022)
When budget threshold is hit and policy is downgrade, automatically select next model in chain
Provider-agnostic: works with any model identifier string
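
Selecting the next model in a chain can be as small as the sketch below. The function name and the placeholder model identifiers are hypothetical; returning `None` at the end of the chain (or for an unknown model) is an assumed convention that leaves the fallback decision to the caller.

```python
from typing import Optional


def next_model(chain: list[str], current: str) -> Optional[str]:
    """Return the model after `current` in the chain, or None if there is none."""
    if current not in chain:
        return None
    index = chain.index(current)
    if index + 1 < len(chain):
        return chain[index + 1]
    return None


# Placeholder identifiers; any model identifier string works the same way.
chain = ["model-large", "model-medium", "model-small"]
downgraded = next_model(chain, "model-large")
```

Because the router only compares strings, it stays provider-agnostic.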

governance/reporter.py:

Usage summary by task, user, time period
Export as JSON for integration with external dashboards
No built-in UI (use apcore-mcp Tool Explorer or external tools)
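
A usage summary could be aggregated and exported along these lines. The record shape reuses the `token_usage` and `actual_cost_usd` fields from the TaskModel extensions; the function names and the `group_by` parameter are assumptions for illustration.

```python
import json
from collections import defaultdict


def summarize_usage(records: list[dict], group_by: str) -> dict:
    """Aggregate token and cost totals per value of `group_by` (e.g. "user")."""
    totals = defaultdict(lambda: {"total_tokens": 0, "cost_usd": 0.0, "runs": 0})
    for record in records:
        bucket = totals[record[group_by]]
        bucket["total_tokens"] += record["token_usage"]["total"]
        bucket["cost_usd"] += record["actual_cost_usd"]
        bucket["runs"] += 1
    return dict(totals)


def export_json(summary: dict) -> str:
    """Serialize a summary for external dashboards."""
    return json.dumps(summary, indent=2, sort_keys=True)
```

Filtering by time period would happen before aggregation, for example by selecting records from the database with a date range.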

Acceptance criteria:

A task with token_budget=1000 is blocked (or downgraded, depending on its cost policy) once usage reaches the policy threshold for that budget.
Cost policies can be defined in configuration and applied by name.
Model downgrade chains work: when budget triggers downgrade, the next model in the chain is used.
Token usage is recorded on every task that involves LLM calls.
Usage reports can be generated for any time period.
All cost data persists in the database.

Priority: P0
Estimated effort: 8-10 days

F-005: TaskCreator Relaxation

Description: Remove the single-root constraint from TaskCreator, allowing multi-root task forests.

Requirements:

parent_id becomes truly optional (no default root creation)
Multiple independent task trees can coexist
All existing validation preserved: circular dependency detection, reference validation, duplicate ID detection
No changes to TaskManager execution logic (it already handles independent tasks)
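
The relaxed validation can be illustrated with a small sketch that accepts several roots while still rejecting duplicates, dangling references, and cycles. It models only the `parent_id` relation on plain dicts; the function name is hypothetical and the real TaskCreator checks are broader.

```python
from typing import Optional


def validate_forest(tasks: list[dict]) -> list[str]:
    """Return the IDs of all root tasks; raise ValueError on invalid input."""
    parents: dict[str, Optional[str]] = {}
    for task in tasks:
        if task["id"] in parents:
            raise ValueError(f"duplicate task id: {task['id']}")
        parents[task["id"]] = task.get("parent_id")

    for task_id, parent_id in parents.items():
        if parent_id is not None and parent_id not in parents:
            raise ValueError(f"task {task_id} references unknown parent {parent_id}")

    # Walk up each parent chain; revisiting a node means there is a cycle.
    for task_id in parents:
        seen = {task_id}
        current = parents[task_id]
        while current is not None:
            if current in seen:
                raise ValueError(f"circular dependency involving {current}")
            seen.add(current)
            current = parents[current]

    return [task_id for task_id, parent_id in parents.items() if parent_id is None]
```

An input with two tasks that have no `parent_id` yields two roots instead of an error, which is the behavioral change F-005 asks for.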

Acceptance criteria:

Tasks can be created without parent_id and exist as independent roots.
Multiple root tasks can coexist in the same session.
Circular dependency detection still works correctly.
Existing task creation patterns (with parent_id) continue to work unchanged.

Priority: P0
Estimated effort: 1-2 days

P1 — v0.21.0: Should Have

F-006: Agent Adapters

Description: Provide thin adapter layers that wrap popular agent frameworks as apflow executable tasks.

Requirements:

AgentAdapter Protocol: async execute(input: dict) -> dict, supports_checkpoint() -> bool, get_checkpoint() -> Optional[dict]
GenericAdapter: wraps any async callable as an agent adapter
LangGraphAdapter: wraps a LangGraph compiled graph, maps LangGraph checkpoints to apflow checkpoints
CrewAIAdapter: wraps a CrewAI crew (thin, not the old heavy extension)
OpenAIAgentsAdapter: wraps OpenAI Agents SDK runner
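
The adapter contract and the GenericAdapter could be sketched with `typing.Protocol` as below. Method signatures follow the requirement above; the `echo_agent` coroutine is a hypothetical stand-in for a real agent, and returning `token_usage` in the result mirrors the cost-reporting convention from the resolved open questions.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Protocol, runtime_checkable


@runtime_checkable
class AgentAdapter(Protocol):
    async def execute(self, input: dict) -> dict: ...
    def supports_checkpoint(self) -> bool: ...
    def get_checkpoint(self) -> Optional[dict]: ...


class GenericAdapter:
    """Wraps any async callable; it has no state, so it cannot checkpoint."""

    def __init__(self, fn: Callable[[dict], Awaitable[dict]]) -> None:
        self._fn = fn

    async def execute(self, input: dict) -> dict:
        return await self._fn(input)

    def supports_checkpoint(self) -> bool:
        return False

    def get_checkpoint(self) -> Optional[dict]:
        return None


async def echo_agent(payload: dict) -> dict:
    # Hypothetical agent: echoes its input and reports token usage.
    return {
        "echo": payload["text"],
        "token_usage": {"input": 3, "output": 3, "total": 6},
    }


adapter = GenericAdapter(echo_agent)
result = asyncio.run(adapter.execute({"text": "hi"}))
```

Using a structural Protocol means framework-specific adapters need no shared base class, which helps keep each one thin.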

Acceptance criteria:

Each adapter can execute its target framework's agent as an apflow task.
LangGraph adapter preserves LangGraph's native checkpoint data.
All adapters are optional dependencies (not required for core apflow).
Each adapter is <300 lines of code.

Priority: P1
Estimated effort: 5-8 days per adapter

F-007: TaskModel Agent Extensions

Description: Add agent-specific metadata fields to TaskModel.

Requirements:

agent_framework: Optional[str] — which framework produced this task ("langgraph", "crewai", "openai", etc.)
agent_model: Optional[str] — the LLM model used
agent_config: Optional[dict] — framework-specific configuration

Acceptance criteria:

Fields are optional and do not affect existing task behavior.
Database migration applies cleanly.
Fields are exposed via apcore module metadata.

Priority: P1
Estimated effort: 1-2 days

F-008: Observability Integration

Description: Integrate with Langfuse and OpenTelemetry for cost and usage reporting.

Requirements:

Langfuse integration: export cost/usage data from governance module to Langfuse traces
OpenTelemetry: export execution spans with task metadata
Both integrations are optional dependencies

Acceptance criteria:

Langfuse receives cost data for every LLM-involving task execution.
OpenTelemetry spans include task ID, execution time, token usage, and cost.
Neither integration is required for apflow to function.

Priority: P1
Estimated effort: 3-5 days

P2 — Future: Nice to Have

F-009: Advanced Cost Strategies

Description: Predictive cost analysis and cross-provider comparison.

Requirements:

Pre-execution cost prediction based on input size and model pricing
Cross-provider cost comparison (same prompt, different providers)
Historical cost analytics with trend detection

Acceptance criteria:

Cost predictions are within 20% of actual cost for standard workloads.
Historical analytics are queryable by time range, user, and task type.

Priority: P2
Estimated effort: 5-8 days

F-010: Multi-Agent Coordination

Description: Orchestrate agents across frameworks using A2A protocol.

Requirements:

Cross-framework agent orchestration: a LangGraph agent can delegate to a CrewAI agent via A2A
Agent result aggregation: combine outputs from multiple agents
Verification: cross-check agent outputs for consistency

Acceptance criteria:

Two agents from different frameworks can collaborate on a task via A2A.
Results from multiple agents can be aggregated into a single output.

Priority: P2
Estimated effort: 10-15 days

6. Technical Architecture

Architecture Layers

Layer 4: Protocol Exposure (apcore ecosystem — NOT built by apflow)
  ┌─────────────┬─────────────┬─────────────┐
  │ apcore-mcp  │ apcore-a2a  │ apcore-cli  │
  └─────────────┴─────────────┴─────────────┘

Layer 3: Module Standard (apcore — NOT built by apflow)
  ┌─────────────────────────────────────────────┐
  │ Registry │ Executor │ ACL │ Middleware │ Obs │
  └─────────────────────────────────────────────┘

Layer 2: apflow v2 (THIS PRODUCT)
  ┌──────────────┬──────────────┬───────────────┐
  │   Durable    │     Cost     │    Module      │
  │  Execution   │  Governance  │    Bridge      │
  ├──────────────┴──────────────┴───────────────┤
  │         Task Orchestration Engine            │
  │  TaskManager │ TaskCreator │ TaskExecutor    │
  ├─────────────────────────────────────────────┤
  │   Executors (REST, Email, Aggregate, Api)    │
  ├─────────────────────────────────────────────┤
  │         Storage (SQLite / PostgreSQL)         │
  └─────────────────────────────────────────────┘

Layer 1: Agent Frameworks (USER'S CHOICE — not built by apflow)
  ┌───────────┬─────────┬──────────┬────────────┐
  │ LangGraph │ CrewAI  │ OpenAI   │ Any async  │
  │           │         │ Agents   │ callable   │
  └───────────┴─────────┴──────────┴────────────┘

Code Inventory

What is deleted (~19,938 lines):

Module	Lines	Reason
`api/a2a/`	1,840	Replaced by apcore-a2a
`api/mcp/`	933	Replaced by apcore-mcp
`api/graphql/`	646	Dropped
`api/docs/`	727	Replaced by apcore-mcp Tool Explorer
`cli/`	6,979	Replaced by apcore-cli
`extensions/crewai/`	1,747	Replaced by thin adapter (P1)
`extensions/llm/`	251	Replaced by agent adapter (P1)
`extensions/llm_key_config/`	158	Dropped (dead — no importers)
`extensions/generate/`	3,651	Dropped (code generation out of scope)
`extensions/grpc/`	314	Dropped
`extensions/tools/`	473	Dropped
`extensions/ssh/`	342	Dropped (out of core scope)
`extensions/docker/`	394	Dropped (out of core scope)
`extensions/scrape/`	117	Dropped (out of core scope)
`extensions/websocket/`	277	Dropped (out of core scope)
`extensions/mcp/`	523	Dropped (apcore-mcp replaces client usage)
`extensions/stdio/`	566	Dropped (out of core scope)

What is preserved (~25,249 lines):

Module	Lines	Notes
`core/`	18,319	Task engine, models, executors, distributed runtime — the heart of apflow
`scheduler/`	2,028	Task scheduling
`extensions/http/`	346	HTTP/REST executor
`extensions/email/`	425	Email executor
`extensions/apflow/`	651	ApFlowApi executor
`extensions/core/`	234	AggregateResults executor
`extensions/hooks/`	153	Hook system
`extensions/storage/`	162	Storage extension
Other files	~2,885	`adapters/`, `app.py`, `bridge/`, `cli.py`, `durability/`, `governance/`, `logger.py`, etc.

What is new (estimated ~3,000-5,000 lines):

Module	Est. Lines	Description
`bridge/`	500-800	apcore Module registration
`governance/`	800-1,200	Budget, policy, provider router, reporter
`durability/`	600-1,000	Checkpoint, retry, circuit breaker logic
TaskModel extensions	200-300	New fields and migration
Tests	1,000-2,000	Unit tests for new modules

Dependency Changes

Removed from pyproject.toml:

a2a-sdk, fastapi, uvicorn, starlette, websockets (from a2a extra)
click, rich, typer, python-dotenv, nest_asyncio (from cli extra)
strawberry-graphql (from graphql extra)
crewai, litellm, anthropic (from crewai extra)
grpclib, protobuf (from grpc extra)

Added to pyproject.toml:

apcore>=0.14.0 (core dependency)
apcore-mcp>=0.10.1 (optional: MCP exposure)
apcore-a2a (optional: A2A exposure)
apcore-cli>=0.3.0 (optional: CLI generation)

7. Non-Functional Requirements

Requirement	Target	Measurement
Execution overhead	<50ms per task dispatch	Benchmark: 1000 no-op tasks
Checkpoint recovery	>95% success rate	Test: kill/resume 100 checkpointed tasks
Schema migrations	Backward compatible	Migration runs on existing v1 databases without data loss
Python version	>=3.11	Aligned with apcore requirement
Test coverage	>=90% on new code	`pytest --cov` on `bridge/`, `governance/`, `durability/`
Existing test adaptation	All preserved modules pass	Adapted tests for deleted imports/dependencies
Package size	Core install <5MB	`pip install apflow` without optional extras

8. Risks & Mitigations

Technical Risks

Risk	Likelihood	Impact	Mitigation
apcore Module interface does not fit all executor patterns	Medium	High	Prototype bridge with 2-3 executors first; negotiate interface changes with apcore if needed
Checkpoint serialization complexity (non-JSON-serializable state)	Medium	Medium	Restrict checkpoints to JSON-serializable data; provide base64 escape hatch for binary
SQLite migration compatibility	Low	High	Test migration 004 on both SQLite and PostgreSQL before release
Performance regression from apcore middleware pipeline	Low	Medium	Benchmark with and without apcore; bypass middleware for internal execution paths if needed

Product Risks

Risk	Likelihood	Impact	Mitigation
"Production middleware" positioning is too abstract for developers	Medium	High	Lead with concrete use cases: "add retries and cost limits to your LangGraph agent in 5 minutes"
Market moves to all-in-one platforms (LangGraph adds cost governance)	Medium	High	Stay framework-agnostic; the value is in not being locked in
apcore ecosystem has limited adoption, reducing network effects	Medium	Medium	Dogfood first; don't depend on external apcore adoption for v2 to be useful

Resource Risks

Risk	Likelihood	Impact	Mitigation
2-3 person team cannot deliver all P0 features	Medium	High	Strict phased delivery (alpha.1 -> alpha.2 -> beta.1); each phase is independently useful
Maintaining both v1 branch and v2	Low	Low	v1 branch is frozen; no maintenance commitment

9. Success Metrics

Phase 1: Dogfood (0.20.0 release)

apflow v2 successfully runs its own internal workflows with durable execution
Cost governance tracks and enforces budgets on real LLM tasks
apcore-mcp and apcore-a2a expose apflow modules without custom protocol code
Zero self-built protocol/CLI code remains in the codebase

Phase 2: Community (0.21.0+)

GitHub stars growth rate (baseline: current count at v2 launch)
PyPI monthly downloads (baseline: current at v2 launch)
At least 2 agent framework adapters validated by external users

Phase 3: Ecosystem (1.0.0)

3+ agent frameworks with validated adapters
External contributors submitting adapters or governance policies
Documented production deployments (even if small scale)

Note: Specific numeric targets are not set because this is an early-stage project. Growth rates matter more than absolute numbers.

10. Version Roadmap

Version	Codename	Scope	Features	Est. Duration
0.20.0-alpha.1	Phase 1: Slim & Bridge	Storage migration + slimming + apcore bridge	F-SM, F-001, F-002, F-005	2-3 weeks
0.20.0-alpha.2	Phase 2: Durability	Durable execution	F-003	2-3 weeks
0.20.0-beta.1	Phase 3: Governance	Cost governance	F-004	2-3 weeks
0.20.0	MVP Release	Stabilization, docs, dogfood	Bug fixes, documentation	1-2 weeks
0.21.0	Agent Adapters	Framework integration	F-006, F-007, F-008	4-6 weeks
1.0.0	Production-Ready	Stability, advanced features	F-009, F-010, hardening	TBD

Total estimated MVP timeline: 7-11 weeks (no hard deadline; quality first)

11. Open Questions

~~Checkpoint storage format:~~ RESOLVED. Checkpoints use a dedicated task_checkpoints table, which allows larger checkpoints, independent cleanup, and full checkpoint history. See tech-design Section 4.4 and feature spec durable-execution.md.
~~Cost data source:~~ RESOLVED. Executors report token usage via the execute() return dict (token_usage: {input, output, total}). This is framework-agnostic and requires no LiteLLM dependency. See tech-design Section 4.5 and feature spec cost-governance.md.
~~apcore Module granularity:~~ RESOLVED. Each executor is a separate apcore Module (e.g., apflow.rest_executor, apflow.send_email_executor). Task management operations are also separate modules (apflow.task.create, etc.). See tech-design Section 4.3 and feature spec apcore-bridge.md.
Budget scope hierarchy: Should per-task budgets roll up into per-user budgets automatically, or are they independent? Hierarchical budgets add complexity but are more useful for platform teams. Note: MVP implements per-task budgets only. Per-user aggregation is available via UsageReporter but not enforced as a budget.
~~Python version floor:~~ RESOLVED. apflow v2 requires >=3.11, aligned with apcore. This is a breaking change from v1's >=3.10, accepted for v2. See feature spec project-slimming.md.
~~Distributed runtime in v2:~~ RESOLVED. Checkpoints are stored in the shared database. When a worker fails, another worker loads the checkpoint via the existing lease mechanism. No changes to leader election or task leasing. See tech-design Section 4.4 "Interaction with Distributed Runtime".
Test migration strategy: ~185 test files exist for v1. How many can be adapted vs. need rewriting? Should we establish a test baseline before slimming?

12. Appendix

A. Market Research Data

Statistic	Source Context
79% enterprise AI agent adoption	Market landscape analysis
11% of AI agents in production	Market landscape analysis
97M monthly MCP SDK downloads	MCP ecosystem data
100+ enterprise A2A supporters	A2A protocol data
Dify: 114k GitHub stars	Open source project data
n8n: 150k GitHub stars	Open source project data
Langfuse: 6M+ SDK installs/month	Open source project data

B. Competitive Landscape

Player	Category	Strengths	Weakness (for our use case)
LangGraph	Agent framework	Mature, checkpointing, large community	Ecosystem-locked, no cost governance
CrewAI	Agent framework	Easy multi-agent, good DX	No durable execution, no cost control
OpenAI Agents SDK	Agent framework	OpenAI integration, handoffs	Vendor-locked, no checkpointing
Temporal	Workflow engine	Battle-tested durability	Not agent-aware, complex setup, no cost governance
Dify	Visual orchestration	114k stars, visual builder	Not middleware, not for existing code
Langfuse	Observability	6M installs, MIT license	Observe only, no control or recovery
LiteLLM	LLM routing	Wide provider support	Routing only, no execution management

C. Code Inventory Summary

Category	Lines	Files	Percentage
Deleted (protocol/CLI/extensions)	~17,561	~64	36% of codebase
Preserved (core/scheduler/executors)	~30,965	~121	64% of codebase
New (bridge/governance/durability)	~3,000-5,000 (est.)	~15-25 (est.)	New code
Post-MVP estimated total	~34,000-36,000	~140-150	—

D. Feature Summary by Priority

Priority	Count	Features
P0 (MVP)	6	F-SM (Storage Migration), F-001 through F-005
P1 (0.21.0)	3	F-006 through F-008
P2 (Future)	2	F-009, F-010
Total	11	—

Note: F-SM (DuckDB-to-SQLite Storage Migration) was decomposed from F-001 during technical design as a separate deliverable. See docs/features/storage-migration.md.
