Positioning Update: The product has been repositioned from "AI Agent Production Middleware" to "AI-Perceivable Distributed Orchestration" (AP+Flow). apflow is a distributed task orchestration engine where every capability is AI-perceivable via the apcore module standard. This PRD retains original requirements and analysis. See README.md for current positioning.
- Product: apflow v2 (AI-Perceivable Distributed Orchestration)
- Version: 0.20.0 (MVP)
- Author: apflow team
- Date: 2026-03-28
- Status: Draft
One-line positioning: apflow v2 is framework-agnostic production middleware that makes AI agents reliable, cost-governed, and auditable — without replacing your agent framework.
The problem: 79% of enterprises are adopting AI agents, but only 11% reach production. The gap is not in building agents — dozens of mature frameworks handle that — but in making them production-grade: durable execution, cost control, and governance.
The solution: apflow v2 sits between any agent framework and production infrastructure, providing durable execution (checkpoint/resume), cost governance (token budgets, model downgrade chains), and a unified module system via the apcore ecosystem. It does not compete with LangGraph, CrewAI, or any agent framework — it makes them production-ready.
Key differentiator: The apcore ecosystem (apcore, apcore-mcp, apcore-a2a, apcore-cli) — all stable, all available — gives apflow automatic protocol exposure (MCP, A2A) and CLI generation without building any of it from scratch. No competitor in the middleware space has this.
- The Demo-to-Production Gap. Building an AI agent demo takes days. Getting it to production takes months. The missing pieces are not intelligence but reliability: what happens when an agent fails mid-execution? What happens when token costs spike? What happens when you need an audit trail?
- Framework Lock-in. LangGraph offers checkpointing, but only for LangGraph agents. CrewAI has its own execution model. OpenAI Agents SDK has its own. There is no framework-agnostic durable execution layer.
- Cost Unpredictability. Token spend can run into the millions in enterprise settings. A single agentic loop with no cost guard can exhaust a monthly budget in hours. There is no mature, framework-agnostic cost governance solution.
- Protocol Fragmentation. MCP (97M monthly SDK downloads) and A2A (100+ enterprise supporters) are both under the Linux Foundation, but integrating agents with these protocols requires significant custom wiring.
| Category | Examples | Gap |
|---|---|---|
| Agent Frameworks | LangGraph, CrewAI, OpenAI Agents SDK, Claude Agent SDK, Google ADK | Build agents, don't solve production reliability across frameworks |
| LLM Routing | LiteLLM, Portkey, Bifrost | Route API calls, don't manage agent execution lifecycle |
| Observability | Langfuse, LangSmith, AgentOps | Observe agents, don't control or recover them |
| Visual Orchestration | Dify, n8n, Flowise | Low-code builders, not middleware for existing code |
| Code Sandboxes | E2B, Modal, Daytona | Isolate execution, don't manage agent state or cost |
apflow v1 remains a task orchestration framework competing against massively funded platforms (Dify at 114k stars, n8n at 150k stars). It has no meaningful differentiation. The codebase carries ~17,500 lines of self-built protocol and CLI code that duplicates what apcore already does better.
Developers who are building AI agents with any framework and need to move them to production.
Persona 1: The AI Application Developer ("Alex")
- Building agentic features in a product (customer support bot, code assistant, data pipeline)
- Uses LangGraph or CrewAI or OpenAI Agents SDK
- Needs: reliable execution (retries, checkpoints), cost control, easy integration
- Pain: agents fail silently, costs are unpredictable, no audit trail
Persona 2: The AI Platform Engineer ("Jordan")
- Responsible for the internal AI platform at a mid-size company
- Multiple teams use different agent frameworks
- Needs: unified governance, cost policies across teams, protocol compliance (MCP/A2A)
- Pain: each team reinvents reliability, no unified cost reporting
Persona 3: The Independent Developer ("Sam")
- Solo or small team building AI-powered tools
- Cannot afford to build production infrastructure from scratch
- Needs: lightweight middleware that "just works," quick protocol exposure via MCP/A2A
- Pain: wants to ship fast, not build infra
- "I want my agent to recover from failures without losing progress."
- "I want to set a token budget and know it will be enforced."
- "I want my existing agent code exposed via MCP and A2A without rewriting anything."
- "I want to see how much each agent run costs and who triggered it."
Make any AI agent production-ready — regardless of which framework built it — by providing the reliability, governance, and protocol layers that every agent needs but no framework provides.
| apflow IS | apflow IS NOT |
|---|---|
| Production middleware for AI agents | An agent framework (use LangGraph/CrewAI/etc.) |
| Framework-agnostic reliability layer | A replacement for your existing stack |
| Cost governance and budget enforcement | An LLM routing layer (use LiteLLM/Portkey) |
| A bridge to MCP and A2A protocols | An observability platform (use Langfuse) |
| Built on the apcore module standard | A visual workflow builder |
+------------------------------------------------------------------+
| Protocol Layer |
| apcore-mcp (MCP server) | apcore-a2a (A2A) | apcore-cli |
+------------------------------------------------------------------+
| apcore (Module Standard) |
| Registry | Executor | ACL | Middleware | Observability |
+------------------------------------------------------------------+
| apflow v2 (This Product) |
| Durable Execution | Cost Governance | Task Orchestration |
+------------------------------------------------------------------+
| Agent Frameworks (Bring Your Own) |
| LangGraph | CrewAI | OpenAI SDK | Custom | Any |
+------------------------------------------------------------------+
apflow v2 registers its capabilities as apcore Modules. The apcore ecosystem automatically exposes them via MCP, A2A, and CLI. apflow does not build or maintain any protocol servers.
| Capability | apflow v2 | LangGraph | Temporal + LLM | Custom build |
|---|---|---|---|---|
| Framework-agnostic | Yes | No (LangGraph only) | Partial | N/A |
| Durable execution | Yes | Yes (LangGraph only) | Yes (complex setup) | DIY |
| Cost governance | Yes (built-in) | No | No | DIY |
| MCP/A2A protocol exposure | Yes (via apcore) | No | No | DIY |
| CLI auto-generation | Yes (via apcore-cli) | No | No | DIY |
| Setup complexity | pip install | pip install | Significant infra | Months |
Description: Remove all self-built protocol layers, CLI, and extensions that are superseded by the apcore ecosystem or out of scope for v2.
What to delete (~19,938 lines, ~76 files):
| Module | Lines | Files | Replacement |
|---|---|---|---|
| `api/a2a/` | ~1,840 | 7 | apcore-a2a |
| `api/mcp/` | ~933 | 8 | apcore-mcp |
| `api/graphql/` | ~646 | 9 | None (dropped) |
| `api/docs/` | ~727 | 3 | apcore-mcp Tool Explorer |
| `cli/` | ~6,979 | 18 | apcore-cli |
| `extensions/crewai/` | ~1,747 | 4 | Future: thin agent adapter |
| `extensions/llm/` | ~251 | 2 | Future: agent adapter |
| `extensions/llm_key_config/` | ~158 | 2 | Dropped (dead, no importers) |
| `extensions/generate/` | ~3,651 | 8 | Dropped (code generation) |
| `extensions/grpc/` | ~314 | 2 | Dropped |
| `extensions/tools/` | ~473 | 3 | Dropped |
| `extensions/ssh/` | ~342 | 2 | Dropped (out of core scope) |
| `extensions/docker/` | ~394 | 2 | Dropped (out of core scope) |
| `extensions/scrape/` | ~117 | 2 | Dropped (out of core scope) |
| `extensions/websocket/` | ~277 | 2 | Dropped (out of core scope) |
| `extensions/mcp/` | ~523 | 2 | Dropped (apcore-mcp replaces client usage) |
| `extensions/stdio/` | ~566 | 2 | Dropped (out of core scope) |
What to update:
- `pyproject.toml`: remove deleted extras (`a2a`, `cli`, `graphql`, `crewai`, `llm`, `grpc`, `tools`, `standard`, `all`); add `apcore>=0.14.0`, `apcore-mcp>=0.10.1`, `apcore-a2a`, `apcore-cli>=0.3.0` as dependencies or extras
- `__init__.py`: update description to "AI Agent Production Middleware", version to `0.20.0`
- Remove `project.scripts` entries (`apflow`, `apflow-server`)
Acceptance criteria:
- All listed directories are deleted from the source tree.
- `pyproject.toml` has no references to deleted modules.
- `pip install apflow` succeeds with only core + apcore dependencies.
- Remaining test suite passes (tests for deleted modules are also removed).
- Total source line count drops from ~48.5k to ~25.2k.
Priority: P0 Estimated effort: 3-5 days
Description: Register apflow's capabilities as apcore Modules so that apcore-mcp, apcore-a2a, and apcore-cli can automatically expose them.
Requirements:
- Create `bridge/` module with apcore registration logic.
- Adapt `ExecutableTask` to implement the apcore Module interface (schema-enforced input/output, metadata).
- Register built-in executors as apcore Modules:
  - REST executor -> `apflow.rest` module
  - Email executor -> `apflow.email` module
  - AggregateResults executor -> `apflow.aggregate_results` module
  - ApFlowApi executor -> `apflow.apflow_api_executor` module
- Register TaskRoutes handlers (create, execute, list, get, delete) as apcore Modules.
- Initialize apcore Registry on apflow startup.
Acceptance criteria:
- `apcore-mcp` can start and expose all registered modules as MCP tools.
- `apcore-a2a` can start and expose registered modules as A2A skills.
- `apcore-cli` can generate CLI commands for registered modules.
- Each module has proper JSON Schema for inputs and outputs.
- Module registration is declarative and requires no code changes in executors.
Priority: P0 Estimated effort: 5-8 days
Description: Enable checkpoint/resume for long-running agent tasks so that failures do not lose progress.
Requirements:
TaskModel extensions:
- `checkpoint_at: Optional[datetime]`: when the last checkpoint was saved
- `resume_from: Optional[str]`: checkpoint ID to resume from
- `attempt_count: int`: number of execution attempts (default 0)
- `max_attempts: int`: maximum retry attempts (default 3)
- `backoff_strategy: Optional[str]`: "fixed", "exponential", "linear" (default "exponential")
- `backoff_base_seconds: float`: base delay between retries (default 1.0)
ExecutableTask interface extensions:
- `supports_checkpoint() -> bool`: whether this task type supports checkpointing
- `get_checkpoint() -> Optional[dict]`: serialize current state
- `resume_from_checkpoint(checkpoint: dict) -> None`: restore state from checkpoint
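
As a rough illustration of the three hooks above, here is a minimal sketch of a task that checkpoints its position in a batch. The `Checkpointable` Protocol name, the `BatchTask` class, and its `step()` method are hypothetical and only serve the example; the real `ExecutableTask` base class is not shown in this document.

```python
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class Checkpointable(Protocol):
    """The three checkpoint hooks listed above (Protocol name is hypothetical)."""

    def supports_checkpoint(self) -> bool: ...
    def get_checkpoint(self) -> Optional[dict]: ...
    def resume_from_checkpoint(self, checkpoint: dict) -> None: ...


class BatchTask:
    """Hypothetical task that processes items one by one and tracks its position."""

    def __init__(self, items: list[str]) -> None:
        self.items = items
        self.next_index = 0
        self.done: list[str] = []

    def supports_checkpoint(self) -> bool:
        return True

    def get_checkpoint(self) -> Optional[dict]:
        # Only JSON-serializable state goes into the checkpoint.
        return {"next_index": self.next_index, "done": list(self.done)}

    def resume_from_checkpoint(self, checkpoint: dict) -> None:
        self.next_index = checkpoint["next_index"]
        self.done = list(checkpoint["done"])

    def step(self) -> bool:
        """Process one item; return False when nothing is left."""
        if self.next_index >= len(self.items):
            return False
        self.done.append(self.items[self.next_index].upper())
        self.next_index += 1
        return True
```

A fresh `BatchTask` that calls `resume_from_checkpoint()` with a saved dict continues from `next_index` instead of reprocessing earlier items.
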
Checkpoint storage:
- Checkpoints are stored in a separate `task_checkpoints` table (not on TaskModel), which allows large checkpoints, independent lifecycle, and full checkpoint history
- Database migration 004 for new fields and table
- Checkpoint data is JSON-serializable; binary data stored as base64
Retry logic:
- On task failure, check `attempt_count < max_attempts`
- Apply backoff strategy to calculate delay
- Save checkpoint before retry if task supports it
- Resume from checkpoint on retry if available
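
The retry decision and delay calculation could look roughly like the sketch below. This document names the three strategies but does not define their formulas, so the formulas here (constant, `base * attempt`, `base * 2^(attempt-1)`) are assumed conventions, and the function names `backoff_delay` and `should_retry` are hypothetical.

```python
def backoff_delay(strategy: str, base_seconds: float, attempt: int) -> float:
    """Delay before retry number `attempt` (1-based). Formulas are assumptions."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    if strategy == "fixed":
        return base_seconds
    if strategy == "linear":
        return base_seconds * attempt
    if strategy == "exponential":
        return base_seconds * (2 ** (attempt - 1))
    raise ValueError(f"unknown backoff strategy: {strategy!r}")


def should_retry(attempt_count: int, max_attempts: int) -> bool:
    """Mirror of the `attempt_count < max_attempts` check above."""
    return attempt_count < max_attempts
```

A production version would likely add jitter and an upper bound on the delay; both are left out here to keep the sketch short.
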
Circuit breaker:
- Strengthen existing timeout capabilities
- Add `circuit_breaker_threshold: int`: consecutive failures before circuit opens (default 5)
- Add `circuit_breaker_reset_seconds: float`: time before circuit half-opens (default 60.0)
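
A minimal sketch of how the two circuit breaker settings might interact, assuming the usual closed / open / half-open state machine. The `CircuitBreaker` class, its method names, and the injectable `clock` parameter are hypothetical and are not taken from the apflow codebase.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Opens after `threshold` consecutive failures; half-opens after `reset_seconds`."""

    def __init__(
        self,
        threshold: int = 5,
        reset_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.threshold = threshold
        self.reset_seconds = reset_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_seconds:
            return "half_open"
        return "open"

    def allow(self) -> bool:
        """Execution is allowed unless the circuit is fully open."""
        return self.state != "open"

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        # A failed trial call in half-open state re-opens the circuit immediately.
        if self.state == "half_open" or self._failures >= self.threshold:
            self._opened_at = self._clock()
```

Passing a fake `clock` makes the reset window testable without sleeping.
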
Acceptance criteria:
- A task that fails mid-execution can be retried and resumes from its last checkpoint.
- Retry backoff follows the configured strategy (fixed, exponential, linear).
- Circuit breaker prevents repeated execution of consistently failing tasks.
- Checkpoint data persists across process restarts.
- Database migration 004 applies cleanly on both SQLite and PostgreSQL.
- Tasks that do not support checkpointing still benefit from retry logic (they restart from the beginning).
Priority: P0 Estimated effort: 8-12 days
Description: Provide token budget management and cost policy enforcement for AI agent tasks.
Requirements:
TaskModel extensions:
- `token_usage: Optional[dict]`: `{"input": int, "output": int, "total": int}`
- `token_budget: Optional[int]`: maximum tokens allowed for this task
- `estimated_cost_usd: Optional[float]`: pre-execution cost estimate
- `actual_cost_usd: Optional[float]`: post-execution actual cost
- `cost_policy: Optional[str]`: policy name to apply
New module: governance/
governance/budget.py:
- `TokenBudget` dataclass: `limit`, `used`, `remaining`, `scope` (task/user/global)
- `BudgetManager`: track usage, check limits, aggregate across scopes
- Per-task budgets checked before and during execution
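
One possible shape for these two types is sketched below. The field names come from the list above; everything else (making `remaining` a derived property, keying budgets by a `(scope, key)` pair, the method names `set_budget`, `can_spend`, and `record`) is an assumption for illustration.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    limit: int
    used: int = 0
    scope: str = "task"  # "task", "user", or "global"

    @property
    def remaining(self) -> int:
        # Derived rather than stored, so it cannot drift from limit/used.
        return max(self.limit - self.used, 0)


class BudgetManager:
    """Tracks budgets keyed by (scope, key), e.g. ("task", "task-123")."""

    def __init__(self) -> None:
        self._budgets: dict[tuple[str, str], TokenBudget] = {}

    def set_budget(self, scope: str, key: str, limit: int) -> TokenBudget:
        budget = TokenBudget(limit=limit, scope=scope)
        self._budgets[(scope, key)] = budget
        return budget

    def can_spend(self, scope: str, key: str, tokens: int) -> bool:
        """Pre-execution check; a missing budget means no limit applies."""
        budget = self._budgets.get((scope, key))
        return budget is None or tokens <= budget.remaining

    def record(self, scope: str, key: str, tokens: int) -> None:
        """Post-call accounting of tokens actually consumed."""
        budget = self._budgets.get((scope, key))
        if budget is not None:
            budget.used += tokens
```

Calling `can_spend()` before each LLM call and `record()` after it gives the "before and during execution" check described above.
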
governance/policy.py:
- `CostPolicy` dataclass: `name`, `action`, `threshold`, `downgrade_chain`
- Actions: `block` (reject task), `downgrade` (use cheaper model), `notify` (log warning, continue), `continue` (no action)
- `PolicyEngine`: evaluate policies against current usage
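
A minimal sketch of policy evaluation follows. This document does not say what unit `threshold` uses; the sketch assumes it is the fraction of the budget consumed (0.0 to 1.0) at which the action fires. The `evaluate` method name and its signature are also hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CostPolicy:
    name: str
    action: str  # "block", "downgrade", "notify", or "continue"
    threshold: float  # assumed: fraction of the budget used that triggers the action
    downgrade_chain: list[str] = field(default_factory=list)


class PolicyEngine:
    """Looks up a policy by name and decides which action applies right now."""

    def __init__(self, policies: list[CostPolicy]) -> None:
        self._policies = {p.name: p for p in policies}

    def evaluate(self, policy_name: str, used: int, limit: int) -> str:
        if limit <= 0:
            raise ValueError("limit must be positive")
        policy = self._policies[policy_name]
        # Below the threshold nothing happens; at or above it the policy's action applies.
        if used / limit >= policy.threshold:
            return policy.action
        return "continue"
```

Keeping policies addressable by name matches the `cost_policy: Optional[str]` field on TaskModel, which stores only the policy name.
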
governance/provider_router.py:
- Model downgrade chains: configurable sequences (e.g., `claude-opus-4-20250514` -> `claude-sonnet-4-20250514` -> `claude-haiku-4-20250514`)
- When budget threshold is hit and policy is `downgrade`, automatically select next model in chain
- Provider-agnostic: works with any model identifier string
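
Selecting the next model in a chain can be sketched as a plain list lookup. The function name `next_model` is hypothetical, and returning `None` when the current model is last in the chain or not in it at all is an assumed behavior. Model identifiers are treated as opaque strings, as the requirement above states.

```python
from typing import Optional


def next_model(chain: list[str], current: str) -> Optional[str]:
    """Return the model after `current` in `chain`, or None if there is none."""
    try:
        idx = chain.index(current)
    except ValueError:
        # The current model is not part of this chain, so no downgrade is known.
        return None
    if idx + 1 >= len(chain):
        # Already on the cheapest model in the chain.
        return None
    return chain[idx + 1]


# The example chain from the requirements, used as opaque identifier strings.
CHAIN = [
    "claude-opus-4-20250514",
    "claude-sonnet-4-20250514",
    "claude-haiku-4-20250514",
]
```

When `next_model` returns `None`, the caller still has to decide what to do (for example, fall back to `block` or `notify`); this document does not specify that case.
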
governance/reporter.py:
- Usage summary by task, user, time period
- Export as JSON for integration with external dashboards
- No built-in UI (use apcore-mcp Tool Explorer or external tools)
Acceptance criteria:
- A task with `token_budget=1000` is blocked (or downgraded) when usage approaches the limit.
- Cost policies can be defined in configuration and applied by name.
- Model downgrade chains work: when budget triggers downgrade, the next model in the chain is used.
- Token usage is recorded on every task that involves LLM calls.
- Usage reports can be generated for any time period.
- All cost data persists in the database.
Priority: P0 Estimated effort: 8-10 days
Description: Remove the single-root constraint from TaskCreator, allowing multi-root task forests.
Requirements:
- `parent_id` becomes truly optional (no default root creation)
- Multiple independent task trees can coexist
- All existing validation preserved: circular dependency detection, reference validation, duplicate ID detection
- No changes to TaskManager execution logic (it already handles independent tasks)
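
To make the multi-root requirement concrete, here is a rough sketch of forest validation over plain task dicts. It is not the TaskCreator implementation: the `find_roots` function is hypothetical, and it only checks `parent_id` links (duplicate IDs, unknown parents, circular parent chains), whereas the real validation may also cover other dependency fields.

```python
from typing import Optional


def find_roots(tasks: list[dict]) -> list[str]:
    """Validate parent links and return the IDs of all root tasks, in input order."""
    ids = [t["id"] for t in tasks]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate task id")

    parent: dict[str, Optional[str]] = {t["id"]: t.get("parent_id") for t in tasks}

    # Reference validation: every parent_id must point at a known task.
    for tid, pid in parent.items():
        if pid is not None and pid not in parent:
            raise ValueError(f"task {tid!r} references unknown parent {pid!r}")

    # Cycle detection: walk up from each task; revisiting a node means a cycle.
    for tid in parent:
        seen = {tid}
        cur = parent[tid]
        while cur is not None:
            if cur in seen:
                raise ValueError(f"circular parent chain involving {cur!r}")
            seen.add(cur)
            cur = parent[cur]

    # Any task without a parent is a root; several roots form a forest.
    return [tid for tid, pid in parent.items() if pid is None]
```

With the single-root constraint removed, a result containing more than one root is valid rather than an error.
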
Acceptance criteria:
- Tasks can be created without `parent_id` and exist as independent roots.
- Multiple root tasks can coexist in the same session.
- Circular dependency detection still works correctly.
- Existing task creation patterns (with parent_id) continue to work unchanged.
Priority: P0 Estimated effort: 1-2 days
Description: Provide thin adapter layers that wrap popular agent frameworks as apflow executable tasks.
Requirements:
- `AgentAdapter` Protocol: `async execute(input: dict) -> dict`, `supports_checkpoint() -> bool`, `get_checkpoint() -> Optional[dict]`
- `GenericAdapter`: wraps any `async callable` as an agent adapter
- `LangGraphAdapter`: wraps a LangGraph compiled graph, maps LangGraph checkpoints to apflow checkpoints
- `CrewAIAdapter`: wraps a CrewAI crew (thin, not the old heavy extension)
- `OpenAIAgentsAdapter`: wraps OpenAI Agents SDK runner
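
The Protocol and the generic wrapper could be sketched as follows. The three method signatures come from the list above; the constructor of `GenericAdapter`, its choice to report no checkpoint support, and the `echo_agent` function are assumptions made for the example.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Protocol


class AgentAdapter(Protocol):
    """Structural interface with the three methods named in the requirements."""

    async def execute(self, input: dict) -> dict: ...
    def supports_checkpoint(self) -> bool: ...
    def get_checkpoint(self) -> Optional[dict]: ...


class GenericAdapter:
    """Wraps any async callable taking and returning a dict."""

    def __init__(self, fn: Callable[[dict], Awaitable[dict]]) -> None:
        self._fn = fn

    async def execute(self, input: dict) -> dict:
        return await self._fn(input)

    def supports_checkpoint(self) -> bool:
        # A bare callable exposes no internal state to snapshot (assumption).
        return False

    def get_checkpoint(self) -> Optional[dict]:
        return None


async def echo_agent(payload: dict) -> dict:
    """Hypothetical stand-in for a real agent; the token counts are made up."""
    return {
        "echo": payload["text"],
        "token_usage": {"input": 3, "output": 3, "total": 6},
    }


if __name__ == "__main__":
    adapter: AgentAdapter = GenericAdapter(echo_agent)
    print(asyncio.run(adapter.execute({"text": "hello"})))
```

Because `AgentAdapter` is a `Protocol`, framework-specific adapters only need matching methods and do not have to inherit from a shared base class.
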
Acceptance criteria:
- Each adapter can execute its target framework's agent as an apflow task.
- LangGraph adapter preserves LangGraph's native checkpoint data.
- All adapters are optional dependencies (not required for core apflow).
- Each adapter is <300 lines of code.
Priority: P1 Estimated effort: 5-8 days per adapter
Description: Add agent-specific metadata fields to TaskModel.
Requirements:
- `agent_framework: Optional[str]`: which framework produced this task ("langgraph", "crewai", "openai", etc.)
- `agent_model: Optional[str]`: the LLM model used
- `agent_config: Optional[dict]`: framework-specific configuration
Acceptance criteria:
- Fields are optional and do not affect existing task behavior.
- Database migration applies cleanly.
- Fields are exposed via apcore module metadata.
Priority: P1 Estimated effort: 1-2 days
Description: Integrate with Langfuse and OpenTelemetry for cost and usage reporting.
Requirements:
- Langfuse integration: export cost/usage data from governance module to Langfuse traces
- OpenTelemetry: export execution spans with task metadata
- Both integrations are optional dependencies
Acceptance criteria:
- Langfuse receives cost data for every LLM-involving task execution.
- OpenTelemetry spans include task ID, execution time, token usage, and cost.
- Neither integration is required for apflow to function.
Priority: P1 Estimated effort: 3-5 days
Description: Predictive cost analysis and cross-provider comparison.
Requirements:
- Pre-execution cost prediction based on input size and model pricing
- Cross-provider cost comparison (same prompt, different providers)
- Historical cost analytics with trend detection
Acceptance criteria:
- Cost predictions are within 20% of actual cost for standard workloads.
- Historical analytics are queryable by time range, user, and task type.
Priority: P2 Estimated effort: 5-8 days
Description: Orchestrate agents across frameworks using A2A protocol.
Requirements:
- Cross-framework agent orchestration: a LangGraph agent can delegate to a CrewAI agent via A2A
- Agent result aggregation: combine outputs from multiple agents
- Verification: cross-check agent outputs for consistency
Acceptance criteria:
- Two agents from different frameworks can collaborate on a task via A2A.
- Results from multiple agents can be aggregated into a single output.
Priority: P2 Estimated effort: 10-15 days
Layer 4: Protocol Exposure (apcore ecosystem — NOT built by apflow)
┌─────────────┬─────────────┬─────────────┐
│ apcore-mcp │ apcore-a2a │ apcore-cli │
└─────────────┴─────────────┴─────────────┘
Layer 3: Module Standard (apcore — NOT built by apflow)
┌─────────────────────────────────────────────┐
│ Registry │ Executor │ ACL │ Middleware │ Obs │
└─────────────────────────────────────────────┘
Layer 2: apflow v2 (THIS PRODUCT)
┌──────────────┬──────────────┬───────────────┐
│ Durable │ Cost │ Module │
│ Execution │ Governance │ Bridge │
├──────────────┴──────────────┴───────────────┤
│ Task Orchestration Engine │
│ TaskManager │ TaskCreator │ TaskExecutor │
├─────────────────────────────────────────────┤
│ Executors (REST, Email, Aggregate, Api) │
├─────────────────────────────────────────────┤
│ Storage (SQLite / PostgreSQL) │
└─────────────────────────────────────────────┘
Layer 1: Agent Frameworks (USER'S CHOICE — not built by apflow)
┌───────────┬─────────┬──────────┬────────────┐
│ LangGraph │ CrewAI │ OpenAI │ Any async │
│ │ │ Agents │ callable │
└───────────┴─────────┴──────────┴────────────┘
What is deleted (~19,938 lines):
| Module | Lines | Reason |
|---|---|---|
| `api/a2a/` | 1,840 | Replaced by apcore-a2a |
| `api/mcp/` | 933 | Replaced by apcore-mcp |
| `api/graphql/` | 646 | Dropped |
| `api/docs/` | 727 | Replaced by apcore-mcp Tool Explorer |
| `cli/` | 6,979 | Replaced by apcore-cli |
| `extensions/crewai/` | 1,747 | Replaced by thin adapter (P1) |
| `extensions/llm/` | 251 | Replaced by agent adapter (P1) |
| `extensions/llm_key_config/` | 158 | Dropped (dead, no importers) |
| `extensions/generate/` | 3,651 | Dropped (code generation out of scope) |
| `extensions/grpc/` | 314 | Dropped |
| `extensions/tools/` | 473 | Dropped |
| `extensions/ssh/` | 342 | Dropped (out of core scope) |
| `extensions/docker/` | 394 | Dropped (out of core scope) |
| `extensions/scrape/` | 117 | Dropped (out of core scope) |
| `extensions/websocket/` | 277 | Dropped (out of core scope) |
| `extensions/mcp/` | 523 | Dropped (apcore-mcp replaces client usage) |
| `extensions/stdio/` | 566 | Dropped (out of core scope) |
What is preserved (~25,249 lines):
| Module | Lines | Notes |
|---|---|---|
| `core/` | 18,319 | Task engine, models, executors, distributed runtime (the heart of apflow) |
| `scheduler/` | 2,028 | Task scheduling |
| `extensions/http/` | 346 | HTTP/REST executor |
| `extensions/email/` | 425 | Email executor |
| `extensions/apflow/` | 651 | ApFlowApi executor |
| `extensions/core/` | 234 | AggregateResults executor |
| `extensions/hooks/` | 153 | Hook system |
| `extensions/storage/` | 162 | Storage extension |
| Other files | ~2,885 | adapters/, app.py, bridge/, cli.py, durability/, governance/, logger.py, etc. |
What is new (estimated ~3,000-5,000 lines):
| Module | Est. Lines | Description |
|---|---|---|
| `bridge/` | 500-800 | apcore Module registration |
| `governance/` | 800-1,200 | Budget, policy, provider router, reporter |
| `durability/` | 600-1,000 | Checkpoint, retry, circuit breaker logic |
| TaskModel extensions | 200-300 | New fields and migration |
| Tests | 1,000-2,000 | Unit tests for new modules |
Removed from pyproject.toml:
- `a2a-sdk`, `fastapi`, `uvicorn`, `starlette`, `websockets` (from a2a extra)
- `click`, `rich`, `typer`, `python-dotenv`, `nest_asyncio` (from cli extra)
- `strawberry-graphql` (from graphql extra)
- `crewai`, `litellm`, `anthropic` (from crewai extra)
- `grpclib`, `protobuf` (from grpc extra)
Added to pyproject.toml:
- `apcore>=0.14.0` (core dependency)
- `apcore-mcp>=0.10.1` (optional: MCP exposure)
- `apcore-a2a` (optional: A2A exposure)
- `apcore-cli>=0.3.0` (optional: CLI generation)
| Requirement | Target | Measurement |
|---|---|---|
| Execution overhead | <50ms per task dispatch | Benchmark: 1000 no-op tasks |
| Checkpoint recovery | >95% success rate | Test: kill/resume 100 checkpointed tasks |
| Schema migrations | Backward compatible | Migration runs on existing v1 databases without data loss |
| Python version | >=3.11 | Aligned with apcore requirement |
| Test coverage | >=90% on new code | pytest --cov on bridge/, governance/, durability/ |
| Existing test adaptation | All preserved modules pass | Adapted tests for deleted imports/dependencies |
| Package size | Core install <5MB | pip install apflow without optional extras |
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| apcore Module interface does not fit all executor patterns | Medium | High | Prototype bridge with 2-3 executors first; negotiate interface changes with apcore if needed |
| Checkpoint serialization complexity (non-JSON-serializable state) | Medium | Medium | Restrict checkpoints to JSON-serializable data; provide base64 escape hatch for binary |
| SQLite migration compatibility | Low | High | Test migration 004 on both SQLite and PostgreSQL before release |
| Performance regression from apcore middleware pipeline | Low | Medium | Benchmark with and without apcore; bypass middleware for internal execution paths if needed |
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| "Production middleware" positioning is too abstract for developers | Medium | High | Lead with concrete use cases: "add retries and cost limits to your LangGraph agent in 5 minutes" |
| Market moves to all-in-one platforms (LangGraph adds cost governance) | Medium | High | Stay framework-agnostic; the value is in not being locked in |
| apcore ecosystem has limited adoption, reducing network effects | Medium | Medium | Dogfood first; don't depend on external apcore adoption for v2 to be useful |
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| 2-3 person team cannot deliver all P0 features | Medium | High | Strict phased delivery (alpha.1 -> alpha.2 -> beta.1); each phase is independently useful |
| Maintaining both v1 branch and v2 | Low | Low | v1 branch is frozen; no maintenance commitment |
- apflow v2 successfully runs its own internal workflows with durable execution
- Cost governance tracks and enforces budgets on real LLM tasks
- apcore-mcp and apcore-a2a expose apflow modules without custom protocol code
- Zero self-built protocol/CLI code remains in the codebase
- GitHub stars growth rate (baseline: current count at v2 launch)
- PyPI monthly downloads (baseline: current at v2 launch)
- At least 2 agent framework adapters validated by external users
- 3+ agent frameworks with validated adapters
- External contributors submitting adapters or governance policies
- Documented production deployments (even if small scale)
Note: Specific numeric targets are not set because this is an early-stage project. Growth rates matter more than absolute numbers.
| Version | Codename | Scope | Features | Est. Duration |
|---|---|---|---|---|
| 0.20.0-alpha.1 | Phase 1: Slim & Bridge | Storage migration + slimming + apcore bridge | F-SM, F-001, F-002, F-005 | 2-3 weeks |
| 0.20.0-alpha.2 | Phase 2: Durability | Durable execution | F-003 | 2-3 weeks |
| 0.20.0-beta.1 | Phase 3: Governance | Cost governance | F-004 | 2-3 weeks |
| 0.20.0 | MVP Release | Stabilization, docs, dogfood | Bug fixes, documentation | 1-2 weeks |
| 0.21.0 | Agent Adapters | Framework integration | F-006, F-007, F-008 | 4-6 weeks |
| 1.0.0 | Production-Ready | Stability, advanced features | F-009, F-010, hardening | TBD |
Total estimated MVP timeline: 7-11 weeks (no hard deadline; quality first)
- Checkpoint storage format: RESOLVED. Checkpoints use a dedicated `task_checkpoints` table. A dedicated table allows larger checkpoints, independent cleanup, and full checkpoint history. See tech-design Section 4.4 and feature spec `durable-execution.md`.
- Cost data source: RESOLVED. Executors report token usage via the `execute()` return dict (`token_usage: {input, output, total}`). This is framework-agnostic and requires no LiteLLM dependency. See tech-design Section 4.5 and feature spec `cost-governance.md`.
- apcore Module granularity: RESOLVED. Each executor is a separate apcore Module (e.g., `apflow.rest_executor`, `apflow.send_email_executor`). Task management operations are also separate modules (`apflow.task.create`, etc.). See tech-design Section 4.3 and feature spec `apcore-bridge.md`.
- Budget scope hierarchy: Should per-task budgets roll up into per-user budgets automatically, or are they independent? Hierarchical budgets add complexity but are more useful for platform teams. Note: MVP implements per-task budgets only. Per-user aggregation is available via `UsageReporter` but not enforced as a budget.
- Python version floor: RESOLVED. apflow v2 requires `>=3.11`, aligned with apcore. This is a breaking change from v1's `>=3.10`, accepted for v2. See feature spec `project-slimming.md`.
- Distributed runtime in v2: RESOLVED. Checkpoints are stored in the shared database. When a worker fails, another worker loads the checkpoint via the existing lease mechanism. No changes to leader election or task leasing. See tech-design Section 4.4 "Interaction with Distributed Runtime".
- Test migration strategy: ~185 test files exist for v1. How many can be adapted vs. need rewriting? Should we establish a test baseline before slimming?
| Statistic | Source Context |
|---|---|
| 79% enterprise AI agent adoption | Market landscape analysis |
| 11% of AI agents in production | Market landscape analysis |
| 97M monthly MCP SDK downloads | MCP ecosystem data |
| 100+ enterprise A2A supporters | A2A protocol data |
| Dify: 114k GitHub stars | Open source project data |
| n8n: 150k GitHub stars | Open source project data |
| Langfuse: 6M+ SDK installs/month | Open source project data |
| Player | Category | Strengths | Weakness (for our use case) |
|---|---|---|---|
| LangGraph | Agent framework | Mature, checkpointing, large community | Ecosystem-locked, no cost governance |
| CrewAI | Agent framework | Easy multi-agent, good DX | No durable execution, no cost control |
| OpenAI Agents SDK | Agent framework | OpenAI integration, handoffs | Vendor-locked, no checkpointing |
| Temporal | Workflow engine | Battle-tested durability | Not agent-aware, complex setup, no cost governance |
| Dify | Visual orchestration | 114k stars, visual builder | Not middleware, not for existing code |
| Langfuse | Observability | 6M installs, MIT license | Observe only, no control or recovery |
| LiteLLM | LLM routing | Wide provider support | Routing only, no execution management |
| Category | Lines | Files | Percentage |
|---|---|---|---|
| Deleted (protocol/CLI/extensions) | ~17,561 | ~64 | 36% of codebase |
| Preserved (core/scheduler/executors) | ~30,965 | ~121 | 64% of codebase |
| New (bridge/governance/durability) | ~3,000-5,000 (est.) | ~15-25 (est.) | New code |
| Post-MVP estimated total | ~34,000-36,000 | ~140-150 | — |
| Priority | Count | Features |
|---|---|---|
| P0 (MVP) | 6 | F-SM (Storage Migration), F-001 through F-005 |
| P1 (0.21.0) | 3 | F-006 through F-008 |
| P2 (Future) | 2 | F-009, F-010 |
| Total | 11 | — |
Note: F-SM (DuckDB-to-SQLite Storage Migration) was decomposed from F-001 during technical design as a separate deliverable. See docs/features/storage-migration.md.