Agent-Field/agentfield

AgentField — The AI Backend

Build agents like APIs. Run ten thousand of them like microservices.

One request fans out to thousands of agents. The control plane queues, retries, and traces every branch.

Docs · Quick Start · Python SDK · Go SDK · TypeScript SDK · REST API · Examples · Discord

AgentField is an open-source control plane that lets you build AI agents callable by any service in your stack - frontends, backends, other agents, cron jobs - just like any other API. You write agent logic in Python, Go, or TypeScript. AgentField turns it into production infrastructure: routing, coordination, memory, async execution, and observability. Every function becomes a REST endpoint, and the same code scales from one agent on your laptop to ten thousand in a single workflow: the control plane handles the fan-out, the queues, and the retries.

One prompt → a running, containerized, production-ready multi-agent backend. No glue code; start calling the agent API right away.

Build production agents with a prompt.

Describe the system in one line. Get a production-ready multi-agent backend. Works in Claude Code, Codex, Gemini CLI, OpenCode, Aider, Windsurf, and Cursor.

curl -fsSL https://agentfield.ai/install.sh | bash

Then, in your coding agent, paste any spec after /agentfield:

/agentfield Build a claims processor with risk scoring, pattern detection,
and human approval for low-confidence decisions.

You get a Docker Compose stack wired up end-to-end: the agent, the control plane, and a production-ready REST API endpoint you can curl from a terminal to try it. See it in action →

The DX you get

Plain Python (or Go / TypeScript) functions. No DSL, no YAML, no graph wiring.

import asyncio
from agentfield import Agent, AIConfig
from pydantic import BaseModel

app = Agent(
    node_id="researcher",
    version="1.0.0",  # Canary deploys, A/B testing, blue-green rollouts
    ai_config=AIConfig(model="anthropic/claude-sonnet-4-20250514"),
)

class SubQuestions(BaseModel):
    questions: list[str]

@app.reasoner(tags=["research"])
async def research(question: str, depth: int = 0, model: str | None = None) -> dict:

    if depth >= 3:  # depth cap keeps fan-out bounded
        answer = await app.ai(system="Answer directly and concisely.", user=question, model=model)
        return {"question": question, "answer": answer}

    # Break the question into sub-questions
    plan = await app.ai(
        system="Break this into 3-5 independent sub-questions.",
        user=question, schema=SubQuestions, model=model,
    )

    # Fan out: each sub-question recurses on this same agent, through the control plane
    branches = await asyncio.gather(*[
        app.call(f"{app.node_id}.research", question=q, depth=depth + 1, model=model)
        for q in plan.questions
    ])

    # Synthesize the branches back into one answer
    synthesis = await app.ai(system="Synthesize these findings.", user=str(branches), model=model)
    return {"question": question, "answer": synthesis, "branches": branches}

app.run()
# This single line exposes: POST /api/v1/execute/researcher.research
# One request fans out to thousands of agents. The control plane queues, retries, and traces
# every branch. No broker, no queue setup, no timeout.

What you just saw: app.ai() calls an LLM and returns structured output. app.call() routes to other agents (or back to itself) through the control plane, so recursion becomes distributed fan-out. asyncio.gather() runs every branch in parallel. app.run() auto-exposes everything as REST. Read the full docs →
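
As a rough check on how the depth cap bounds the fan-out in the sample above, the sketch below counts invocations with plain asyncio. It does not use the AgentField SDK: `max_invocations` and `simulate` are hypothetical helpers, and the branching factor of 5 is the upper end of the "3-5 sub-questions" prompt.

```python
import asyncio


def max_invocations(branching: int, depth_cap: int) -> int:
    """Upper bound on reasoner invocations: the root plus every level down to depth_cap."""
    return sum(branching ** level for level in range(depth_cap + 1))


async def simulate(depth: int, branching: int, depth_cap: int) -> int:
    """Stand-in for research(): counts calls instead of invoking an LLM or the control plane."""
    if depth >= depth_cap:
        return 1  # leaf: answers directly, no further fan-out
    counts = await asyncio.gather(*[
        simulate(depth + 1, branching, depth_cap) for _ in range(branching)
    ])
    return 1 + sum(counts)


worst_case = max_invocations(branching=5, depth_cap=3)
simulated = asyncio.run(simulate(0, branching=5, depth_cap=3))
print(worst_case, simulated)
```

The bound grows geometrically, so raising either the branching factor or the depth cap changes the load quickly: `max_invocations(10, 4)` is 11,111.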

Need approvals, audit trails, and governance? (the enterprise sample)
from agentfield import Agent, AIConfig
from pydantic import BaseModel

app = Agent(
    node_id="claims-processor",
    version="2.1.0",  # Canary deploys, A/B testing, blue-green rollouts
    ai_config=AIConfig(model="anthropic/claude-sonnet-4-20250514"),
)

class Decision(BaseModel):
    action: str  # "approve", "deny", "escalate"
    confidence: float
    reasoning: str

@app.reasoner(tags=["insurance", "critical"])
async def evaluate_claim(claim: dict) -> dict:

    # Structured AI judgment - returns typed Pydantic output
    decision = await app.ai(
        system="Insurance claims adjuster. Evaluate and decide.",
        user=f"Claim #{claim['id']}: {claim['description']}",
        schema=Decision,
    )

    if decision.confidence < 0.85:
        # Human approval - suspends execution, notifies via webhook, resumes when approved
        await app.pause(
            approval_request_id=f"claim-{claim['id']}",
            approval_request_url=f"https://internal.acme.com/approvals/claim-{claim['id']}",
            expires_in_hours=48,
        )

    # Route to the next agent - traced through the control plane
    await app.call("notifier.send_decision", input={
        "claim_id": claim["id"],
        "decision": decision.model_dump(),
    })

    return decision.model_dump()

app.run()
# This single line exposes: POST /api/v1/execute/claims-processor.evaluate_claim
# The agent auto-registers with the control plane, gets a cryptographic identity, and every
# execution produces a verifiable, tamper-proof audit trail.

What you just saw: app.ai() calls an LLM and returns structured output. app.pause() suspends for human approval. app.call() routes to other agents through the control plane. app.run() auto-exposes everything as REST. Read the full docs →

Prefer to scaffold by hand? (Python / Go / TypeScript / Docker)
# Python
af init my-agent --defaults                            # Scaffold agent
cd my-agent && pip install -r requirements.txt
af server          # Terminal 1 → Dashboard at http://localhost:8080
python main.py     # Terminal 2 → Agent auto-registers

# Call your agent
curl -X POST http://localhost:8080/api/v1/execute/my-agent.demo_echo \
  -H "Content-Type: application/json" \
  -d '{"input": {"message": "Hello!"}}'

# Go
af init my-agent --defaults --language go && cd my-agent && go run .

# TypeScript
af init my-agent --defaults --language typescript && cd my-agent && npm install && npm run dev

# Docker (control plane only)
docker run -p 8080:8080 agentfield/control-plane:latest

Deployment guide → for Docker Compose, Kubernetes, and production setups.
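
For callers that are services rather than terminals, the curl call in the scaffold steps can be sketched in Python with only the standard library. The endpoint path and the `{"input": ...}` body come from that curl example; `build_execute_request` is a hypothetical helper, not part of the SDK, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_execute_request(base_url: str, target: str, payload: dict) -> urllib.request.Request:
    """Build the same POST the curl example sends: /api/v1/execute/{agent}.{func}."""
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/execute/{target}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_execute_request("http://localhost:8080", "my-agent.demo_echo", {"message": "Hello!"})
print(req.get_method(), req.full_url)

# With `af server` and the agent running, send it like this:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```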

See it in action

AgentField Dashboard
Real-time workflow DAGs · Execution traces · Agent fleet management · Audit trails

How AgentField fits in your stack

Most agent tools help you write agent logic. AgentField is what runs it in production: the layer that makes agents callable by other software, durable across failures, and observable when one request fans out to a thousand branches. Keep the framework you already use for authoring; a reasoner is a plain function, so existing LangGraph or CrewAI code can run inside one.

Compared against:

  • Frameworks: LangChain · CrewAI · PydanticAI · OpenAI Agents SDK
  • Workflow engines: Temporal · Airflow
  • Visual builders: n8n · Zapier
  • AgentField

Capabilities compared:

  • Build agent logic (prompts, tools, structured output)
  • Prebuilt chains, retrievers, integrations
  • Production REST APIs out of the box
  • Async + retries + webhooks
  • Memory scopes (global · agent · session · run)
  • Service discovery + cross-agent calls
  • Distributed agents (register from anywhere, one mesh)
  • Coding agents as functions (Claude Code · Codex · CLI)
  • Agent identity, access policies, signed audit trails
  • Fleet observability (DAGs · metrics · traces)
  • Multi-language SDKs (Python · Go · TypeScript)

● full · ◐ partial · — not the focus

Prototype in whatever you like. The moment a second service needs to call your agent, put it on AgentField. That is the point where you would otherwise start writing queues, retries, discovery, and tracing by hand.

Full comparison & decision guide →

How it scales

The control plane is a stateless Go service. Put more replicas behind a load balancer and capacity grows horizontally. Work lands in a durable PostgreSQL queue with lease-based processing, so after a crash or restart a job resumes where it left off instead of being dropped.
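
To make "lease-based processing" concrete, here is a minimal in-memory sketch of the idea. It is not AgentField's implementation (which the text says is backed by PostgreSQL); `Job`, `LeaseQueue`, and the 30-second lease are all hypothetical. A worker claims a job for a limited time, and if it crashes without completing it, the lease expires and another worker can claim the same job.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    lease_expires_at: float = 0.0  # 0.0 means nobody holds a lease yet
    done: bool = False


class LeaseQueue:
    def __init__(self, lease_seconds: float) -> None:
        self.lease_seconds = lease_seconds
        self.jobs: dict[str, Job] = {}

    def enqueue(self, job_id: str) -> None:
        self.jobs[job_id] = Job(job_id)

    def claim(self, now: float) -> Job | None:
        """Lease the first unfinished job whose lease is absent or has expired."""
        for job in self.jobs.values():
            if not job.done and job.lease_expires_at <= now:
                job.lease_expires_at = now + self.lease_seconds
                return job
        return None

    def complete(self, job_id: str) -> None:
        self.jobs[job_id].done = True


queue = LeaseQueue(lease_seconds=30)
queue.enqueue("job-1")
first = queue.claim(now=0.0)     # worker A takes the job, then crashes
blocked = queue.claim(now=10.0)  # lease still held, so nothing to claim
retried = queue.claim(now=31.0)  # lease expired, so worker B picks it up
queue.complete("job-1")
after_done = queue.claim(now=100.0)
```

In a database-backed queue the claim step would have to be a single atomic statement so that two workers cannot lease the same row; the loop above skips that concern because it runs in one thread.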

Property | What it means
Stateless Go control plane | Horizontal scaling behind a load balancer. Add replicas to add capacity.
Durable PostgreSQL queue | Lease-based processing. Jobs survive crashes and restarts.
Async execution | Webhooks and SSE, no timeout limits. A single run can go for hours or days.
Backpressure | Queue-depth limits and circuit breakers keep a fan-out from overwhelming downstream agents.
Routing overhead | Roughly 100-200ms per cross-agent hop. It matters when a branch does little work per hop, so keep hops coarse when latency is tight.

Two examples already run at this load. The deep-research engine fanned out 10,000+ agent invocations in one workflow. The security auditor runs 250 coordinated agents per audit.
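
The routing-overhead figure is easier to reason about with numbers. The sketch below is back-of-the-envelope arithmetic only: it takes the 100-200ms per hop range from the table, and assumes that branches at the same level run concurrently, so only the hops along the deepest path add to end-to-end latency. Both helper names are hypothetical.

```python
def routing_overhead_ms(hops_on_critical_path: int, per_hop_ms: tuple = (100, 200)) -> tuple:
    """Low and high estimate of added latency along the deepest chain of cross-agent calls."""
    low, high = per_hop_ms
    return hops_on_critical_path * low, hops_on_critical_path * high


def overhead_share(work_ms: float, hop_ms: float) -> float:
    """Fraction of one hop's wall time spent on routing rather than on the branch's own work."""
    return hop_ms / (hop_ms + work_ms)


depth_three = routing_overhead_ms(3)    # three nested hops
tiny_branch = overhead_share(50, 150)   # 50ms of work behind a 150ms hop
llm_branch = overhead_share(30_000, 150)  # a 30s LLM call behind the same hop
print(depth_three, tiny_branch, round(llm_branch, 4))
```

A branch that does 50ms of work spends most of its time in routing, while a branch wrapping a long LLM call barely notices the hop, which is the reason for keeping hops coarse.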

What You Get

Build - Python, Go, or TypeScript. Every function becomes a REST endpoint.

  • Reasoners & Skills - @app.reasoner() for AI judgment, @app.skill() for deterministic code
  • Structured AI - app.ai(schema=MyModel) → typed Pydantic/Zod output from any LLM
  • Harness - app.harness("Fix the bug") dispatches multi-turn tasks to Claude Code, Codex, Gemini CLI, or OpenCode
  • Cross-Agent Calls - app.call("other-agent.func") routes through the control plane with full tracing
  • Discovery - app.discover(tags=["ml*"]) finds agents and capabilities across the mesh. tools="discover" lets LLMs auto-invoke them.
  • Memory - app.memory.set() / .get() / .search() - KV + vector search, four scopes, no Redis needed
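
The wildcard in `app.discover(tags=["ml*"])` can be pictured as glob matching over each agent's tags. The sketch below illustrates that idea with the standard library; glob semantics are an assumption here, and `match_agents` and the sample registry are hypothetical, not the SDK's discovery logic.

```python
from fnmatch import fnmatchcase


def match_agents(registry: dict[str, list[str]], patterns: list[str]) -> list[str]:
    """Return agents that carry at least one tag matching at least one glob pattern."""
    return sorted(
        name
        for name, tags in registry.items()
        if any(fnmatchcase(tag, pattern) for tag in tags for pattern in patterns)
    )


registry = {
    "classifier": ["ml-vision", "gpu"],
    "notifier": ["ops"],
    "ranker": ["ml-ranking"],
}
matched = match_agents(registry, ["ml*"])
print(matched)
```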

Scale - Production infrastructure for non-deterministic AI.

  • Async Execution - Fire-and-forget with webhooks, SSE streaming, retries. No timeout limits - agents run for hours or days.
  • Canary Deployments - Traffic weight routing, A/B testing, blue-green deploys. Roll out agent versions at 5% → 50% → 100%.
  • Human-in-the-Loop - app.pause() suspends execution for human approval. Crash-safe, durable, audited.
  • Observability - Automatic workflow DAGs, Prometheus /metrics, structured logs, execution timeline.
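
The retries mentioned above are handled by the control plane, and the text does not specify its schedule. As a general illustration of exponential backoff, the sketch below uses a common variant (capped, with full jitter); the base, cap, and function names are assumptions, not AgentField settings.

```python
import random


def backoff_ceilings(max_attempts: int, base: float = 0.5, cap: float = 30.0) -> list[float]:
    """Upper bound for each retry delay: doubles per attempt until it hits the cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_attempts)]


def backoff_delays(max_attempts: int, rng: random.Random, base: float = 0.5, cap: float = 30.0) -> list[float]:
    """Full jitter: each delay is drawn uniformly between 0 and that attempt's ceiling."""
    return [rng.uniform(0, ceiling) for ceiling in backoff_ceilings(max_attempts, base, cap)]


ceilings = backoff_ceilings(8)
delays = backoff_delays(8, random.Random(7))
print(ceilings)
```

Jitter spreads retries out so that a thousand failed branches do not all hit a recovering downstream agent at the same instant.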

Govern - IAM for AI agents. Identity, access control, and audit trails, built in.

  • Cryptographic Identity - Every agent gets a W3C DID (decentralized identifier) - not a shared API key. Agents authenticate to each other the way services authenticate with mTLS, but with cryptographic signatures that travel with the agent.
  • Verifiable Credentials - Tamper-proof receipt for every execution. Offline-verifiable: af vc verify audit.json.
  • Policy Enforcement - Tag-based policy gates with cryptographic verification. "Only agents tagged 'finance' can call this" - enforced by infrastructure, not prompts.
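
A tag-based gate like "only agents tagged 'finance' can call this" can be sketched as a rule lookup on caller and target tags. This is an illustration only: the `Rule` shape, the deny-overrides-allow ordering, and the default-deny fallback are assumptions, and the cryptographic verification of the caller's identity is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    effect: str      # "ALLOW" or "DENY"
    caller_tag: str
    target_tag: str


def is_allowed(rules: list[Rule], caller_tags: set[str], target_tags: set[str]) -> bool:
    """DENY beats ALLOW, and a call with no matching rule is denied (both are assumptions)."""
    matched = [r for r in rules if r.caller_tag in caller_tags and r.target_tag in target_tags]
    if any(r.effect == "DENY" for r in matched):
        return False
    return any(r.effect == "ALLOW" for r in matched)


rules = [
    Rule("ALLOW", caller_tag="finance", target_tag="ledger"),
    Rule("DENY", caller_tag="experimental", target_tag="ledger"),
]
finance_ok = is_allowed(rules, {"finance"}, {"ledger"})
experimental_blocked = is_allowed(rules, {"finance", "experimental"}, {"ledger"})
untagged_blocked = is_allowed(rules, {"marketing"}, {"ledger"})
```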

See the full production-ready feature set →

90+ Production Features

AI & LLM

Feature | How
Structured output (Pydantic/Zod) | app.ai(schema=MyModel)
Multi-turn coding agents | app.harness("task", provider="claude-code")
LLM auto-discovers agents and tools | app.ai(tools="discover")
Multimodal (text, image, audio) | app.ai("Describe", image_url="...")
Streaming responses | app.ai("...", stream=True)
100+ LLMs via LiteLLM | AIConfig(model="anthropic/claude-sonnet-4-20250514")
Temperature, max tokens, format | app.ai(..., temperature=0.2)

Agent Mesh & Discovery

Feature | How
Cross-agent calls with tracing | app.call("agent.func", input={...})
Discover agents by tag (wildcards) | app.discover(tags=["ml*"])
Discover by health status | app.discover(health_status="active")
Agent routers (namespacing) | AgentRouter(prefix="billing")
Auto context propagation | Workflow, session, actor IDs forwarded
Parallel agent execution | asyncio.gather(app.call(...), ...)
Auto-registration on startup | Service mesh with zero config

Execution Engine

Feature | How
Sync execution (REST) | POST /api/v1/execute/{agent}.{func}
Async (fire-and-forget) | POST /api/v1/execute/async/{agent}.{func}
Webhooks + HMAC-SHA256 signing | AsyncConfig(webhook_url="...", secret="...")
SSE streaming (real-time) | /api/v1/execute/stream/{id}
No timeout limits (hours/days) | Control plane allows unlimited duration
Execution polling | GET /api/v1/executions/{id}
Batch status checks | POST /api/v1/executions/batch-status
Progress updates mid-execution | Intermediate payloads during long tasks
Auto retries + exponential backoff | Transparent - control plane handles
Backpressure + queue depth limits | Fair scheduling, circuit breakers
Durable queue (PostgreSQL) | Atomic lease-based processing
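
On the receiving end of a signed webhook, the usual pattern is to recompute the HMAC-SHA256 of the raw request body with the shared secret and compare it in constant time. The sketch below shows that pattern with the standard library; it assumes the signature arrives as a hex digest of the raw body, and how AgentField actually encodes and transmits the signature (header name, prefix, timestamping) is not stated here, so check the docs before relying on it.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Constant-time comparison, so the check does not leak how many characters matched."""
    return hmac.compare_digest(sign(secret, body), received_signature)


secret = b"webhook-secret"  # the value you passed as the webhook secret (hypothetical here)
body = b'{"execution_id": "abc", "status": "completed"}'  # hypothetical payload
signature = sign(secret, body)

genuine = verify(secret, body, signature)
tampered = verify(secret, body.replace(b"completed", b"failed"), signature)
```

Verify against the bytes exactly as received; re-serializing the parsed JSON can change whitespace or key order and break the comparison.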

Memory (Distributed State)

Feature | How
Key-value storage | app.memory.set(key, value) / .get(key)
Vector search (semantic) | app.memory.search(embedding, top_k=5)
Four scopes | Global, agent, session, run
Reactive memory events | @app.memory.on_change("order_*")
Metadata filtering | Filter stored values by metadata
Zero dependencies | Built into control plane - no Redis
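
Semantic search of the kind `app.memory.search(embedding, top_k=5)` performs is typically a nearest-neighbour lookup over stored vectors. The sketch below shows the idea with cosine similarity in pure Python; it is not how the control plane stores or indexes vectors, and `cosine`, `search`, and the two-dimensional embeddings are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(store: dict[str, list[float]], query: list[float], top_k: int = 5) -> list[tuple[str, float]]:
    """Score every stored vector against the query and keep the top_k best matches."""
    scored = [(key, cosine(vector, query)) for key, vector in store.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


store = {
    "refund-policy": [1.0, 0.0],
    "shipping": [0.0, 1.0],
    "returns": [0.8, 0.6],
}
hits = search(store, [1.0, 0.0], top_k=2)
print([key for key, _ in hits])
```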

Human-in-the-Loop

Feature | How
Durable pause/resume | await app.pause(reason="...")
Approval workflows with UI | approval_request_url for reviewers
Configurable timeouts | expires_in_hours=24 + auto-escalation
Crash-safe state | Survives agent restarts

Canary Deployments & Versioning

Feature | How
Traffic weight routing | 5% → 50% → 100% rollouts
A/B testing | 50/50 splits with X-Routed-Version
Blue-green deployments | Instant weight switch, zero downtime
Per-version health tracking | Unhealthy versions auto-removed
Agent lifecycle states | pending → starting → ready → degraded → offline
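
Traffic weight routing comes down to picking a version at random in proportion to its weight. The sketch below simulates the first step of a 5% → 50% → 100% rollout; the version numbers, `pick_version`, and the seeded generator are hypothetical, and the control plane's real selection logic may differ (for example, sticky routing per session).

```python
import random
from collections import Counter


def pick_version(weights: dict[str, float], rng: random.Random) -> str:
    """Choose one version with probability proportional to its weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


rng = random.Random(42)  # seeded so the simulation is repeatable
rollout = {"2.0.0": 95, "2.1.0": 5}  # stable version vs. canary at 5%
counts = Counter(pick_version(rollout, rng) for _ in range(10_000))
canary_share = counts["2.1.0"] / 10_000
print(counts)
```

Promoting the canary is then a matter of changing the weights, which is also why a blue-green switch can be instant: it is the same operation with weights of 0 and 100.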

Identity & Governance

Feature | How
Cryptographic identity per agent | Auto-generated W3C DID + Ed25519 keys
Verifiable Credentials | Tamper-proof receipt per execution
Offline VC verification | af vc verify audit.json
Tag-based access policies | ALLOW/DENY rules on caller → target tags
Cryptographically signed requests | Ed25519 signatures on cross-agent calls
VC hierarchy (3 tiers) | Platform → Node → Function control
Agent notes (audit log) | app.note("Decision", tags=["critical"])
Non-repudiation | Cryptographic proof of actions
Permission request workflows | Auto-created when access denied

Observability & Fleet Management

Feature | How
Automatic DAG visualization | Workflow graphs in dashboard
Prometheus metrics | /metrics out of the box
Structured JSON logging | Automatic from SDK
Execution timeline | Chronological decision trace
Health checks (K8s-ready) | /health, /ready endpoints
Correlation IDs | X-Workflow-ID, X-Execution-ID
Workflow DAG API | GET /api/v1/workflows/{id}/dag
Agent heartbeat monitoring | Auto health status transitions

Harness (Multi-turn Coding Agents)

Feature | How
4 providers | Claude Code, Codex, Gemini CLI, OpenCode
Schema-constrained output | schema=ResultModel (Pydantic/Zod)
Cost capping | max_budget_usd=3.0
Turn limiting | max_turns=100
Tool access control | tools=["Read", "Write", "Bash"]
Environment injection | env={"KEY": "value"}
System prompt override | system_prompt="..."
Multi-layer output recovery | Cosmetic repair → retry → full retry

Connector API (Fleet Management)

Feature | How
Remote agent management | /connector/reasoners
Version traffic control | /connector/.../weight
Bearer token auth | AGENTFIELD_CONNECTOR_TOKEN
Air-gapped deployment | Outbound WebSocket only

Developer Experience

Feature | How
CLI scaffolding | af init my-agent --defaults --language python|go|typescript
Local dev with dashboard | af server → http://localhost:8080
Hot reload | af dev auto-detects changes
Auto-REST from decorators | Every @app.reasoner() → POST /api/v1/execute/...
Python, Go, TypeScript SDKs | Native patterns per language
MCP server integration | af add --mcp --url <server>
Config storage API | POST /api/v1/configs/:key - database-backed
Docker + Kubernetes ready | Stateless control plane, horizontal scaling

Explore all features in detail →

Built With AgentField

Autonomous Engineering Team
One API call spins up PM, architect, coders, QA, reviewers - hundreds of coordinated agents that plan, build, test, and ship.

View project →
Deep Research Engine
Recursive research backend. Spawns parallel agents, evaluates quality, generates deeper agents, and recurses - 10,000+ agents per query.

View project →
Reactive MongoDB Intelligence
Atlas Triggers + agent reasoning. Documents arrive raw and leave enriched - risk scores, pattern detection, evidence chains.

View project →
Autonomous Security Audit
250 coordinated agents trace every vulnerability source-to-sink and adversarially verify each finding. Confirmed exploits, not pattern flags.

View project →
CloudSecurity AF
AI-native cloud infrastructure security scanner that performs shift-left attack path analysis directly from IaC, prioritizing the most dangerous risk chains before deployment.

View project →
Agentic PR Reviewer
Builds a custom review strategy for every PR - spawns parallel reviewer agents with runtime-crafted prompts, adversarially challenges its own findings, and posts evidence-grounded inline comments.

View project →

See all examples →

Built something with AgentField? Submit your project to be featured on the examples page.

Architecture

AgentField Architecture

The control plane is a stateless Go service. Agents connect from anywhere - your laptop, Docker, Kubernetes. They register capabilities, the control plane routes calls between them, tracks execution as DAGs, and enforces policies. Full architecture docs →

Learn More

The thinking behind AgentField - essays on AI backends, harness orchestration, and the infrastructure production agents actually need.

What is harness orchestration?
The atomic unit of intelligence is climbing from the model call to the autonomous harness - and what changes when it does.

Read post →
Part 1: The Black Box
Treating harnesses like Claude Code and Codex as autonomous, embodied, persistent computational entities.

Read post →
Part 2: Engineering the Membrane
Shaping the boundary surface of a harness across four engineerable dimensions: workspace, drift, verifier placement, and recovery budget.

Read post →
The AI Backend
Our thesis: in five years every serious software company will run an AI backend - a reasoning layer that makes the decisions that used to be hardcoded.

Read post →
Fan out 1,000 parallel agents from one request
A tutorial on turning a single call into a bounded fan-out across the control plane, with queues, retries, and traces on every branch.

Read post →
Claude Code as a function
Wrap a multi-turn coding harness behind a REST endpoint and call Claude Code the same way you call any other agent.

Read post →

Documentation

Community

License

Apache 2.0