This file provides guidance to AI coding agents when working with code in this repository.
Agentica is a Python framework for building AI agents with support for multi-model LLMs, tools, multi-turn conversations, RAG, workflows, MCP integration, and a Skills system.
```shell
pip install -U agentica   # From PyPI
pip install .             # From source (development)
```

```shell
python -m pytest tests/                                        # Run all tests
python -m pytest tests/test_agent.py                           # Run single test file
python -m pytest tests/test_agent.py::TestAgentInitialization  # Run specific test class
python -m pytest tests/test_agent.py -k "test_default"         # Run tests matching pattern
```

```shell
agentica                                                      # Interactive mode
agentica --query "Your question"                              # Single query
agentica --model_provider zhipuai --model_name glm-4.7-flash  # Specify model
```
```shell
agentica --tools calculator shell wikipedia   # Enable specific tools
```

| File | Purpose |
|---|---|
| `agent.py` | Core `Agent` class - main entry point for building agents |
| `deep_agent.py` | `DeepAgent` - Agent with built-in file/execute/search tools |
| `tools/buildin_tools.py` | Built-in tools for `DeepAgent` (`ls`, `read_file`, `execute`, `web_search`, etc.) |
| `workspace.py` | `Workspace` - Agent workspace with AGENT.md, PERSONA.md, memory management |
| `memory/` | `AgentMemory`, `MemoryManager`, `WorkspaceMemorySearch` - session and long-term memory |
| `guardrails.py` | Input/output validation and filtering for agents |
| `workflow.py` | Workflow engine for multi-step task orchestration |
| `cli/` | Interactive CLI with file references (`@filename`) and commands (`/help`) |
| `run_response.py` | `RunResponse`, `RunEvent` - agent execution responses |
| `prompts/` | Centralized prompt templates (base prompts, model-specific optimizations) |
| File | Purpose |
|---|---|
| `base.py` | Abstract `Model` base class |
| `message.py` | `Message` class for conversation messages |
| `response.py` | `ModelResponse` for LLM responses |
| `openai/` | OpenAI API integration (GPT-4, GPT-3.5, etc.) |
| `anthropic/` | Anthropic API integration (Claude models) |
| `zhipuai/` | ZhipuAI API integration (GLM models) |
| `deepseek/` | DeepSeek API integration |
| `ollama/` | Ollama local model integration |
| File | Purpose |
|---|---|
| `base.py` | `Tool` base class and tool registration |
| `registry.py` | Global tool registry and discovery |
| `calculator.py` | Math calculations |
| `shell.py` | Shell command execution |
| `file_tools.py` | File system operations |
| `web_search.py` | Web search capabilities |
| `wikipedia.py` | Wikipedia search |
| `python_repl.py` | Python code execution |
NEW: ACP support for IDE integration (Zed, JetBrains, etc.)
| File | Purpose |
|---|---|
| `server.py` | `ACPServer` - Main ACP server for IDE integration |
| `protocol.py` | JSON-RPC protocol handler |
| `handlers.py` | ACP method handlers (`initialize`, `tools/list`, `tools/call`, etc.) |
| `types.py` | ACP data models (`ACPRequest`, `ACPResponse`, `ACPTool`, etc.) |
Usage:

```shell
agentica acp   # Start ACP server mode
```

IDE Configuration:

```json
{
  "agent_servers": {
    "Agentica": {
      "command": "agentica",
      "args": ["acp"],
      "env": {"OPENAI_API_KEY": "..."}
    }
  }
}
```

- Problem: In Phase 1, the repetitive-behavior warning produced by `tool_call_hook` was appended directly to `function_call_results`, ahead of the tool results, breaking the `assistant(tool_calls) → tool` pairing sequence required by the OpenAI API.
- Fix: Introduce a `deferred_warnings` list: collect warnings during Phase 1, then extend them onto `function_call_results` in Phase 3, after all tool results have been processed.
- Files changed: `agentica/model/base.py` (`run_function_calls` method)
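The deferred-warnings pattern can be sketched as follows. This is a minimal standalone illustration, not the actual `run_function_calls` implementation: the real method works on `Message` objects and hook callbacks, which are simplified to plain dicts here.

```python
def run_function_calls(tool_calls, warnings_from_hook):
    """Append hook warnings only after every tool result has been emitted,
    preserving the assistant(tool_calls) -> tool pairing the OpenAI API requires."""
    function_call_results = []
    deferred_warnings = []

    # Phase 1: execute calls; collect warnings instead of appending them inline.
    for call in tool_calls:
        deferred_warnings.extend(warnings_from_hook.get(call["id"], []))
        function_call_results.append(
            {"role": "tool", "tool_call_id": call["id"], "content": call["result"]}
        )

    # Phase 3: warnings go after all tool results, as user-role messages.
    function_call_results.extend(
        {"role": "user", "content": w} for w in deferred_warnings
    )
    return function_call_results
```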
- Problem: Intermediate messages with `role: "system"` are incompatible with some LLM APIs (some models reject a system message in a non-first position, or a system message immediately following an assistant message).
- Fix: Change the following 5 uses of `Message(role="system", ...)` to `Message(role="user", ...)`:
  - `model/base.py` Phase 0: `force_answer` message
  - `model/base.py` Phase 1: `repetitive_behavior` / `force_strategy_change` warnings
  - `deep_agent.py` `post_tool_hook`: `step_reflection` prompt
  - `deep_agent.py` `post_tool_hook`: `iteration_checkpoint` prompt
- Files changed: `agentica/model/base.py`, `agentica/deep_agent.py`
```
[assistant] (tool_calls: [call_1, call_2])
[tool] tool result for call_1      ← tool results immediately follow the assistant message
[tool] tool result for call_2
[user] Repetitive Behavior Warning ← deferred, placed after the tool results
[user] Step Reflection             ← injected by post_tool_hook
[user] Iteration Checkpoint        ← injected by post_tool_hook
```
- Problem: The `tool_call_limit` check sat inside the `for` loop; when an assistant message contained multiple tool calls, reaching the limit triggered a `break`, leaving the remaining tool calls without corresponding result messages, which produces an OpenAI API 400 error.
- Fix: Move the `tool_call_limit` check outside the loop, so the limit is only evaluated after every tool call in the current batch has been processed.
- Files changed: `agentica/model/base.py`
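A minimal sketch of moving the limit check outside the loop (the function name and dict shapes are illustrative, not the repo's actual signatures):

```python
def execute_batch(tool_calls, executed_count, tool_call_limit):
    """Run every tool call in the current assistant batch, then check the limit.
    Breaking mid-batch would leave tool_calls without matching tool results
    (an OpenAI API 400 error)."""
    results = []
    for call in tool_calls:  # no limit check inside the loop
        results.append({"role": "tool", "tool_call_id": call["id"], "content": "done"})
        executed_count += 1

    # Limit is evaluated only once the whole batch has produced results.
    limit_reached = tool_call_limit is not None and executed_count >= tool_call_limit
    return results, executed_count, limit_reached
```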
- Problem: The parent agent and the subagent shared the same `Model` instance, causing several shared-mutable-state issues (hook closures referencing the parent `DeepAgent`, an HTTP client holding an `RLock` that cannot be deep-copied, an ever-growing `function_call_stack`, etc.).
- Fix: In `tools/buildin_tools.py`, shallow-copy the model with `model_copy()` when creating a subagent, then reset the runtime fields one by one (`tools`, `functions`, `function_call_stack`, `tool_choice`, `metrics`, `client`, `http_client`, `_pre_tool_hook`, `_tool_call_hook`, `_post_tool_hook`, `_current_messages`).
- Files changed: `agentica/tools/buildin_tools.py`
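The copy-and-reset idea can be sketched like this. `copy.copy` stands in for Pydantic's `model_copy()` (both are shallow copies), and the field list mirrors the one above; the reset values are assumptions about what a fresh runtime state looks like.

```python
import copy

RUNTIME_FIELDS = {
    "tools": None, "functions": None, "function_call_stack": None,
    "tool_choice": None, "metrics": {}, "client": None, "http_client": None,
    "_pre_tool_hook": None, "_tool_call_hook": None, "_post_tool_hook": None,
    "_current_messages": None,
}

def copy_model_for_subagent(model):
    """Shallow-copy the parent model and reset per-run mutable state, so the
    subagent shares configuration (api key, model id) but no runtime objects."""
    sub = copy.copy(model)  # stands in for Pydantic's model_copy()
    for field, fresh in RUNTIME_FIELDS.items():
        # copy mutable defaults so parent and subagent never share them
        setattr(sub, field, copy.copy(fresh) if isinstance(fresh, dict) else fresh)
    return sub
```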
Files touched:

| File | Change |
|---|---|
| `agentica/model/base.py` | Modified (deferred warnings + system role changed to user) |
| `agentica/deep_agent.py` | Modified (`post_tool_hook` role changed to user) |
| `agentica/tools/buildin_tools.py` | Modified (subagent model shallow copy + state reset) |
- Change: Removed the `__init_subclass__` + `functools` class-level method replacement in favor of instance-level `run` wrapping in `__init__`.
- Implementation: Use `object.__setattr__` to bypass Pydantic's `__setattr__`, setting `_user_run` and the wrapped `run` on the instance.
- Extracted methods: `_prepare_run`, `_process_result`, `_annotate_response`, `_finalize_run`, for clarity.
- Files changed: `agentica/workflow.py`, `tests/test_workflow.py`
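The instance-level wrapping trick can be sketched as below. The `[annotated]` prefix is a stand-in for the real `_annotate_response` / `_finalize_run` work; the class body is illustrative, not the repo's `Workflow`.

```python
class Workflow:
    """Wrap the user's run() per instance, not per class."""

    def __init__(self):
        user_run = self.run  # the subclass-defined (or base) run, bound here

        def wrapped(*args, **kwargs):
            # _prepare_run / _finalize_run would bracket the call here
            result = user_run(*args, **kwargs)
            return f"[annotated] {result}"

        # object.__setattr__ bypasses a strict (e.g. Pydantic) __setattr__,
        # shadowing the class method with an instance attribute.
        object.__setattr__(self, "_user_run", user_run)
        object.__setattr__(self, "run", wrapped)

    def run(self):
        return "base"

class MyFlow(Workflow):
    def run(self):
        return "steps done"
```

A nice property of this design over `__init_subclass__` is that each instance keeps its own unwrapped `_user_run`, so subclass overrides never get double-wrapped.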
- Removed the `run_workflow()` method outright; no legacy compatibility code retained.
- Files changed: `agentica/workflow.py`, `tests/test_workflow.py`
- Deleted `agentica/reasoning.py` (`ReasoningStep` / `ReasoningSteps` / `NextAction` were all unused).
- Removed the `from agentica.reasoning import ReasoningStep` import from `run_response.py`.
- Removed the `reasoning_steps` and `reasoning_messages` fields from `RunResponseExtraData`.
- Kept all `reasoning_content` code (used by modern reasoning models).
- Files changed: `agentica/run_response.py` modified; `agentica/reasoning.py` deleted.
- Split `agentica/memory.py` into the `agentica/memory/` package (`agent_memory.py`, `manager.py`, `models.py`, `search.py`, `summarizer.py`, `workflow.py`).
- Split `agentica/cli.py` into the `agentica/cli/` package (`main.py`, `config.py`, `display.py`, `interactive.py`).
- Deleted `01_simple_workflow.py`, `03_news_article.py`, `04_novel_writing.py`.
- Added `01_data_pipeline.py`, `03_news_report.py`, `04_code_review.py`.
- Analyzed the serial-execution problem in `run_function_calls` / `arun_function_calls`.
- Recommended a two-step plan: ThreadPool-based parallelism first (1 file, ~60 lines), then a complete async stack.
- Plan written to `update_tech_v3.md` (v3.2).
Files touched:

| File | Change |
|---|---|
| `agentica/workflow.py` | Modified (refactored the `run` wrapping mechanism) |
| `agentica/run_response.py` | Modified (removed reasoning import and fields) |
| `agentica/reasoning.py` | Deleted |
| `agentica/memory.py` → `agentica/memory/` | Split into a package |
| `agentica/cli.py` → `agentica/cli/` | Split into a package |
| `tests/test_workflow.py` | Modified |
| `tests/test_memory.py` | Modified |
| `examples/workflow/` | Added/removed several examples |
| `update_tech_v3.md` | Appended the parallel tool-call plan |
- Problem: While a subagent runs, the main agent's CLI has no visibility into the subagent's progress; it only shows a single blocking `task` tool call.
- Fix: The `task()` method in `tools/buildin_tools.py` now streams the subagent via `subagent.run(stream=True, stream_intermediate_steps=True)`, iterating the subagent's events to collect tool-usage information.
  - The returned JSON gains `tool_calls_summary` (list of tool name + brief info), `execution_time` (seconds), and `tool_count`.
  - New `_format_tool_brief()` static method generates readable summaries per tool type.
  - `cli.py` gains a `_display_task_result()` method that renders `task` results specially: the internal tool-call list plus `Execution Summary: N tool uses, cost: X.Xs`.
- Files changed: `agentica/tools/buildin_tools.py`, `agentica/cli.py`
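The streaming `task()` shape can be sketched as follows. The event dicts (`type`, `tool`, `args`, `text`) are assumptions for illustration; the real code iterates the framework's typed run events.

```python
import json
import time

def task(subagent, prompt):
    """Stream the subagent run, collecting a summary of its internal tool use."""
    tool_calls_summary = []
    chunks = []
    start = time.monotonic()
    for event in subagent.run(prompt, stream=True, stream_intermediate_steps=True):
        if event.get("type") == "tool_call":
            tool_calls_summary.append(_format_tool_brief(event))
        elif event.get("type") == "content":
            chunks.append(event["text"])
    return json.dumps({
        "content": "".join(chunks),
        "tool_calls_summary": tool_calls_summary,
        "tool_count": len(tool_calls_summary),
        "execution_time": round(time.monotonic() - start, 1),
    })

def _format_tool_brief(event):
    """One readable line per tool call: name plus truncated arguments."""
    return f"{event['tool']}: {str(event.get('args', ''))[:60]}"
```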
- Problem: The `BuiltinTaskTool` system prompt contained a negative example, `DO NOT use XML-style tags like <tool_call>task<arg_key>...</arg_key></tool_call>`; weak models (e.g. GLM-4-flash) would imitate that XML format when calling tools.
- Fix: Deleted the negative example and replaced it with concise positive guidance: `Use your standard function calling mechanism to invoke task(...)`.
- Files changed: `agentica/tools/buildin_tools.py` (`TASK_SYSTEM_PROMPT_TEMPLATE`)
- Problem: In multi-round tool-call scenarios, tool results, thinking, and the final answer ran together with no blank-line separation.
- Fix: Reset `response_started = False` at the end of `StreamDisplayManager`'s `end_tool_section()` and `end_thinking()`, so the next content segment re-triggers `start_response()` and inserts a separating blank line.
- Files changed: `agentica/cli.py`
- Problem: Ctrl+C only interrupted the stream iteration; the agent kept running internally.
- Fix: On catching `KeyboardInterrupt`, call `current_agent.cancel()`; the cooperative cancellation mechanism makes the agent raise `AgentCancelledError` at its next checkpoint. `AgentCancelledError` is also caught so it does not propagate upward.
- Files changed: `agentica/cli.py` (import `AgentCancelledError`, revise the exception-handling logic)
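Cooperative cancellation can be sketched like this: `cancel()` only sets a flag, and the run loop raises at its next checkpoint rather than being killed mid-step. The classes are illustrative stand-ins, not the repo's actual `Agent`.

```python
class AgentCancelledError(Exception):
    """Raised inside the agent at the next cancellation checkpoint."""

class Agent:
    def __init__(self):
        self._cancelled = False

    def cancel(self):
        # Called from the CLI's KeyboardInterrupt handler.
        self._cancelled = True

    def run_stream(self, steps):
        for step in steps:
            if self._cancelled:  # cooperative checkpoint
                raise AgentCancelledError()
            yield step

def consume(agent, steps, interrupt_after):
    """CLI side: translate Ctrl+C into agent.cancel(), swallow the cancel error."""
    seen = []
    try:
        for i, chunk in enumerate(agent.run_stream(steps)):
            seen.append(chunk)
            if i + 1 == interrupt_after:
                agent.cancel()  # what the KeyboardInterrupt handler does
    except AgentCancelledError:
        seen.append("<cancelled>")  # swallowed: never propagates to the user
    return seen
```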
- Problem: After a tool finished, the `ToolCallCompleted` event was skipped with a bare `continue`, so no result was displayed.
- Fix: New `StreamDisplayManager.display_tool_result()` method shows an execution-result preview under each tool call:
  - Uses the `⎿` connector (Claude Code style), rendered in dim color.
  - Shows at most 4 lines of at most 120 characters each; overflow is shown as `... (N more lines)`.
  - Error results are rendered in `dim red` with a `⚠` marker.
- Files changed: `agentica/cli.py` (new `display_tool_result` method; revised `ToolCallCompleted` event handling)
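The truncation logic described above can be sketched as a pure function (colors omitted; the function name and indentation style are illustrative):

```python
MAX_LINES = 4
MAX_CHARS = 120

def format_tool_result_preview(result: str) -> str:
    """Truncate a tool result to at most 4 lines of 120 chars each,
    with a '⎿' connector joining the preview under the tool-call line."""
    lines = result.splitlines()
    shown = [
        line if len(line) <= MAX_CHARS else line[:MAX_CHARS] + "..."
        for line in lines[:MAX_LINES]
    ]
    if len(lines) > MAX_LINES:
        shown.append(f"... ({len(lines) - MAX_LINES} more lines)")
    return "\n".join(
        "⎿ " + s if i == 0 else "  " + s for i, s in enumerate(shown)
    )
```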
- Problem: With `DeepAgent`'s default `enable_multi_round=False`, tool execution goes through the Model layer's built-in recursion loop. `aresponse` / `aresponse_stream` internally called the synchronous `handle_tool_calls` → `run_function_calls` → `function_call.execute()` → `subprocess.run()`, blocking the asyncio event loop and stalling every uvicorn request.
- Fix: Added a complete async tool-call chain:
  - `model/base.py`: new `arun_function_calls` async method using `await function_call.aexecute()` instead of the synchronous `execute()`.
  - `model/openai/chat.py`: new `ahandle_tool_calls` and `ahandle_stream_tool_calls` async methods; `aresponse` now uses `await self.ahandle_tool_calls()`; `aresponse_stream` now uses `async for ... in self.ahandle_stream_tool_calls()`.

Call-chain comparison:

| Layer | Sync path | Async path (new) |
|---|---|---|
| Model base | `run_function_calls` → `execute()` | `arun_function_calls` → `await aexecute()` |
| OpenAI chat | `handle_tool_calls` | `ahandle_tool_calls` |
| OpenAI chat | `handle_stream_tool_calls` | `ahandle_stream_tool_calls` |
| OpenAI chat | `response` → `handle_tool_calls` | `aresponse` → `await ahandle_tool_calls` |
| OpenAI chat | `response_stream` → `handle_stream_tool_calls` | `aresponse_stream` → `async for ahandle_stream_tool_calls` |

Files changed:

- `agentica/model/base.py`: new `arun_function_calls`
- `agentica/model/openai/chat.py`: new `ahandle_tool_calls`, `ahandle_stream_tool_calls`; revised `aresponse`, `aresponse_stream`
Established the foundational architecture of the Agentica framework, including the Agent base class, model integrations, the tool system, and the CLI interface.

- Agent base class design and implementation
- Multi-model LLM support (OpenAI, Anthropic, ZhipuAI, etc.)
- Tool system and registration mechanism
- Memory management and session persistence
- Interactive CLI
- File operation tools
- Shell command execution
- Web search integration
- Python REPL
- Calculator
- Unit test framework
- Integration test cases
- Performance benchmarks
Agentica is a Python AI agent framework for building, managing, and deploying autonomous AI agents. It supports multi-agent teams, workflows, RAG, MCP tools, and a file-based workspace memory system. The project uses an async-first architecture where all core methods are natively async, with _sync() adapters for synchronous callers.
Python >= 3.12 required.
```shell
# Install dependencies
pip install -r requirements.txt
pip install -e .

# Run all tests
python -m pytest tests/ -v --tb=short

# Run a single test file
python -m pytest tests/test_agent.py -v

# Run a single test case
python -m pytest tests/test_agent.py::TestAgentInitialization::test_default_initialization -v

# CLI entry point
agentica
```

No Makefile, linter config, or type-checker is configured in this repo. CI runs pytest on Python 3.12.
All core methods (`run`, `response`, `execute`, `invoke`) are async by default. The naming convention:

| Method | Type | Purpose |
|---|---|---|
| `run()` | async | Non-streaming run |
| `run_stream()` | async generator | Streaming run |
| `run_sync()` | sync adapter | Wraps `run()` via the `run_sync()` utility |
| `run_stream_sync()` | sync iterator | Background thread + queue pattern |

There are no `a`-prefixed methods (`arun`, `aresponse`, etc.) — those were removed. Sync tools are executed via `loop.run_in_executor()` inside async context. Sync DB calls (session read/write) and file I/O (workspace) are wrapped in `loop.run_in_executor()` to avoid blocking the event loop.
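The two sync adapters in the table can be sketched as follows. This is a simplified illustration, assuming no event loop is already running (the repo's utilities may handle that case too):

```python
import asyncio
import queue
import threading

def run_sync(coro):
    """Minimal sync adapter: run a coroutine to completion from sync code."""
    return asyncio.run(coro)

def run_stream_sync(agen_factory):
    """Background thread + queue: expose an async generator as a sync iterator."""
    q: queue.Queue = queue.Queue()
    _DONE = object()

    def worker():
        async def pump():
            async for item in agen_factory():
                q.put(item)
        asyncio.run(pump())  # the async generator runs on its own loop
        q.put(_DONE)

    threading.Thread(target=worker, daemon=True).start()
    while (item := q.get()) is not _DONE:
        yield item
```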
The `Agent` class uses `@dataclass(init=False)` with an explicit `__init__` and direct multiple inheritance from mixin classes. Execution is delegated to an independent `Runner` class.

```python
@dataclass(init=False)
class Agent(PromptsMixin, TeamMixin, ToolsMixin, PrinterMixin):
    def __init__(self, ...):
        ...
        self._runner = Runner(self)

    async def run(self, message, **kw) -> RunResponse:
        return await self._runner.run(message, **kw)
```

- `runner.py` — `Runner`: Independent execution engine (holds a `self.agent` reference). Core `_run_impl()`, `run()`, `run_stream()`, `run_sync()`, `run_stream_sync()`
- `prompts.py` — `PromptsMixin`: System/user message construction
- `team.py` — `TeamMixin`: Multi-agent delegation
- `tools.py` — `ToolsMixin`: Tool registration and management
- `printer.py` — `PrinterMixin`: Response printing utilities
Note: Agent uses @dataclass (not BaseModel). The entire Model hierarchy also uses @dataclass (converted from Pydantic in v3).
Model base class (base.py) is a @dataclass with ABC, defining abstract async methods: invoke(), invoke_stream(), response(), response_stream(). All Model subclasses use @dataclass (not Pydantic BaseModel).
Core providers (5 directories): openai/, anthropic/, ollama/, litellm/, azure/
OpenAI-compatible providers via a registry factory (`model/providers.py`):

```python
from agentica.model.providers import create_provider

model = create_provider("deepseek", api_key="...")  # Returns OpenAILike with the correct config
```

Registered providers: deepseek, qwen, zhipuai, moonshot, doubao, together, xai, yi, nvidia, sambanova, groq, cerebras, mistral
Structured output: Each provider implements native structured output:

- OpenAI: `beta.chat.completions.parse` with `response_format`
- Claude: synthetic tool_use mode
- LiteLLM: `response_format={"type": "json_schema", ...}`
- Ollama: `format=schema`
Three-layer hierarchy: `Tool` (container) → `Function` (schema + entrypoint) → `FunctionCall` (invocation). `FunctionCall.execute()` is async-only.

`@tool` decorator (`tools/decorators.py`): Attach metadata to functions for auto-registration:

```python
from agentica.tools import tool

@tool(name="add", description="Add two numbers")
def add(a: int, b: int) -> int:
    return a + b
```

Global tool registry (`tools/registry.py`): `register_tool()`, `get_tool()`, `list_tools()`, `unregister_tool()`, `clear_registry()`

`Function.from_callable()` auto-detects `_tool_metadata` from the `@tool` decorator.
Builtin Tools (`agentica/tools/buildin_tools.py`) — all I/O-bound tools are async:

- `edit_file(file_path, old_string, new_string, replace_all=False)` — flat params for an LLM-friendly schema (no nested dict/list)
- `read_file` — async with `aiofiles` streaming read
- `write_file` — async with atomic write (tempfile + `os.replace`)
- `ls`, `glob` — async via `run_in_executor`
- `grep` — async ripgrep (`rg`) subprocess with a pure-Python fallback
- `execute` — async subprocess with graceful SIGTERM → SIGKILL termination
- `web_search`, `fetch_url` — async wrappers around async backends
- `task` — async `run_stream()` to subagent
- `write_todos`, `read_todos` — sync (pure CPU, no I/O)
Flow control exceptions: StopAgentRun, RetryAgentRun.
Three-layer unified architecture:

- `core.py`: `GuardrailTriggered` (base exception), `GuardrailOutput` (allow/block), `BaseGuardrail` (base class with `_invoke()`), `run_guardrails_seq()` (execution engine)
- `agent.py`: `InputGuardrail`, `OutputGuardrail` — validate entire agent runs. `@input_guardrail`, `@output_guardrail` decorators
- `tool.py`: `ToolInputGuardrail`, `ToolOutputGuardrail` — validate individual tool calls with three-way behavior (allow / reject_content / raise_exception)

All guardrail functions support both sync and async. `GuardrailFunctionOutput` is a backward-compatible alias for `GuardrailOutput`.
Two-tier system:

- Runtime: `AgentMemory` (`memory/agent_memory.py`) — in-memory conversation history with token-aware truncation
- Persistent: `Workspace` (`workspace.py`) — file-based storage using Markdown files (AGENT.md, PERSONA.md, TOOLS.md, USER.md, MEMORY.md) with multi-user isolation under `users/{user_id}/`. `get_context_prompt()`, `get_memory_prompt()`, `write_memory()`, `save_memory()` are async (file I/O via `run_in_executor`). `initialize()`, `read_file()`, `write_file()`, `append_file()` remain sync for init-time use.
Modular system prompt assembly via `PromptBuilder` (`builder.py`). Components:

- `base/` — Core prompt modules (soul, tools, heartbeat, task_management, self_verification, deep_agent)
- Each module has a `.py` loader and an `.md` template in `base/md/`
- Model-agnostic design — removed model-specific prompt optimizations
- No compact modes — single streamlined prompt per module
- Identity handled directly in the builder (no separate identity module)
- Shared `load_prompt()` utility in `base/utils.py` (DRY)
Deterministic multi-agent pipeline. `run()` is async; subclasses override it for step orchestration. `run_sync()` is provided as an adapter. Session storage methods (`read_from_storage`, `write_to_storage`, `load_session`) are async, with DB calls wrapped in `run_in_executor`.
Uses lazy loading with thread-safe double-checked locking (threading.Lock) for optional/heavy modules (database backends, knowledge, vector DBs, embeddings, MCP, ACP, guardrails). Core modules (Agent, Model, Memory, Tools, Workspace) are imported eagerly.
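Double-checked locking for lazy imports can be sketched as a small helper. This is an illustrative stand-in (the repo implements this at the package `__init__.py` level); `json` is used here only as a cheap placeholder for a heavy optional dependency.

```python
import importlib
import threading

class LazyLoader:
    """Thread-safe lazy module loading with double-checked locking."""

    def __init__(self, lazy_map):
        self._lazy_map = lazy_map  # attribute name -> module path
        self._cache = {}
        self._lock = threading.Lock()

    def load(self, name):
        if name not in self._cache:            # first check: lock-free fast path
            with self._lock:
                if name not in self._cache:    # second check: under the lock
                    self._cache[name] = importlib.import_module(self._lazy_map[name])
        return self._cache[name]
```

The second check matters: two threads can both pass the first check, but only one performs the import; the other sees the cached module once it acquires the lock.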
- `@dataclass(init=False)` with explicit `__init__` for `Agent`, direct multiple inheritance from mixins
- Pydantic `BaseModel` for data structures (`Tool`, `Function`, `RunResponse`, `ToolCallInfo`, `Workflow`); the `Model` hierarchy itself uses `@dataclass` (see above)
- `Function.from_callable()` factory to auto-generate tool definitions from Python functions
- Token-aware message history via `AgentMemory.get_messages_from_last_n_runs()`
- `RunResponse` with `RunEvent` enum for structured event streaming
- `RunResponse.tool_calls` → `List[ToolCallInfo]` for flat attribute access (no nested `.get()` chains)
- `RunResponse.tool_call_times` → `Dict[str, float]` one-liner per-tool timing
- Langfuse integration for tracing (context-manager pattern in the runner)
- `@override` decorator (Python 3.12) on model provider method overrides
- `asyncio.TaskGroup` for structured concurrent tool execution (not `asyncio.gather`)
- All blocking I/O (DB, file system) wrapped in `loop.run_in_executor()` within async methods
- Tests for async methods use `asyncio.run()` and `unittest.mock.AsyncMock`
Eight-phase refactoring to simplify and modernize the codebase:
| Phase | Description | Key Changes |
|---|---|---|
| 1 | Model layer simplification | Remove 19 provider dirs → providers.py registry/factory |
| 2 | Model @dataclass conversion | Pydantic BaseModel → stdlib @dataclass for all Model classes |
| 3 | Async consistency + structured output | ABC/@abstractmethod, unified structured output for all providers |
| 4 | Tool registration mechanism | @tool decorator + global tool registry |
| 5 | Runner extraction | RunnerMixin → independent Runner class, Agent delegates via _runner |
| 6 | Guardrails unification | New core.py abstraction layer, base.py → agent.py |
| 7 | `__init__.py` simplification | 594 → 399 lines, streamlined lazy loading |
| 8 | Tests + cleanup | 35 new tests covering all phases, CLAUDE.md update |
Test result: 622 tests pass (587 original + 35 new v3 tests)
Testing convention: All tests MUST mock LLM API keys (use api_key="fake_openai_key" or mock agent._runner.run). No real API calls in tests.
Fixed 6 example files to align with the V2 async-first API:

- Agent no longer accepts `db=` — pass `db` to `AgentMemory(db=...)` instead, then set `agent.memory`
- Agent no longer accepts `load_workspace_context=` / `load_workspace_memory=` — these moved to `MemoryConfig` (enabled by default)
- Agent no longer accepts `workspace_path=` / `user_id=` — use a `Workspace(path, user_id=...)` object and pass it via `workspace=`
- Agent no longer has a `compression_manager` attribute — access via `agent.tool_config.compress_tool_results`; pass `CompressionManager` through `ToolConfig(compression_manager=...)`
- Agent no longer has `get_user_memories()` / `clear_user_memories()` — use `agent.memory.load_user_memories()` / `agent.memory.clear()`
- Agent no longer has a `load_session()` method — session management removed from the Agent direct API
- DeepAgent no longer accepts `db=` — same pattern, use `AgentMemory`