CHANGELOG.md: 14 lines changed (14 additions & 0 deletions)
@@ -2,6 +2,20 @@
 
 All notable changes to this project will be documented in this file.
 
+## [Unreleased]
+
+### Changed
+
+- **Documentation sync**: Updated all documentation files to match current implementation exactly
+- `docs/architecture/retrieval.md`: Fixed embedder model (Qwen3-Embedding-8B via remote EPFL endpoint, not local BGE-M3); fixed reranker (BAAI/bge-reranker-v2-m3 remote)
+- `docs/architecture/agent.md`: Fixed default model (`openai/gpt-oss-120b` EPFL, not `gpt-4o-mini`); fixed tool names (`repo_info_batch`, not `repo_info`); updated tool caps and agent setup code
+- `docs/architecture/catalog.md`: Added full GraphDB SPARQL sync flow, mermaid diagram, all required environment variables
+- `docs/reference/environment.md`: Complete rewrite: comprehensive table of 27+ env vars including EPFL keys, GRAPHDB_* vars, SYNC_* vars, EMBED_CATALOG_ON_START, AGENT_CACHE_MAX, AGENT_OUTPUT_RETRIES, IMAGE_META_CACHE_MAX, DEBUG
+- `docs/getting-started/configuration.md`: Fixed default model and config.yaml example to match actual defaults; added EPFL API key docs and GraphDB sync vars; added local retrieval instructions
+- `docs/development/structure.md`: Added `core/` module, `queries/` directory, updated `agent/tools/` listing with all tool files and mcp/ subdir; fixed generator schema models; improved retriever pipeline description
+- `docs/reference/cli.md`: Updated `ai_agent sync` to describe actual SPARQL/GraphDB mechanism and required env vars
+- `docs/index.md`: Updated retrieval stack description to Qwen3-Embedding-8B + BGE-M3
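As a reading aid for the environment variables listed in this changelog entry, here is a minimal sketch of how a few of them could be parsed. The defaults for `AGENT_CACHE_MAX` (16) and `AGENT_OUTPUT_RETRIES` (3) come from the `agent.md` changes in this diff; the `RuntimeSettings` class, the helper names, the `DEBUG` default, and the fallback-on-invalid-value behaviour are assumptions for illustration, not the project's actual loader.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def env_int(name: str, default: int, env: Optional[Mapping[str, str]] = None,
            minimum: int = 0) -> int:
    """Read an integer variable, falling back to `default` when unset or invalid."""
    source = os.environ if env is None else env
    raw = source.get(name, "").strip()
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value >= minimum else default


def env_bool(name: str, default: bool = False,
             env: Optional[Mapping[str, str]] = None) -> bool:
    """Treat 1/true/yes/on (case-insensitive) as True; unset means `default`."""
    source = os.environ if env is None else env
    raw = source.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class RuntimeSettings:
    """Hypothetical container; field names are not taken from the project."""
    agent_cache_max: int
    agent_output_retries: int
    debug: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> RuntimeSettings:
    return RuntimeSettings(
        agent_cache_max=env_int("AGENT_CACHE_MAX", 16, env, minimum=1),
        agent_output_retries=env_int("AGENT_OUTPUT_RETRIES", 3, env),
        debug=env_bool("DEBUG", False, env),  # default is an assumption
    )
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the sketch easy to exercise in isolation.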
docs/architecture/agent.md: 153 lines changed (70 additions & 83 deletions)
@@ -45,21 +45,28 @@ graph TB
 
 ### Agent Definition
 
+The agent model is determined by `config.yaml` (`agent_model` section). When a `base_url` is provided, `OpenAIChatModel` (chat/completions API) is used; otherwise `OpenAIResponsesModel` (Responses API) is used:
+
 ```python
 from pydantic_ai import Agent
-from pydantic_ai.models.openai import OpenAIResponsesModel
+from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIChatModel
 from pydantic_ai.providers.openai import OpenAIProvider
 from ai_agent.generator.prompts import get_agent_system_prompt
-from ai_agent.generator.schema import ToolSelection
output_retries=3,  # configurable via AGENT_OUTPUT_RETRIES
 )
 ```
 
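The branching rule stated in the added paragraph (a `base_url` selects the chat/completions model, otherwise the Responses model) can be sketched without importing `pydantic_ai`. This only illustrates the decision: the `AgentModelConfig` dataclass, its field names, the example URL, and the choice to treat an empty string like a missing `base_url` are assumptions, not the project's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentModelConfig:
    """Hypothetical shape of the `agent_model` section of config.yaml."""
    name: str
    base_url: Optional[str] = None


def select_model_class(cfg: AgentModelConfig) -> str:
    """Return the name of the model class the docs describe for this config."""
    if cfg.base_url:  # assumption: an empty string counts as "not provided"
        return "OpenAIChatModel"       # chat/completions API
    return "OpenAIResponsesModel"      # Responses API


custom = AgentModelConfig(name="openai/gpt-oss-120b",
                          base_url="https://example.invalid/v1")  # placeholder URL
hosted = AgentModelConfig(name="openai/gpt-oss-120b")
```

Here `select_model_class(custom)` picks the chat model and `select_model_class(hosted)` picks the Responses model.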
@@ -68,7 +75,11 @@ agent = Agent(
 - `model`: VLM model to use (configurable via `config.yaml`)
 - `system_prompt`: Agent role, scoring rules, and output format (from `generator/prompts.py`)
 - `deps_type`: `AgentState`, which tracks tool calls, quotas, and session overrides
-- `output_type`: `ToolSelection`, passed to `agent.run_sync()` to enforce structured JSON output (not set on the `Agent` constructor)
+- `output_retries`: Number of times the agent retries if output validation fails (env `AGENT_OUTPUT_RETRIES`, default `3`)
+- `output_type`: `ToolSelection`, passed to `agent.run_sync()` to enforce structured JSON output
+
+!!! note "Agent cache"
+    A bounded LRU cache of agent instances (max size `AGENT_CACHE_MAX`, default 16) avoids rebuilding the provider/model objects on every request when the same custom endpoint + model combination is used repeatedly.
 
 ### Conversation State
 
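The "Agent cache" note in this hunk describes a bounded LRU cache keyed by endpoint + model. A minimal sketch of that idea using only the standard library follows; the class name `BoundedLRU`, the `get_or_create` method, and the tuple key layout are assumptions chosen for illustration, and the project's real cache may be structured differently.

```python
from collections import OrderedDict
from typing import Callable, Hashable, TypeVar

T = TypeVar("T")


class BoundedLRU:
    """Least-recently-used cache that holds at most `max_size` entries."""

    def __init__(self, max_size: int = 16) -> None:  # 16 mirrors AGENT_CACHE_MAX
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._max_size = max_size
        self._items: "OrderedDict[Hashable, object]" = OrderedDict()

    def get_or_create(self, key: Hashable, factory: Callable[[], T]) -> T:
        if key in self._items:
            self._items.move_to_end(key)      # mark as most recently used
            return self._items[key]           # type: ignore[return-value]
        value = factory()                     # build only on a cache miss
        self._items[key] = value
        if len(self._items) > self._max_size:
            self._items.popitem(last=False)   # evict the least recently used
        return value

    def __len__(self) -> int:
        return len(self._items)
```

A caller would use something like `cache.get_or_create((base_url, model_name), build_agent)`, where `build_agent` is a hypothetical zero-argument factory, so the provider/model objects are only rebuilt for a combination that is not currently cached.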
@@ -98,111 +109,87 @@ class AgentState(BaseModel):
 
 ## Agent Tools
 
-Tools extend agent capabilities beyond chat:
+The agent has three registered tools, each with a hard per-run call limit enforced by the `@limit_tool_calls` decorator:
+
+| Tool | Cap | Description |
+|------|-----|-------------|
+| `search_tools` | 1 | Initial semantic search, called exactly once per run |
+| `search_alternative` | 3 | Alternative query formulation for broader/different search |
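To make the per-run caps in the table above concrete, here is a minimal sketch of a decorator with the same call shape as `@limit_tool_calls(name, cap=...)`. It is not the project's implementation: the `State` and `Ctx` stand-ins, the `calls` counter field, and the decision to return an error dictionary once the cap is reached (rather than raising) are all assumptions.

```python
import asyncio
import functools
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class State:
    """Stand-in for AgentState; only the per-tool call counter is modelled."""
    calls: Dict[str, int] = field(default_factory=dict)


@dataclass
class Ctx:
    """Stand-in for RunContext[AgentState]; exposes the state as `deps`."""
    deps: State


def limit_tool_calls(name: str, cap: int):
    """Allow at most `cap` calls of the wrapped tool per run (per State)."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(ctx, *args, **kwargs):
            used = ctx.deps.calls.get(name, 0)
            if used >= cap:
                # Assumed behaviour: report the limit instead of running the tool.
                return {"error": f"{name} call limit reached ({cap})"}
            ctx.deps.calls[name] = used + 1
            return await fn(ctx, *args, **kwargs)
        return wrapper
    return decorator


@limit_tool_calls("search_alternative", cap=3)
async def search_alternative(ctx: Ctx, query: str) -> dict:
    return {"query": query, "results": []}  # placeholder result


async def demo() -> list:
    ctx = Ctx(State())
    return [await search_alternative(ctx, f"q{i}") for i in range(4)]
```

Running `asyncio.run(demo())` performs three real calls; the fourth hits the cap and gets the error dictionary. Keeping the counter on the run's state rather than on the function means each run starts with a fresh quota.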
Execute Gradio Space demos (optional, experimental):
-
-```python
-@agent.tool(retries=0, prepare=cap_prepare)
-@limit_tool_calls("run_example", cap=1)
-async def run_example(
-    ctx: RunContext[AgentState],
-    tool_name: str,
-    endpoint_url: str | None = None,
-    extra_text: str | None = None,
-) -> dict:
-    """Run an example / demo for a given tool via its Gradio space."""
-
-    out = tool_run_example(RunExampleInput(
-        tool_name=tool_name,
-        endpoint_url=endpoint_url,
-        extra_text=extra_text,
-    ))
-    return out.model_dump(mode="python")
-```
-
-**Status**: Partially implemented, limited to specific demo formats.
-
 ## Selection and Ranking
 
 The PydanticAI agent performs tool selection and ranking directly as part of its LLM reasoning step. There is no separate `VLMToolSelector` class; the agent's system prompt (defined in `generator/prompts.py`) encodes the scoring rules, and the `ToolSelection` Pydantic schema (defined in `generator/schema.py`) enforces structured output.
@@ -212,7 +199,7 @@ The PydanticAI agent performs tool selection and ranking directly as part of its
 The agent system prompt is assembled by `get_agent_system_prompt()` in `generator/prompts.py` and covers:
 
 - **Image analysis**: Instructions to analyze the attached preview image and reference visual observations in explanations
-- **Tool call sequence**: When to call `search_tools`, `search_alternative`, `repo_info`, and `run_example`
+- **Tool call sequence**: When to call `search_tools`, `search_alternative`, `repo_info_batch`
 - **Scoring rules**: Accuracy (0–100) = Task match (40) + Format compatibility (30) + Features (30)
 - **Output format**: Single JSON object matching the `ToolSelection` schema
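The scoring rule in the list above is a simple weighted sum, which a short worked example may make easier to check. In the documented setup the LLM produces the score from the prompt, so this function is only an arithmetic illustration; the function and parameter names, and the clamping of each component to its maximum, are assumptions.

```python
# Maximum points per component, as stated in the scoring rule.
MAX_POINTS = {"task_match": 40, "format_compatibility": 30, "features": 30}


def accuracy(task_match: int, format_compatibility: int, features: int) -> int:
    """Sum the three components, clamping each to its allowed range."""
    parts = {
        "task_match": task_match,
        "format_compatibility": format_compatibility,
        "features": features,
    }
    return sum(min(max(parts[key], 0), cap) for key, cap in MAX_POINTS.items())


full_marks = accuracy(40, 30, 30)   # 40 + 30 + 30 = 100
partial = accuracy(35, 20, 10)      # 35 + 20 + 10 = 65
```

Because the three maxima add up to 100, a clamped sum always stays within the documented 0–100 range.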