Commit f95fb36

Merge pull request #14 from Imaging-Plaza/feature/interface-refactor
Feature/interface refactor
2 parents dbd3bb7 + 143852c commit f95fb36

22 files changed

Lines changed: 1567 additions & 805 deletions

CHANGELOG.md

Lines changed: 30 additions & 2 deletions
@@ -5,10 +5,34 @@ All notable changes to this project will be documented in this file.
 ## [Unreleased]

 ### Added
-- **DeepWiki MCP integration**: Repository info tool now uses DeepWiki MCP server (https://mcp.deepwiki.com/sse) as primary source for GitHub repository documentation. DeepWiki provides fast, pre-indexed documentation access without API rate limits.
-- Automatic fallback to `repocards` library (replacing previous direct GitHub API implementation) when DeepWiki is unavailable or times out, ensuring robust repository information retrieval for both indexed and newly-created repositories.
+- **New chat-based interface** (`ai_agent chat`) with conversational AI assistant
+- Chatbot component with rich media rendering (images, files, JSON, code blocks)
+- Inline file upload support for PNG, JPG, WEBP, TIFF, DICOM, NIfTI, CSV, JSON, XML, MP3, MP4
+- File previews with format-specific icons rendered in chat messages
+- Tool recommendation cards with detailed metadata (modality, dimensions, license, tags)
+- Demo execution as conversational flow - assistant asks "Would you like me to run the demo?"
+- Tool execution traces displayed as collapsible `<details>` sections after responses
+- Debug sidebar showing conversation state, excluded tools, and preview images
+- Full conversation context maintained across multi-turn interactions
+- Affirmative response detection for demo confirmations (yes, sure, ok, etc.)
+- `respond(message, files, state) -> (reply, media, state)` core interface function
+- Encapsulates all agent logic in testable, UI-independent function
+- State management via `ChatState` dataclass with serialization
+- `ChatMessage` dataclass for rich reply composition with markdown, images, files, traces
+- `handlers.py` module with agent response logic
+- `components.py` module for reusable chat UI components
+- `formatters.py` helpers for rich message and media formatting
+- `state.py` chat state models and serialization utilities
+- `visualizations.py` helpers for rendering previews, traces, and visual state
+- `app.py` Gradio app implementing the chat UI
+- **Imaging Plaza branding**: Custom CSS theme with Plaza green colors (#00A991)
+- **Logo integration**: Official Imaging Plaza white logo displayed in header
+- **Redesigned layout**: Reorganized UI with header banner, left chat panel, and right sidebar for files and state

 ### Changed
+- CLI now supports `ai_agent chat`
+- **DeepWiki MCP integration**: Repository info tool now uses DeepWiki MCP server (https://mcp.deepwiki.com/sse) as primary source for GitHub repository documentation. DeepWiki provides fast, pre-indexed documentation access without API rate limits.
+- Automatic fallback to `repocards` library (replacing previous direct GitHub API implementation) when DeepWiki is unavailable or times out, ensuring robust repository information retrieval for both indexed and newly-created repositories.
 - Updated `pydantic-ai` dependency to include MCP support via `pydantic-ai[mcp]` extra.
 - Enhanced `RepoSummaryOutput` schema to include `source` field indicating whether data came from "deepwiki" or "repocards".
 - Repository info tool logs now track data source (DeepWiki vs repocards) for observability.
@@ -23,6 +47,9 @@ All notable changes to this project will be documented in this file.
 - **UI State Management Simplified**: Removed complex refine intent detection system. Agent now naturally handles requests for alternatives via conversation history without hard-coded heuristics.
 - **UI Handler Simplified**: Reduced `handle_message()` parameters from 8 to 6, removing `last_task_state`, `last_suggestions_state`, and `excluded_names` state tracking.
 - **Agent-Only Path**: Removed `USE_AGENT` conditional (always uses Pydantic AI agent). Deleted dead code path for non-agent pipeline invocation.
+- **UI redesign**: File upload moved to dedicated right panel for cleaner workflow
+- **Visual hierarchy**: Header with gradient green banner and logo
+- **Button styling**: Primary actions use Imaging Plaza green theme colors

 ### Removed
 - **VLMToolSelector**: Deleted unused `generator/generator.py` containing VLMToolSelector class. The pydantic-ai agent handles all tool selection directly.
@@ -32,6 +59,7 @@ All notable changes to this project will be documented in this file.
 - **Legacy Method**: Removed `recommend_and_link()` method from `api/pipeline.py` (~180 lines) - only used by outdated tests, replaced by agent-based approach.
 - **State Variables**: Removed 3 Gradio State objects: `last_task_state`, `last_suggestions_state`, `excluded_names`.
 - **Outdated Tests**: Removed `tests/full_test.py` which only tested the removed `recommend_and_link()` method.
+- CLI no more supports `ai_agent ui` command

 ### Fixed
 - **Conversation Context**: Agent now properly maintains conversation history, enabling natural understanding of follow-up requests like "show me alternatives".

pyproject.toml

Lines changed: 19 additions & 18 deletions
@@ -6,25 +6,26 @@ readme = "README.md"
 requires-python = ">=3.10"

 dependencies = [
-    "faiss-cpu",
-    "numpy",
-    "pydantic>=2",
-    "sentence-transformers",
-    "openai>=1.30.0",
-    "pydantic-ai[mcp]",
-    "requests",
-    "python-dotenv",
+    "faiss-cpu==1.11.0.post1",
+    "numpy==2.2.6",
+    "pydantic==2.11.7",
+    "sentence-transformers==5.1.0",
+    "openai==2.1.0",
+    "pydantic-ai[mcp]==1.0.14",
+    "requests==2.32.4",
+    "python-dotenv==1.1.1",
     "gradio==5.42.0",
-    "json5",
-    "pillow",
-    "nibabel",
-    "tifffile",
-    "pydicom",
-    "imageio",
-    "rdflib",
-    "sparqlwrapper",
-    "repocards",
-    "pyyaml",
+    "json5==0.12.0",
+    "pillow==11.3.0",
+    "nibabel==5.3.2",
+    "tifffile==2025.5.10",
+    "pydicom==3.0.1",
+    "imageio==2.37.0",
+    "rdflib==7.4.0",
+    "sparqlwrapper==2.0.0",
+    "plotly==6.5.0",
+    "repocards==0.1.2",
+    "pyyaml==6.0.2",
 ]

 [project.scripts]
[project.scripts]

src/ai_agent/agent/agent.py

Lines changed: 137 additions & 21 deletions
@@ -1,17 +1,18 @@
 from __future__ import annotations

 import os, logging
+from datetime import datetime
 from typing import List
 from pydantic_ai import Agent, RunContext
 from pydantic_ai.usage import UsageLimits
 from pydantic_ai.models.openai import OpenAIChatModel
 from pydantic_ai.providers.openai import OpenAIProvider

-from generator.prompts import AGENT_SYSTEM_PROMPT
-from generator.schema import ToolSelection
-from api.pipeline import RAGImagingPipeline
-from utils.utils import _best_runnable_link
-from utils.config import get_config
+from ai_agent.generator.prompts import get_agent_system_prompt
+from ai_agent.generator.schema import ToolSelection
+from ai_agent.api.pipeline import RAGImagingPipeline
+from ai_agent.utils.utils import _best_runnable_link
+from ai_agent.utils.config import get_config
 from .models import AgentToolSelection, ToolRunLog
 from .tools.repo_info_tool import tool_repo_summary, RepoSummaryInput
 from .tools.rerank_tool import tool_rerank, RerankInput
@@ -53,7 +54,7 @@

 agent = Agent(
     model=openai_model,
-    system_prompt=AGENT_SYSTEM_PROMPT,
+    system_prompt=get_agent_system_prompt(os.getenv("NUM_CHOICES", "3")),
     deps_type=AgentState,
 )

@@ -64,16 +65,18 @@
 async def search_tools(ctx: RunContext[AgentState], query: str, excluded: List[str] | None = None, top_k: int = 12, original_formats: List[str] | None = None):
     # Merge explicit excluded param with state's excluded_tools
     all_excluded = list(set((excluded or []) + ctx.deps.excluded_tools))
-    out = tool_search_tools(SearchToolsInput(query=query, excluded=all_excluded, top_k=top_k, original_formats=original_formats or []))
+    # Use override from context if available
+    effective_top_k = ctx.deps.override_top_k if ctx.deps.override_top_k is not None else top_k
+    out = tool_search_tools(SearchToolsInput(query=query, excluded=all_excluded, top_k=effective_top_k, original_formats=original_formats or []))
     payload = [c.model_dump(mode="python") for c in out.candidates]
-    ctx.deps.tool_calls.append({"tool": "search_tools", "query": query, "count": len(payload), "original_formats": original_formats or [], "excluded": all_excluded})
+    ctx.deps.tool_calls.append({"tool": "search_tools", "query": query, "count": len(payload), "original_formats": original_formats or [], "excluded": all_excluded, "timestamp": datetime.now().isoformat()})
     return payload

 @agent.tool(retries=2, prepare=cap_prepare)
 @limit_tool_calls("rerank", cap=1)
 async def rerank(ctx: RunContext[AgentState], query: str, candidate_names: List[str], top_k: int = 5):
     out = tool_rerank(RerankInput(query=query, candidate_names=candidate_names, top_k=top_k))
-    ctx.deps.tool_calls.append({"tool": "rerank", "query": query, "used_model": out.used_model, "count": len(out.reranked)})
+    ctx.deps.tool_calls.append({"tool": "rerank", "query": query, "used_model": out.used_model, "count": len(out.reranked), "timestamp": datetime.now().isoformat()})
     return out.model_dump(mode="python")

 # @agent.tool(retries=2, prepare=cap_prepare)
@@ -113,7 +116,7 @@ async def repo_info(ctx: RunContext[AgentState], url: str):
             "hint": "Pass a GitHub repo URL or 'owner/repo' to repo_info(url).",
             "original": url,
         }
-        ctx.deps.tool_calls.append({"tool": "repo_info", "url": url, "skipped": True, "reason": "NON_GITHUB_URL"})
+        ctx.deps.tool_calls.append({"tool": "repo_info", "url": url, "skipped": True, "reason": "NON_GITHUB_URL", "timestamp": datetime.now().isoformat()})
         return payload

     try:
@@ -122,11 +125,12 @@ async def repo_info(ctx: RunContext[AgentState], url: str):
             "tool": "repo_info",
             "url": norm_url,
             "truncated": out.truncated,
-            "source": out.source
+            "source": out.source,
+            "timestamp": datetime.now().isoformat()
         })
         return out.model_dump(mode="python")
     except Exception as e:
-        ctx.deps.tool_calls.append({"tool": "repo_info", "url": norm_url, "error": str(e)})
+        ctx.deps.tool_calls.append({"tool": "repo_info", "url": norm_url, "error": str(e), "timestamp": datetime.now().isoformat()})
         return {
             "invalid": True,
             "reason": "FETCH_FAILED",
@@ -146,14 +150,23 @@ async def resolve_demo_link(ctx: RunContext[AgentState], tool_name: str):
         link = _best_runnable_link(doc)
     except Exception:
         link = None
-    ctx.deps.tool_calls.append({"tool": "resolve_demo_link", "tool_name": tool_name, "demo_link": link})
+    ctx.deps.tool_calls.append({"tool": "resolve_demo_link", "tool_name": tool_name, "demo_link": link, "timestamp": datetime.now().isoformat()})
     return {"tool_name": tool_name, "demo_link": link}

 # Runner wrapper ---------------------------------------------------------------

-def run_agent(task: str, image_data_url: str | None = None, excluded: List[str] | None = None,
-              original_formats: List[str] | None = None, image_meta: str | None = None,
-              conversation_history: List[str] | None = None) -> AgentToolSelection:
+def run_agent(
+    task: str,
+    image_data_url: str | None = None,
+    excluded: List[str] | None = None,
+    original_formats: List[str] | None = None,
+    image_meta: str | None = None,
+    conversation_history: List[str] | None = None,
+    model: str | None = None,
+    base_url: str | None = None,
+    top_k: int | None = None,
+    num_choices: int | None = None,
+) -> AgentToolSelection:
     """Execute the agent. We inline the image as extra context in user message (multimodal reasoning)."""
     extra_context = ""
     if image_data_url:
@@ -162,9 +175,15 @@ def run_agent(task: str, image_data_url: str | None = None, excluded: List[str]

     tool_logs: List[ToolRunLog] = []

-    # Intercept tool usage by patching agent? Simpler: rely on return types (pydantic-ai tracks internally, we record manually not available yet) -> for Phase 1 we skip deep logging.
-
-    deps = AgentState(excluded_tools=excluded or [])
+    # Create AgentState with runtime overrides
+    deps = AgentState(
+        excluded_tools=excluded or [],
+        override_model=model,
+        override_base_url=base_url,
+        override_top_k=top_k,
+        override_num_choices=num_choices,
+    )
+
     # Provide hidden metadata context lines (non-user-visible) below a delimiter
     hidden_meta = ""
     if original_formats:
@@ -174,6 +193,10 @@ def run_agent(task: str, image_data_url: str | None = None, excluded: List[str]
         short_meta = " ".join(x.strip() for x in image_meta.splitlines() if x.strip())
         hidden_meta += "\n(Image Metadata: " + short_meta[:500] + ("…" if len(short_meta) > 500 else "") + ")"

+    # Add top_k hint if specified (for UI settings)
+    if top_k is not None:
+        hidden_meta += f"\n(Search top_k: {top_k})"
+
     # Build prompt with conversation history if this is a follow-up
     if conversation_history and len(conversation_history) > 0:
         # Format previous conversation for context
@@ -182,11 +205,104 @@ def run_agent(task: str, image_data_url: str | None = None, excluded: List[str]
     else:
         prompt = task + extra_context + hidden_meta

-    result = agent.run_sync(prompt, deps=deps, output_type=ToolSelection, usage_limits=UsageLimits(tool_calls_limit=10)).output
+    # Determine which agent instance to use
+    agent_instance = agent  # Default to global agent
+    effective_num_choices = num_choices if num_choices is not None else 3
+    effective_model = model if model else agent_model_config.name
+    effective_top_k = top_k if top_k is not None else 12
+
+    # When model is provided from UI, base_url comes with it (can be None for OpenAI)
+    # When model is NOT provided, use config defaults
+    if model:
+        # Model selected from dropdown - base_url parameter is authoritative
+        if base_url and "inference.rcp.epfl.ch" in base_url:
+            # EPFL model selected
+            runtime_api_key = os.getenv("EPFL_API_KEY")
+            if not runtime_api_key:
+                raise ValueError("EPFL_API_KEY not found. Cannot use EPFL models without VPN and API key.")
+            effective_base_url = base_url
+            log.info("✓ Using EPFL_API_KEY for EPFL inference server")
+        else:
+            # OpenAI or other model selected (base_url=None means OpenAI)
+            runtime_api_key = os.getenv("OPENAI_API_KEY")
+            if not runtime_api_key:
+                raise ValueError("OPENAI_API_KEY not found. Cannot use OpenAI models.")
+            effective_base_url = base_url  # Will be None for OpenAI
+            log.info("✓ Using OPENAI_API_KEY for OpenAI endpoint")
+    else:
+        # No model override - use config defaults
+        effective_base_url = agent_model_config.base_url
+        if effective_base_url and "inference.rcp.epfl.ch" in effective_base_url:
+            runtime_api_key = os.getenv("EPFL_API_KEY")
+            if not runtime_api_key:
+                raise ValueError("EPFL_API_KEY not found")
+            log.info("✓ Using EPFL_API_KEY from config")
+        else:
+            runtime_api_key = os.getenv("OPENAI_API_KEY")
+            if not runtime_api_key:
+                raise ValueError("OPENAI_API_KEY not found")
+            log.info("✓ Using OPENAI_API_KEY from config")
+
+    # Log runtime configuration
+    endpoint_display = effective_base_url if effective_base_url else "api.openai.com"
+    log.info(
+        f"🤖 Agent execution - Model: {effective_model}, endpoint: {endpoint_display}, "
+        f"top_k: {effective_top_k}, num_choices: {effective_num_choices}, excluded: {len(excluded or [])}"
+    )
+
+    # Create dynamic agent:
+    needs_dynamic_agent = (
+        (model and model != agent_model_config.name) or
+        (base_url is not None and base_url != agent_model_config.base_url) or
+        (runtime_api_key != api_key)  # API key mismatch - need new agent!
+    )
+
+    if needs_dynamic_agent:
+        log.info(f"📦 Creating runtime agent with model={effective_model}, endpoint={effective_base_url or 'api.openai.com'}")
+
+        runtime_provider = OpenAIProvider(
+            base_url=effective_base_url,
+            api_key=runtime_api_key,
+        )
+        runtime_model = OpenAIChatModel(model_name=effective_model, provider=runtime_provider)
+        agent_instance = Agent(
+            model=runtime_model,
+            system_prompt=get_agent_system_prompt(effective_num_choices),
+            deps_type=AgentState,
+        )
+        # Register tools on the dynamic agent
+        agent_instance.tool(search_tools, retries=2, prepare=cap_prepare)
+        agent_instance.tool(rerank, retries=2, prepare=cap_prepare)
+        agent_instance.tool(repo_info, retries=0, prepare=cap_prepare)
+        agent_instance.tool(resolve_demo_link, retries=2, prepare=cap_prepare)
+    elif num_choices is not None and num_choices != 3:
+        # Model/base_url same but num_choices differs - create agent with updated prompt
+        log.info(f"📦 Creating runtime agent with num_choices={effective_num_choices} (model: {effective_model})")
+        agent_instance = Agent(
+            model=openai_model,
+            system_prompt=get_agent_system_prompt(effective_num_choices),
+            deps_type=AgentState,
+        )
+        # Register tools on the dynamic agent
+        agent_instance.tool(search_tools, retries=2, prepare=cap_prepare)
+        agent_instance.tool(rerank, retries=2, prepare=cap_prepare)
+        agent_instance.tool(repo_info, retries=0, prepare=cap_prepare)
+        agent_instance.tool(resolve_demo_link, retries=2, prepare=cap_prepare)
+    else:
+        log.info(f"♻️ Using global agent (model: {effective_model}, num_choices: {effective_num_choices})")
+
+    log.debug(f"Prompt length: {len(prompt)} chars, has_image: {image_data_url is not None}")
+    result = agent_instance.run_sync(prompt, deps=deps, output_type=ToolSelection, usage_limits=UsageLimits(tool_calls_limit=10)).output
+    log.info(f"✅ Agent execution complete - choices returned: {len(result.choices)}")

     # Convert tool call dicts into ToolRunLog entries
     for tc in deps.tool_calls:
-        tool_logs.append(ToolRunLog(tool=tc.get("tool"), inputs={k: v for k, v in tc.items() if k not in {"tool"}}, summary=str(tc)))
+        tool_logs.append(ToolRunLog(
+            tool=tc.get("tool"),
+            inputs={k: v for k, v in tc.items() if k not in {"tool", "timestamp"}},
+            summary=str(tc),
+            timestamp=tc.get("timestamp")
+        ))

     # Post-run enrichment: pull demo links from resolve_demo_link tool calls
     demo_map = {}
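
The `run_agent` changes above choose an API key based on whether the endpoint contains `inference.rcp.epfl.ch`, then decide whether a fresh `Agent` must be built. The same two decisions can be written as pure functions, which makes the branching easier to test in isolation. This is only a sketch: `select_credentials` and `needs_dynamic_agent` as standalone helpers, the `env` mapping argument, and the `/v1` path in the usage note are assumptions for illustration and do not appear in the commit.

```python
EPFL_HOST = "inference.rcp.epfl.ch"  # substring the diff checks for in base_url


def select_credentials(base_url, env):
    """Return (env_var_name, api_key) for an endpoint.

    Mirrors the diff's branches: EPFL endpoints use EPFL_API_KEY, anything
    else (including base_url=None, i.e. the default OpenAI endpoint) uses
    OPENAI_API_KEY. `env` is any mapping, e.g. os.environ.
    """
    var = "EPFL_API_KEY" if base_url and EPFL_HOST in base_url else "OPENAI_API_KEY"
    key = env.get(var)
    if not key:
        raise ValueError(f"{var} not found")
    return var, key


def needs_dynamic_agent(model, base_url, runtime_key,
                        default_model, default_base_url, default_key):
    """True when any runtime setting differs from the module-level agent's."""
    return bool(
        (model and model != default_model)
        or (base_url is not None and base_url != default_base_url)
        or runtime_key != default_key
    )
```

With `env = {"OPENAI_API_KEY": "sk-test", "EPFL_API_KEY": "epfl-test"}`, calling `select_credentials(None, env)` picks the OpenAI key, while a URL such as `https://inference.rcp.epfl.ch/v1` picks the EPFL key. Passing the environment in as an argument instead of reading `os.getenv` directly is a design choice of this sketch so that tests need not mutate process state.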

src/ai_agent/agent/models.py

Lines changed: 2 additions & 1 deletion
@@ -3,13 +3,14 @@
 from typing import List, Optional, Any, Dict
 from pydantic import BaseModel, Field

-from generator.schema import ToolChoice, ToolSelection, Conversation, ConversationStatus, CandidateDoc
+from ai_agent.generator.schema import ToolSelection, CandidateDoc

 class ToolRunLog(BaseModel):
     tool: str
     inputs: Dict[str, Any] = Field(default_factory=dict)
     summary: str
     error: Optional[str] = None
+    timestamp: Optional[str] = None

 class AgentToolSelection(ToolSelection):
     tool_calls: List[ToolRunLog] = Field(default_factory=list)
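
The new `timestamp` field pairs with the `run_agent` change that moves `"timestamp"` out of `inputs` and into its own attribute. The sketch below reproduces that conversion with a plain dataclass so it runs without pydantic. `ToolRunLogSketch` and `to_log` are hypothetical names standing in for the real `ToolRunLog` model and the inline loop in `run_agent`.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Optional


@dataclass
class ToolRunLogSketch:
    """Dataclass stand-in for the pydantic ToolRunLog model in the diff."""
    tool: str
    summary: str
    inputs: dict[str, Any] = field(default_factory=dict)
    error: Optional[str] = None
    timestamp: Optional[str] = None  # ISO 8601 string, as in the diff


def to_log(call: dict[str, Any]) -> ToolRunLogSketch:
    """Convert one recorded tool-call dict into a log entry.

    'tool' and 'timestamp' are lifted into dedicated fields; every other
    key is kept as an input, and the full dict is preserved in `summary`.
    """
    return ToolRunLogSketch(
        tool=call.get("tool"),
        inputs={k: v for k, v in call.items() if k not in {"tool", "timestamp"}},
        summary=str(call),
        timestamp=call.get("timestamp"),
    )


# Example record shaped like the dicts appended to ctx.deps.tool_calls.
example_call = {
    "tool": "rerank",
    "query": "segment nuclei",
    "count": 2,
    "timestamp": datetime(2025, 1, 2, 3, 4, 5).isoformat(),
}
entry = to_log(example_call)
```

Storing the timestamp as an ISO string keeps the log entry trivially serializable, and `datetime.fromisoformat(entry.timestamp)` recovers a `datetime` when one is needed for sorting or display.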

src/ai_agent/agent/tools/gradio_space_tool.py

Lines changed: 2 additions & 2 deletions
@@ -4,8 +4,8 @@
 from pydantic import BaseModel
 import os, re, logging
 from .utils import get_pipeline
-from utils.utils import _best_runnable_link
-from utils.previews import _build_preview_for_vlm
+from ai_agent.utils.utils import _best_runnable_link
+from ai_agent.utils.previews import _build_preview_for_vlm
 from gradio_client import Client, handle_file
 import tempfile
 from pathlib import Path

src/ai_agent/agent/tools/rerank_tool.py

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 from pydantic import BaseModel
 import os, re

-from retriever.software_doc import SoftwareDoc
+from ai_agent.retriever.software_doc import SoftwareDoc
 from .utils import get_pipeline

 class RerankInput(BaseModel):

src/ai_agent/agent/tools/search_tool.py

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 from typing import List
 from pydantic import BaseModel, Field

-from generator.schema import CandidateDoc
+from ai_agent.generator.schema import CandidateDoc
 from .utils import get_pipeline

 class SearchToolsInput(BaseModel):
