Imaging-Plaza · qchapp · Mar 12, 2026 · Feb 26, 2026 · Feb 28, 2026 · Feb 28, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,6 +4,9 @@ All notable changes to this project will be documented in this file.
 
 ## [Unreleased]
 
+### Removed
+- **Agent run_example tool**: Removed autonomous tool execution capability from agent. Agent now only recommends tools - all execution requires explicit user approval via approval buttons. This enforces consistent security/UX model where users maintain full control over tool execution. The underlying `gradio_space_tool.py` remains for UI-initiated demo execution.
+
 ### Added
 - **New chat-based interface** (`ai_agent chat`) with conversational AI assistant
   - Chatbot component with rich media rendering (images, files, JSON, code blocks)
@@ -34,6 +37,30 @@ All notable changes to this project will be documented in this file.
 - **YAML Model Configuration**: New `config.yaml` file for flexible model configuration supporting OpenAI, EPFL inference server, and any OpenAI-compatible API endpoints.
 - **Multi-Model Support**: Can now configure different models for agent (main reasoning & tool selection).
 - **Configuration Module**: New `utils/config.py` with Pydantic models for type-safe configuration loading and validation.
+- **3D Lungs Segmentation Tool**: New MCP tool (`agent/tools/lungs_segmentation_tool.py`) that integrates with HuggingFace Space (https://qchapp-3d-lungs-segmentation.hf.space/) for 3D U-Net based lung segmentation in CT volumes. Supports DICOM, NIfTI, and TIFF stack inputs with robust file materialization strategy handling multiple Gradio output formats (FileData dict, URL string, local path, server path).
+- **Tool Usage Analytics**: Real-time visualization in chat UI showing tool call frequency bar chart and timeline plot. Tracks all tool executions with timestamps and success/failure status. Provides users with transparency about which tools are being used during their session.
+- **Downloadable Results Section**: New `download_files` component (gr.File with `file_count="multiple"`) positioned below chatbot for easy access to tool outputs. Files returned by tools (e.g., segmentation masks) are automatically extracted and presented as downloadable items separate from inline previews.
+- **Inline Tool Approval Button**: Button-based tool execution approval appears dynamically in chat flow within a styled group box. Shows "🤖 Tool Recommendation" header with contextual button label (e.g., "🚀 Run Lungs Segmentation") only when tool approval is pending. Replaces previous text-based approval pattern for better UX and extensibility.
+- **Tool Registry System** (`agent/tools/mcp/registry.py`): Centralized tool registration pattern that eliminates tool-specific UI code
+  - `ToolConfig` dataclass for declarative tool configuration with field mappings
+  - `TOOL_REGISTRY` global dictionary for dynamic tool lookup
+  - `CATALOG_NAME_TO_TOOL` reverse mapping dict to handle catalog name → tool name resolution
+  - **Catalog name mapping**: Tools can specify `catalog_names` list to map dataset names (e.g., "lungs-segmentation") to internal tool names (e.g., "lungs_segmentation")
+  - **Clean registration pattern**: Single `ensure_tools_registered()` call in app.py replaces individual tool imports
+  - Generic extraction functions: `extract_preview()`, `extract_downloads()`, `extract_metadata()`
+  - Helper functions: `register_tool()`, `get_tool()`, `list_tools()`, `get_tool_display_name()`, `get_tool_icon()`
+  - `get_tool()` automatically resolves both registry names and catalog names for seamless integration with RAG recommendations
+  - Supports lazy loading to avoid loading heavy dependencies at import time
+- **MCP Tools Subpackage** (`agent/tools/mcp/`): Organized separation of registered imaging tools (MCP protocol) from agent utilities. Base models, registry, and imaging tools (e.g., lungs_segmentation) now in dedicated subpackage for clarity.
+- **Base Tool Models** (`agent/tools/base.py`): Standard Pydantic schemas for tool consistency
+  - `BaseToolOutput`: Standard fields across all tools (success, error, compute_time_seconds, result_preview, result_origin, metadata_text, notes)
+  - `BaseToolInput`: Minimal base class for tool inputs
+  - `ImageToolInput`: Common pattern for image-based tools with image_path and description fields
+- **Tool Registration**: Lungs segmentation tool self-registers with complete field mappings
+  - Preview field: `result_preview`
+  - Download fields: `result_origin`
+  - Metadata field: `metadata_text`
+  - Notes field: `notes`
 
 ### Changed
 - CLI now supports `ai_agent chat`
@@ -60,6 +87,31 @@ All notable changes to this project will be documented in this file.
 - **UI redesign**: File upload moved to dedicated right panel for cleaner workflow
 - **Visual hierarchy**: Header with gradient green banner and logo
 - **Button styling**: Primary actions use Imaging Plaza green theme colors
+- **Tool Approval Workflow**: Replaced text-based approval (responding "yes"/"sure"/"ok" in chat) with explicit button-based approval. Tool execution now requires clicking a dedicated approval button that appears inline in the chat, improving clarity and preventing accidental tool execution.
+- **Chat Output Structure**: Extended Gradio component outputs from 6 to 9 values to support new approval box and download files components. All event handlers (`submit_btn.click`, `msg_input.submit`, `approve_tool_btn.click`, `clear_btn.click`) now consistently yield/return all 9 outputs: chatbot history, state, 3 charts, state display, downloads, approval box visibility, and approval button label.
+- **Tool Approval Button Position**: Moved approval button from standalone position below input controls to inline group box between chatbot and downloads section. Button now appears as part of "🤖 Tool Recommendation" box with dynamic label showing tool name (e.g., "🚀 Run Lungs Segmentation").
+- **Generic Tool Execution** (`ui/handlers.py`): Replaced tool-specific `execute_lungs_segmentation()` with generic `execute_tool_with_approval()`
+  - Dynamic tool lookup via `get_tool(tool_name)` - NO hardcoded tool names
+  - Dynamic input construction: `tool_config.input_model(**params)` - works for any Pydantic schema
+  - Dynamic tool execution: `tool_config.executor(input_obj)` - calls registered executor
+  - Generic field extraction using registry field mappings - NO tool-specific code
+  - Works for ANY tool that registers in TOOL_REGISTRY
+  - Eliminates need for tool-specific if/else chains
+  - **Architectural benefit**: Adding 70+ tools requires ZERO changes to handlers.py
+- **Dynamic Button Labels** (`ui/components.py`): Button text now uses registry helper functions
+  - `get_tool_display_name()` and `get_tool_icon()` provide consistent labels
+  - Replaces hardcoded string formatting: `tool_name.replace('_', ' ').title()`
+  - Button shows proper display name and icon from ToolConfig
+- **Lazy Tool Loading** (`agent/tools/__init__.py`): Only export registry, not all tools
+  - Prevents loading heavy dependencies (nibabel, pydicom) at package import
+  - Tools imported explicitly where needed (e.g., in `ui/app.py`)
+  - Added `_ensure_tools_registered()` function for explicit bulk loading
+  - Fixes import hangs caused by eager tool loading
+- **Tool Import Location** (`ui/app.py`): Import lungs_segmentation_tool to trigger registration before UI launch
+- **LungsSegmentationOutput Schema**: Enhanced with separate `result_origin` (original format file for download), `result_preview` (PNG preview for display), `metadata_text` (file metadata string), and `api_name` fields. Maintains backward compatibility with `result_path` field (now set to preview when available, else origin).
+- **Download vs Display Separation**: Tool results now distinguish between files for download (`result_origin` - TIFF/NIfTI/DICOM) and inline display (`result_preview` - PNG). Downloads section shows original format files while chat shows converted previews for better compatibility.
+- **HuggingFace Space Client Timeout**: Extended timeout to 300 seconds (5 minutes) via `httpx_kwargs={"timeout": 300.0}` parameter in `_make_gradio_client()` to handle slow Space cold starts and large medical imaging file uploads/downloads without timing out. Includes graceful fallback for older gradio_client versions without `httpx_kwargs` support.
+- **Tool Execution Handler**: `execute_tool_with_approval()` in handlers.py now uses `result_preview` for inline images and `result_origin` for downloadable files, ensuring users get both viewable previews in chat and original format files for download.
 
 ### Removed
 - **VLMToolSelector**: Deleted unused `generator/generator.py` containing VLMToolSelector class. The pydantic-ai agent handles all tool selection directly.
@@ -77,6 +129,10 @@ All notable changes to this project will be documented in this file.
 - **Clear Button**: Disabled during processing to prevent race conditions with ongoing requests.
 - **Alternative Tool Requests**: All recommended tools are now automatically added to the exclusion list (banlist) and properly passed to the agent through AgentState, ensuring follow-up requests like "I would like another tool" correctly return different tools.
 - **History Table**: Follow-up requests (without files) no longer create duplicate history entries. Only primary requests with files are logged to the History table.
+- **Duplicate Function Definition**: Removed duplicate `clear_chat()` function definition in `components.py` that was causing syntax errors.
+- **Text-Based Tool Approval Logic**: Removed legacy `_is_affirmative()` check in `handlers.py` that was conflicting with new button-based approval system. Tool execution now only triggered by explicit button click, preventing ambiguous user messages from unintentionally executing tools.
+- **Gradio Component Compatibility**: Changed `gr.Box` to `gr.Group` for approval button container to ensure compatibility with Gradio 5.42.0 (gr.Box not available in this version).
+- **Component Output Count**: Fixed inconsistent yield statements throughout `handle_chat()` generator function - all yields now consistently return 9 values (chatbot, state, 3 charts, state display, downloads, approval box visibility, button label) to match event handler output declarations.
 
 ## [0.1.3] - 2025-10-22
 

diff --git a/config.yaml b/config.yaml
@@ -2,9 +2,12 @@
 
 # Default/fallback model (used for CLI and initial startup)
 agent_model:
-  name: "gpt-5.1"
-  base_url: null                        # null for default OpenAI endpoint
-  api_key_env: "OPENAI_API_KEY"
+  # name: "gpt-5.1"
+  # base_url: null                        # null for default OpenAI endpoint
+  # api_key_env: "OPENAI_API_KEY"
+  name: "openai/gpt-oss-120b"
+  base_url: "https://inference-rcp.epfl.ch/v1"
+  api_key_env: "EPFL_API_KEY"
 
 # Available models for UI dropdown
 available_models:

diff --git a/src/ai_agent/agent/agent.py b/src/ai_agent/agent/agent.py
@@ -13,12 +13,11 @@
 from ai_agent.generator.prompts import get_agent_system_prompt
 from ai_agent.generator.schema import ToolSelection, Conversation, ConversationStatus
 from ai_agent.utils.config import get_config
-from .models import AgentToolSelection, ToolRunLog
+from .models import AgentToolSelection, ToolRunLog, UsageStats
 from .tools.repo_info_tool import tool_repo_summary, RepoSummaryInput
 from ai_agent.agent.utils import coerce_github_url_or_none
 from .tools.search_tool import tool_search_tools, SearchToolsInput
 from .tools.search_alternative_tool import tool_search_alternative, SearchAlternativeInput
-from .tools.gradio_space_tool import tool_run_example, RunExampleInput
 from .utils import AgentState, limit_tool_calls, cap_prepare
 from ai_agent.utils.image_meta import summarize_image_metadata, detect_ext_token
 
@@ -216,39 +215,6 @@ async def repo_info(ctx: RunContext[AgentState], url: str, tool_name: str | None
     return out.model_dump(mode="python")
 
 
-@agent.tool(retries=0, prepare=cap_prepare)
-@limit_tool_calls("run_example", cap=1)
-async def run_example(
-    ctx: RunContext[AgentState],
-    tool_name: str,
-    endpoint_url: str | None = None,
-    extra_text: str | None = None,
-) -> dict:
-    """
-    Run an example / demo for a given tool via its Gradio space.
-
-    Thin wrapper around tools.gradio_space_tool.tool_run_example().
-    """
-    out = tool_run_example(
-        RunExampleInput(
-            tool_name=tool_name,
-            endpoint_url=endpoint_url,
-            extra_text=extra_text,
-        )
-    )
-    ctx.deps.tool_calls.append(
-        {
-            "tool": "run_example",
-            "tool_name": tool_name,
-            "ran": getattr(out, "ran", False),
-            "endpoint_url": getattr(out, "endpoint_url", endpoint_url),
-            "api_name": getattr(out, "api_name", None),
-            "timestamp": datetime.now().isoformat(),
-        }
-    )
-    return out.model_dump(mode="python")
-
-
 # ---------------------------------------------------------------------------
 # High level entry point: run the agent on (text query + image)
 # ---------------------------------------------------------------------------
@@ -387,7 +353,6 @@ def run_agent(
         agent_instance.tool(search_tools, retries=2, prepare=cap_prepare)
         agent_instance.tool(search_alternative, retries=2, prepare=cap_prepare)
         agent_instance.tool(repo_info, retries=2, prepare=cap_prepare)
-        agent_instance.tool(run_example, retries=0, prepare=cap_prepare)
 
     elif num_choices is not None and num_choices != 3:
         log.info(
@@ -403,7 +368,6 @@ def run_agent(
         agent_instance.tool(search_tools, retries=2, prepare=cap_prepare)
         agent_instance.tool(search_alternative, retries=2, prepare=cap_prepare)
         agent_instance.tool(repo_info, retries=2, prepare=cap_prepare)
-        agent_instance.tool(run_example, retries=0, prepare=cap_prepare)
 
     else:
         log.info(f"♻️  Using global agent (model: {effective_model}, num_choices: {effective_num_choices})")
@@ -490,13 +454,24 @@ def run_agent(
             )
         )
 
-    # ---- 8) Wrap into high-level AgentToolSelection ------------------------
+    # ---- 8) Extract usage statistics if available -------------------------
+    usage_stats = None
+    if hasattr(run_result, "usage") and run_result.usage:
+        usage = run_result.usage()
+        usage_stats = UsageStats(
+            total_tokens=usage.total_tokens,
+            input_tokens=usage.input_tokens,
+            output_tokens=usage.output_tokens,
+        )
+
+    # ---- 9) Wrap into high-level AgentToolSelection ------------------------
     return AgentToolSelection(
         conversation=result.conversation,
         choices=result.choices,
         explanation=result.explanation,
         reason=result.reason,
         tool_calls=tool_logs,
+        usage=usage_stats,
     )
 
 

diff --git a/src/ai_agent/agent/models.py b/src/ai_agent/agent/models.py
@@ -11,8 +11,15 @@ class ToolRunLog(BaseModel):
     error: Optional[str] = None
     timestamp: Optional[str] = None
 
+class UsageStats(BaseModel):
+    """Token usage statistics from the agent."""
+    total_tokens: int = 0
+    input_tokens: int = 0
+    output_tokens: int = 0
+
 class AgentToolSelection(ToolSelection):
     tool_calls: List[ToolRunLog] = Field(default_factory=list)
+    usage: Optional[UsageStats] = None
 
     def to_legacy_dict(self) -> Dict[str, Any]:
         # Map to legacy pipeline result shape expected by UI (subset)
@@ -22,10 +29,12 @@ def to_legacy_dict(self) -> Dict[str, Any]:
             "reason": self.reason,
             "explanation": self.explanation,
             "tool_calls": [c.model_dump(mode="python") for c in self.tool_calls],
+            "usage": self.usage.model_dump(mode="python") if self.usage else None,
         }
 
 __all__ = [
     "AgentToolSelection",
     "ToolRunLog",
+    "UsageStats",
     "CandidateDoc",
 ]
diff --git a/src/ai_agent/agent/tools/__init__.py b/src/ai_agent/agent/tools/__init__.py
@@ -0,0 +1,36 @@
+"""Agent tools package."""
+
+# Only export registry - tools will self-register when imported explicitly
+from .mcp import (
+    TOOL_REGISTRY,
+    get_tool,
+    register_tool,
+    list_tools,
+    ensure_mcp_tools_registered,
+)
+
+# Import tools lazily to avoid loading heavy dependencies at package import
+# Tools should be imported explicitly where needed, e.g.:
+#   from ai_agent.agent.tools.mcp.lungs_segmentation_tool import tool_lungs_segmentation
+
+__all__ = [
+    "TOOL_REGISTRY",
+    "get_tool",
+    "register_tool",
+    "list_tools",
+    "ensure_tools_registered",
+]
+
+
+def ensure_tools_registered():
+    """
+    Import all tools to trigger their registration.
+    Call this once at app startup.
+    """
+    from .search_tool import tool_search_tools
+    from .search_alternative_tool import tool_search_alternative
+    from .repo_info_tool import tool_repo_summary
+    from .gradio_space_tool import tool_run_example
+
+    # Import MCP tools
+    ensure_mcp_tools_registered()
diff --git a/src/ai_agent/agent/tools/mcp/__init__.py b/src/ai_agent/agent/tools/mcp/__init__.py
@@ -0,0 +1,51 @@
+"""
+MCP (Model Context Protocol) tools package.
+
+This package contains registered imaging tools that require approval
+and follow the tool registry pattern.
+"""
+
+from .registry import (
+    TOOL_REGISTRY,
+    CATALOG_NAME_TO_TOOL,
+    get_tool,
+    register_tool,
+    list_tools,
+    get_tool_display_name,
+    get_tool_icon,
+    extract_preview,
+    extract_downloads,
+    extract_metadata,
+    extract_output_field,
+    ToolConfig,
+)
+
+from .base import BaseToolInput, BaseToolOutput, ImageToolInput
+
+__all__ = [
+    # Registry
+    "TOOL_REGISTRY",
+    "CATALOG_NAME_TO_TOOL",
+    "get_tool",
+    "register_tool",
+    "list_tools",
+    "get_tool_display_name",
+    "get_tool_icon",
+    "extract_preview",
+    "extract_downloads",
+    "extract_metadata",
+    "extract_output_field",
+    "ToolConfig",
+    # Base models
+    "BaseToolInput",
+    "BaseToolOutput",
+    "ImageToolInput",
+]
+
+
+def ensure_mcp_tools_registered():
+    """
+    Import all MCP tools to trigger their registration.
+    Call this once at app startup.
+    """
+    from . import lungs_segmentation_tool