Feature/tool retrieval #16
Merged
Pull request overview
This PR modernizes the retrieval stack and agent interface: retrieval is now embedding-based with retries and image-aware hints, the agent tools have been refactored to inject file context from state, and imports/configuration are standardized.
Changes:
- Introduces a new retrieval pipeline API (`retrieve`, `retrieve_no_rerank`, retry logic, image metadata hints) plus a comprehensive test suite and README documenting retrieval behavior.
- Refactors the agent's tools and `run_agent` entrypoint to route `image_paths` and format hints through `AgentState`, adds a `search_alternative` tool, and updates prompts and schemas for the new conversation and retrieval semantics.
- Standardizes `ai_agent.` import paths, updates configuration (including model selection via `config.yaml`), and tweaks catalog representation (`SoftwareDoc`) and documentation (`CHANGELOG.md`).
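
The retry logic mentioned in the first bullet can be pictured with a small sketch. This is not the PR's implementation: the function name `retrieve_with_retries`, the `threshold`/`step` parameters, and the toy search function are all hypothetical, and "broadening" is assumed here to mean relaxing a similarity threshold. Only `min_results` and `max_retries` are names that appear in the PR summary.

```python
from typing import Callable, List

# Hypothetical toy similarity scores standing in for an embedding index.
TOY_SCORES = {"tool_a": 0.9, "tool_b": 0.6, "tool_c": 0.3}


def toy_search(query: str, threshold: float) -> List[str]:
    """Return every toy tool whose score meets the threshold (query is ignored)."""
    return [name for name, score in TOY_SCORES.items() if score >= threshold]


def retrieve_with_retries(
    query: str,
    search: Callable[[str, float], List[str]],
    min_results: int = 2,
    max_retries: int = 2,
    threshold: float = 0.75,
    step: float = 0.25,
) -> List[str]:
    """Search once, then relax the threshold until enough results are found
    or the retry budget is spent (assumed broadening strategy)."""
    results = search(query, threshold)
    for _ in range(max_retries):
        if len(results) >= min_results:
            break
        threshold = max(0.0, threshold - step)  # broaden the search
        results = search(query, threshold)
    return results


found = retrieve_with_retries("segment cells", toy_search, min_results=3)
```

With the toy scores, the first pass finds one tool, and each retry lowers the threshold until three tools are returned or the retries run out.
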
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| tests/test_retrieval_pipeline.py | Adds an extensive pytest suite for RAGImagingPipeline covering medical/non-medical queries, edge cases, metadata hints, and retry behavior (note: current sys.path setup doesn’t actually add src to the import path, so import ai_agent... can still fail). |
| tests/README_RETRIEVAL_TESTS.md | Documents the new retrieval test suite, its organization, runtime, and how it validates the simplified embedding-based pipeline. |
| src/ai_agent/utils/utils.py | Fixes SoftwareDoc import to use the standardized ai_agent.retriever.software_doc path, keeping _best_runnable_link intact. |
| src/ai_agent/utils/tags.py | Simplifies supported control tags by dropping [NO_RERANK] and its helper, leaving [REFINE] and exclusion tags consistent with the new behavior. |
| src/ai_agent/utils/previews.py | Points image metadata and IO imports to ai_agent.utils.*, keeping preview-building behavior unchanged. |
| src/ai_agent/ui/app.py | Updates handle_message to call the new run_agent(task, image_paths=..., ...) signature so the agent derives metadata and formats internally. |
| src/ai_agent/retriever/vector_index.py | Adds minor whitespace for readability; load/save and indexing semantics are unchanged. |
| src/ai_agent/retriever/software_doc.py | Adds a validator to coerce list-valued name fields and rewrites to_retrieval_text() to emphasize tasks/anatomy/modality and embed format/task metadata more retrieval-optimally. |
| src/ai_agent/generator/schema.py | Reorders ConversationStatus/Conversation above ToolSelection to fix Pydantic forward-ref issues and keeps the normalization logic for conversation status and reasons. |
| src/ai_agent/generator/prompts.py | Rewrites the selector and agent system prompts for the new semantic retrieval + alternative search flow and parameterizes the maximum number of tool choices via NUM_CHOICES (default 5), though this default now differs from the UI’s fallback of 3. |
| src/ai_agent/api/pipeline.py | Refactors RAGImagingPipeline to accept min_results/max_retries, adds image-hint construction, implements retry/broadening logic in retrieve_no_rerank, and adds a high-level retrieve that applies CrossEncoder reranking. |
| src/ai_agent/agent/utils.py | Extends AgentState to track image_paths and original_formats, and provides quota/prepare helpers to limit tool usage and hide disabled tools. |
| src/ai_agent/agent/tools/utils.py | Updates imports to ai_agent.* and keeps a lazily-initialized singleton RAGImagingPipeline for use by agent tools. |
| src/ai_agent/agent/tools/search_tool.py | Adapts the search tool to pass image_paths and normalized original_formats into the pipeline and switch from manual rerank handling to the new pipeline.retrieve() (docstring still references the removed dictionary-based query expansion). |
| src/ai_agent/agent/tools/search_alternative_tool.py | Adds a new search_alternative tool that constructs an alternative query with soft format tokens and calls pipeline.retrieve() to support explicit agent-driven alternative searches. |
| src/ai_agent/agent/tools/rerank_tool.py | Aligns reranking with the new pipeline by ensuring the original query (without similarity expansion) is used as the reranker input. |
| src/ai_agent/agent/tools/gradio_space_tool.py | Fixes imports to ai_agent.utils.* and preserves the Gradio-based run_example functionality for running remote demos on user images. |
| src/ai_agent/agent/models.py | Simplifies ToolRunLog (removing the unused summary field) and exposes AgentToolSelection as a wrapper around ToolSelection with attached tool call logs. |
| src/ai_agent/agent/agent.py | Overhauls the agent: wires in the new prompts, defines adapters for search_tools, search_alternative, repo_info, and run_example, and rewrites run_agent to require image_paths, derive metadata/format hints, pass them via AgentState, and log tool usage (note: _demo_pipeline is now an unused but expensive global, and run_agent recomputes metadata already produced by _build_preview_for_vlm). |
| config.yaml | Updates the (commented) default agent model name to “gpt-5.1” and keeps the configurable base URL and API key environment variable wiring. |
| CHANGELOG.md | Records the new similarity-based query expansion, retry logic, alternative search tool, configuration module, and the schema forward-ref fix, matching the behavioral changes in this PR. |
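
The note on tests/test_retrieval_pipeline.py in the table above says the test's sys.path setup does not actually add src to the import path. One possible fix is sketched below, assuming the layout implied by the table (a `tests/` directory next to `src/` at the repository root). The helper name `add_src_to_path` is hypothetical and not part of the PR.

```python
import sys
from pathlib import Path


def add_src_to_path(test_file: str) -> str:
    """Put <repo>/src on sys.path, given the path of a file in <repo>/tests.

    Assumes tests/ and src/ are siblings at the repository root.
    """
    src_dir = Path(test_file).resolve().parent.parent / "src"
    src_str = str(src_dir)
    if src_str not in sys.path:
        # Insert at the front so the local package wins over installed copies.
        sys.path.insert(0, src_str)
    return src_str


# In a test module this would typically be called as add_src_to_path(__file__)
# before any `import ai_agent...` statement.
example = add_src_to_path("/tmp/repo/tests/test_retrieval_pipeline.py")
```

Installing the package in editable mode would avoid path manipulation altogether. The sketch only illustrates the path arithmetic the review comment refers to.
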
8363b48 to 65d1a76
Co-authored-by: qchapp <74377782+qchapp@users.noreply.github.com>
Eliminate redundant metadata extraction in agent pipeline
…aging-plaza-ai-agent into feature/tool-retrieval
…added some tests for multimodalities
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Hi @caviri, could you review this PR for me? I have already reviewed it and am waiting for your confirmation before merging. Thanks!
This pull request introduces major improvements to the agent's retrieval capabilities, restructures the agent's tool interface, and standardizes import paths. The most significant changes include switching to embedding-based similarity for query expansion, adding iterative retrieval with retries and a new agent tool for alternative searches, and refactoring the agent's tool adapters to better handle context and metadata. Several bug fixes and configuration enhancements are also included.
Retrieval and Query Expansion Enhancements:
- Added a `search_alternative` agent tool, allowing the agent to explicitly request up to 3 alternative searches per conversation, enabling agent-driven iterative refinement. [1] [2]
Agent and Tool Interface Refactoring:
- Tool adapters now inject file context (image paths and format hints) from `AgentState`, ensuring tools have the necessary context without requiring the LLM to reason about file paths. The `run_agent` function now requires explicit image paths and automatically generates metadata/context for the LLM and tools. [1] [2]
- […] `repo_info`) and improved tool call logging structure. [1] [2]
Configuration and Import Path Improvements:
- Standardized imports to use the `ai_agent.` prefix for consistency. [1] [2] [3] [4]
- Updated `config.yaml` to allow easy switching between model versions (e.g., "gpt-5.1" and "gpt-4o").
- Added a configuration module (`utils/config.py`) using Pydantic for type-safe config loading and validation.
Bug Fixes:
- Fixed the Pydantic forward-reference issue in `schema.py`.
Other Notable Changes:
These changes collectively make the agent more robust, context-aware, and capable of sophisticated, iterative tool retrieval and selection.
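
The cap of three alternative searches per conversation described above could be enforced with a small per-conversation counter. The sketch below is only an illustration of that idea under stated assumptions: the class name `ToolQuota` and its methods are hypothetical and are not the quota/prepare helpers the PR adds to `AgentState`.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToolQuota:
    """Hypothetical per-conversation usage limits, keyed by tool name."""

    limits: Dict[str, int]
    used: Dict[str, int] = field(default_factory=dict)

    def try_consume(self, tool: str) -> bool:
        """Record one call to `tool`; return False if its quota is spent."""
        limit = self.limits.get(tool)  # None means the tool is unlimited
        count = self.used.get(tool, 0)
        if limit is not None and count >= limit:
            return False
        self.used[tool] = count + 1
        return True

    def available_tools(self, all_tools: List[str]) -> List[str]:
        """Hide tools whose quota is exhausted so the LLM is not offered them."""
        return [
            tool
            for tool in all_tools
            if self.limits.get(tool) is None
            or self.used.get(tool, 0) < self.limits[tool]
        ]


quota = ToolQuota(limits={"search_alternative": 3})
calls = [quota.try_consume("search_alternative") for _ in range(4)]
visible = quota.available_tools(["search_tools", "search_alternative"])
```

Here the fourth call is refused, and `search_alternative` no longer appears among the tools offered for the rest of the conversation.
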