
Feature/speeding #27

Merged
qchapp merged 10 commits into develop from feature/speeding on Apr 15, 2026
Conversation

Member

@qchapp qchapp commented Apr 11, 2026

This pull request introduces significant improvements to the agent's retrieval, caching, and configuration systems, with a focus on performance, observability, and flexibility. The most notable changes include the addition of a pre-embedding step for the software catalog, remote embedding/reranking support, dynamic agent instance caching, and a new parallel repository verification tool. Configuration files and documentation have been updated to reflect these enhancements.

Retrieval and Catalog Embedding Enhancements

  • Added startup pre-embedding of the software catalog when the FAISS index is empty, controlled by the EMBED_CATALOG_ON_START environment variable. This reduces latency on user requests by ensuring embeddings are ready at startup. [1] [2] [3] [4]
  • Introduced configuration and support for remote embedding and reranking using the EPFL OpenAI-compatible endpoint, with credentials managed via the new EPFL_API_KEY_EMBEDDER variable. Retrieval stack settings can now be specified in config.yaml under the retrieval block. [1] [2] [3]
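
The startup behaviour described above can be sketched as follows. This is a minimal illustration, not the PR's code: only the `EMBED_CATALOG_ON_START` variable name comes from the description, while `env_flag`, `TinyIndex`, and `maybe_embed_catalog` are hypothetical names, and the set of accepted truthy strings is an assumption.

```python
import os

# Assumed spellings for boolean env values; the real helper may accept others.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def env_flag(name: str, default: bool = False) -> bool:
    """Parse a boolean environment variable, falling back to `default`."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    return default  # unset, empty, or unrecognised values keep the default


class TinyIndex:
    """Stand-in for a vector index (hypothetical, not the FAISS wrapper)."""

    def __init__(self) -> None:
        self.vectors: list[list[float]] = []

    def __len__(self) -> int:
        return len(self.vectors)

    def add(self, vectors: list[list[float]]) -> None:
        self.vectors.extend(vectors)


def maybe_embed_catalog(index, catalog, embed, enabled: bool) -> int:
    """Embed the catalog once at startup, only when enabled and the index is empty."""
    if not enabled or len(index) > 0:
        return 0
    index.add(embed(catalog))
    return len(catalog)
```

A caller would pass `enabled=env_flag("EMBED_CATALOG_ON_START")` during startup, so a populated index is never re-embedded.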

Agent and Tool Performance Improvements

  • Implemented dynamic caching of Agent instances keyed by model and endpoint configuration, reducing overhead for repeated requests with the same settings.
  • Added detailed per-tool and request-level latency logging, including timing for retrieval, alternative search, and metadata extraction, to improve runtime observability. [1] [2] [3] [4] [5] [6]

Repository Verification Refactor

  • Replaced the legacy repo_info tool with a new repo_info_batch tool that fetches multiple repository summaries in parallel, improving efficiency and simplifying code paths. Single-repo verification now also uses the batch tool for consistency. [1] [2]
  • Added in-memory TTL caching and in-flight deduplication for repository info requests, reducing redundant remote fetches during iterative agent runs.
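
The combination of TTL caching, in-flight deduplication, and parallel batch fetching can be sketched as below. This is an illustration under assumptions, not the code in `repo_info_tool.py`: `RepoInfoCache`, its `fetch` callback, the default TTL, and the worker count are all hypothetical; only the `repo_info_batch` name comes from the PR.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor


class RepoInfoCache:
    """TTL cache that also shares one fetch among concurrent callers."""

    def __init__(self, fetch, ttl_seconds: float = 300.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._values = {}    # url -> (expires_at, summary)
        self._inflight = {}  # url -> Future shared by waiting callers

    def get(self, url: str):
        with self._lock:
            hit = self._values.get(url)
            if hit is not None and hit[0] > self._clock():
                return hit[1]  # fresh cache entry
            future = self._inflight.get(url)
            owner = future is None
            if owner:
                future = Future()
                self._inflight[url] = future
        if not owner:
            return future.result()  # wait for the caller already fetching
        try:
            summary = self._fetch(url)
        except Exception as exc:
            with self._lock:
                self._inflight.pop(url, None)
            future.set_exception(exc)
            raise
        with self._lock:
            self._values[url] = (self._clock() + self._ttl, summary)
            self._inflight.pop(url, None)
        future.set_result(summary)
        return summary


def repo_info_batch(cache: RepoInfoCache, urls, max_workers: int = 4) -> dict:
    """Fetch summaries for several repositories in parallel, one call per unique URL."""
    unique = list(dict.fromkeys(urls))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(cache.get, unique))
    return dict(zip(unique, results))
```

Routing single-repository lookups through the same batch function, as the PR describes, means one code path benefits from both the cache and the deduplication.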

Configuration and Documentation Updates

  • Updated .env.dist, README.md, and CHANGELOG.md to document new environment variables, configuration options, and the retrieval stack structure. [1] [2] [3] [4] [5] [6] [7]

Bug Fixes and Robustness

  • Improved structured output validation by increasing retry limits and making parsing more tolerant, reducing failures on custom endpoints.
  • Fixed a startup refresh regression in the CLI and improved FAISS rebuild logic on embedder changes.
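
As an illustration of what "more tolerant" parsing can mean, the sketch below coerces loosely formatted model output into typed fields. The field names echo the rank/accuracy/reason fields mentioned in the review summary, but the coercion rules (for example, treating values above 1 as percentages) and the helper names are assumptions, not the validators added in `schema.py`.

```python
import re


def coerce_rank(value) -> int:
    """Accept 2, "2", "#2", or "2." and return an integer rank."""
    if isinstance(value, bool):
        raise ValueError("rank must not be a boolean")
    if isinstance(value, int):
        return value
    match = re.search(r"\d+", str(value))
    if match is None:
        raise ValueError(f"cannot parse rank from {value!r}")
    return int(match.group())


def coerce_accuracy(value) -> float:
    """Accept 0.9, "90%", or "90" and return a score clamped to [0, 1]."""
    text = str(value).strip()
    is_percent = text.endswith("%")
    number = float(text.rstrip("%").strip())
    if is_percent or number > 1.0:  # assumption: values above 1 are percentages
        number /= 100.0
    return min(max(number, 0.0), 1.0)


def normalize_candidate(raw: dict) -> dict:
    """Return a cleaned copy of one structured-output record."""
    return {
        "name": str(raw.get("name", "")).strip(),
        "rank": coerce_rank(raw.get("rank")),
        "accuracy": coerce_accuracy(raw.get("accuracy")),
        "reason": str(raw.get("reason") or "").strip(),
    }
```

Coercing before validation means fewer responses are rejected outright, which in turn reduces how often the retry budget is consumed.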

These changes collectively make the agent more robust, configurable, and performant, especially in environments using remote retrieval components or handling high request volumes.

Copilot AI review requested due to automatic review settings April 11, 2026 14:59
Contributor

Copilot AI left a comment


Pull request overview

This PR focuses on speeding up the imaging-tool RAG agent by adding remote embedder/reranker support, multiple in-process caches (agent instances, repo summaries, previews, metadata), and improved observability/configuration so repeated requests are cheaper and easier to debug.

Changes:

  • Added config-driven retrieval backends (remote/local embedder + reranker) and optional startup catalog pre-embedding.
  • Introduced multiple in-memory caches (preview generation, image metadata, repo info w/ TTL + in-flight dedup, dynamic agent instance reuse) plus latency logging.
  • Refactored repo verification toward batched/parallel behavior (repo_info_batch) and tightened query sanitization to reduce retrieval drift.
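
One way to picture the query sanitization mentioned above: drop URLs, names of tools already in the catalog, and filler words before the text reaches the retriever. The sketch below is hypothetical; `sanitize_query`, the `LOW_SIGNAL` word list, and the tokenization rule are assumptions and may differ from the PR's `sanitize_retrieval_query()`.

```python
import re

# Assumed filler vocabulary; the real list is not shown in the review.
LOW_SIGNAL = frozenset({
    "a", "an", "the", "for", "like", "find", "please",
    "tool", "tools", "repo", "repository", "github",
})
_URL = re.compile(r"https?://\S+")
_TOKEN = re.compile(r"[a-z0-9]+(?:[-+][a-z0-9]+)*")


def sanitize_query(query: str, tool_names) -> str:
    """Strip URLs, known tool names, and filler words from a retrieval query."""
    blocked = LOW_SIGNAL | {name.lower() for name in tool_names}
    text = _URL.sub(" ", query.lower())
    kept = [tok for tok in _TOKEN.findall(text) if tok not in blocked]
    # Fall back to the original query rather than searching for nothing.
    return " ".join(kept) or query.strip()
```

For example, `sanitize_query("Find a tool like Cellpose for nuclei segmentation", ["Cellpose"])` keeps only the task words, so the retriever is not pulled toward the named repository.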

Reviewed changes

Copilot reviewed 36 out of 36 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| tests/test_reranker_remote.py | Adds unit tests for remote reranker request/response handling. |
| tests/test_previews_cache.py | Adds tests for preview cache TTL/capacity + resize behavior. |
| tests/test_pipeline_catalog_reader.py | Tests catalog reader accepting JSONL + JSON array. |
| tests/test_gpt4o_vision.py | Moves smoke-test execution under main() / __main__ guard. |
| tests/test_epfl_vision.py | Refactors EPFL vision smoke test into main() returning exit codes. |
| tests/test_embedder_remote.py | Adds unit tests for remote embeddings client behavior and dim probing. |
| tests/test_deepwiki_repo_info.py | Adds tests for repo info cache hit + in-flight dedup + clear helper. |
| src/ai_agent/utils/utils.py | Adds _env_flag() helper for robust boolean env parsing. |
| src/ai_agent/utils/previews.py | Adds preview resizing + TTL cache keyed by file fingerprints; removes annotation overlays. |
| src/ai_agent/utils/image_meta.py | Adds in-process metadata cache keyed by (path, mtime_ns, size). |
| src/ai_agent/utils/image_io.py | Removes unused variables while stacking DICOM frames. |
| src/ai_agent/utils/config.py | Adds get_retrieval_config() for retrieval stack configuration. |
| src/ai_agent/ui/handlers.py | Imports run_agent at module scope; forwards precomputed image_metadata into agent call. |
| src/ai_agent/ui/app.py | Adds ensure_logging_initialized() and pre-checks for port fallback availability. |
| src/ai_agent/retriever/vector_index.py | Switches to absolute imports for embedder/doc types. |
| src/ai_agent/retriever/utils.py | Adds device auto-resolution helper for local ST backends (cuda/mps/cpu). |
| src/ai_agent/retriever/text_embedder.py | Reworks embedder into remote OpenAI-compatible client with optional local backend. |
| src/ai_agent/retriever/reranker.py | Reworks reranker into remote OpenAI-compatible client with optional local backend. |
| src/ai_agent/generator/schema.py | Adds validators/coercions for status/rank/accuracy/reason robustness. |
| src/ai_agent/generator/prompts.py | Updates agent tooling rules prompt to use repo_info_batch and constrain tool usage. |
| src/ai_agent/cli.py | Fixes refresh loop timing/unpacking; ensures logging init in run_chat(). |
| src/ai_agent/catalog/sync.py | Adds startup “skip sync if fresh” and robust FAISS rebuild-on-incompatibility behavior. |
| src/ai_agent/api/pipeline.py | Adds retrieval stack config loading + optional startup catalog embedding when index empty. |
| src/ai_agent/agent/tools/utils.py | Cleans typing import; (context) provides lightweight catalog loading helper used by tools. |
| src/ai_agent/agent/tools/search_tool.py | Adds retrieval query sanitization using catalog tool names. |
| src/ai_agent/agent/tools/search_alternative_tool.py | Adds retrieval query sanitization for alternative searches. |
| src/ai_agent/agent/tools/repo_info_tool.py | Adds TTL cache + in-flight dedup for repo summaries. |
| src/ai_agent/agent/tools/query_utils.py | Adds sanitize_retrieval_query() to remove repo-drift/low-signal tokens. |
| src/ai_agent/agent/tools/mcp/__init__.py | Uses import_module for MCP tool registration. |
| src/ai_agent/agent/tools/deepwiki_tool.py | Makes DeepWiki timeout configurable via env. |
| src/ai_agent/agent/tools/__init__.py | Uses import_module for tool registration imports. |
| src/ai_agent/agent/agent.py | Adds dynamic Agent instance caching, tool timing logs, and repo_info_batch. |
| README.md | Documents new env vars and retrieval config block example. |
| config.yaml | Adds retrieval.embedder / retrieval.reranker configuration blocks. |
| CHANGELOG.md | Documents new performance/config changes (but headings need cleanup). |
| .env.dist | Adds new env vars (EPFL_API_KEY_EMBEDDER, AGENT_OUTPUT_RETRIES, EMBED_CATALOG_ON_START). |
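
The `(path, mtime_ns, size)` cache key listed for `image_meta.py` can be illustrated with a short sketch. The key shape comes from the summary above; `file_fingerprint`, `cached_metadata`, and the `extract` callback are hypothetical names, and the real cache may add eviction or locking that is omitted here.

```python
import os
from pathlib import Path

# Module-level cache: fingerprint -> extracted metadata.
_META_CACHE: dict = {}


def file_fingerprint(path) -> tuple:
    """Build a key that changes whenever the file is rewritten or resized."""
    stat = os.stat(path)
    return (str(Path(path).resolve()), stat.st_mtime_ns, stat.st_size)


def cached_metadata(path, extract):
    """Run `extract(path)` once per file version and reuse the result afterwards."""
    key = file_fingerprint(path)
    if key not in _META_CACHE:
        _META_CACHE[key] = extract(path)
    return _META_CACHE[key]
```

Because the key includes the modification time and size, an edited file produces a new key and is re-read without any explicit invalidation call.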

Comment thread src/ai_agent/utils/config.py Outdated
Comment thread src/ai_agent/utils/previews.py Outdated
Comment thread src/ai_agent/agent/agent.py
Comment thread src/ai_agent/agent/tools/search_tool.py
Comment thread CHANGELOG.md
Comment thread README.md Outdated
Comment thread src/ai_agent/agent/tools/search_alternative_tool.py Outdated
Comment thread src/ai_agent/catalog/sync.py Outdated
Comment thread src/ai_agent/api/pipeline.py Outdated
qchapp and others added 3 commits April 11, 2026 17:07
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment


Pull request overview

This PR improves the imaging-agent stack’s performance and configurability by moving retrieval to configurable local/remote backends, adding multiple in-process caches (previews, image metadata, repo summaries, agent instances), and updating the agent prompt/tooling to use batched repository verification.

Changes:

  • Add config-driven remote/local embedder + reranker support (EPFL OpenAI-compatible endpoints) and optional startup catalog pre-embedding.
  • Introduce caching for VLM previews, image metadata, repo summaries (TTL + in-flight dedup), and dynamic Agent instances; add latency logging for observability.
  • Replace single-repo verification with repo_info_batch, plus related prompt/docs/test updates.
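
The latency logging mentioned in the list above could be implemented with a small timing context manager, sketched here. The logger name, the `timed` helper, and the optional `sink` dictionary are assumptions for illustration; the PR's actual log format is not shown in this conversation.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("agent.latency")  # hypothetical logger name


@contextmanager
def timed(label: str, sink=None):
    """Log how long the wrapped block took; optionally record it in `sink`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        logger.info("%s took %.1f ms", label, elapsed_ms)
        if sink is not None:
            sink[label] = elapsed_ms
```

Wrapping each tool call, for example `with timed("retrieval", timings): ...`, yields per-tool numbers, and summing the `timings` dictionary gives a request-level view.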

Reviewed changes

Copilot reviewed 36 out of 36 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| tests/test_reranker_remote.py | Unit tests for remote reranker request/response handling. |
| tests/test_previews_cache.py | Unit tests for preview cache behavior (TTL, capacity, missing files). |
| tests/test_pipeline_catalog_reader.py | Tests for catalog reader supporting JSONL + JSON-array formats. |
| tests/test_gpt4o_vision.py | Refactors smoke-test script to avoid import-time side effects. |
| tests/test_epfl_vision.py | Refactors EPFL vision smoke-test into a main() entrypoint. |
| tests/test_embedder_remote.py | Unit tests for remote embedder request/response handling and dim probing. |
| tests/test_deepwiki_repo_info.py | Adds tests for repo info TTL cache and in-flight dedup helpers. |
| src/ai_agent/utils/utils.py | Adds _env_flag helper for robust boolean env parsing. |
| src/ai_agent/utils/previews.py | Adds preview resizing + in-memory preview cache keyed by file fingerprints. |
| src/ai_agent/utils/image_meta.py | Adds in-process metadata caching keyed by (path, mtime_ns, size). |
| src/ai_agent/utils/image_io.py | Removes unused variables in DICOM series stacking path. |
| src/ai_agent/utils/config.py | Adds get_retrieval_config() to expose retrieval settings from config.yaml. |
| src/ai_agent/ui/handlers.py | Passes precomputed image_metadata into run_agent to avoid duplicate work. |
| src/ai_agent/ui/app.py | Adds ensure_logging_initialized() and improves port fallback selection. |
| src/ai_agent/retriever/vector_index.py | Switches to absolute imports for embedder/doc types. |
| src/ai_agent/retriever/utils.py | Adds _resolve_local_device() helper (cuda/mps/cpu selection). |
| src/ai_agent/retriever/text_embedder.py | Implements remote embeddings client + optional local backend under legacy class name. |
| src/ai_agent/retriever/reranker.py | Implements remote reranker client + optional local backend and payload parsing. |
| src/ai_agent/generator/schema.py | Adds coercion validators to tolerate structured-output drift (status/rank/accuracy/reason). |
| src/ai_agent/generator/prompts.py | Updates agent tool rules to use repo_info_batch and one-search-per-run guidance. |
| src/ai_agent/cli.py | Fixes UI function unpacking, ensures logging early, delays first background refresh. |
| src/ai_agent/catalog/sync.py | Adds “skip if fresh” support and rebuilds FAISS when artifacts are incompatible. |
| src/ai_agent/api/pipeline.py | Loads retrieval backends from config; adds startup catalog embedding and catalog reader. |
| src/ai_agent/agent/tools/utils.py | Caches catalog tool names for query sanitization support. |
| src/ai_agent/agent/tools/search_tool.py | Sanitizes retrieval queries using known tool names before retrieval. |
| src/ai_agent/agent/tools/search_alternative_tool.py | Sanitizes alternative queries similarly to reduce repo/tool-name drift. |
| src/ai_agent/agent/tools/repo_info_tool.py | Adds repo-summary TTL cache and in-flight deduplication. |
| src/ai_agent/agent/tools/query_utils.py | Adds sanitize_retrieval_query() to strip repo-drift and low-signal terms. |
| src/ai_agent/agent/tools/mcp/__init__.py | Switches MCP tool registration to dynamic import. |
| src/ai_agent/agent/tools/deepwiki_tool.py | Makes DeepWiki timeout configurable via env var. |
| src/ai_agent/agent/tools/__init__.py | Switches tool registration to dynamic imports. |
| src/ai_agent/agent/agent.py | Adds agent instance caching, latency logging, and repo_info_batch tool adapter. |
| README.md | Documents new env vars and retrieval configuration blocks. |
| config.yaml | Adds retrieval.embedder / retrieval.reranker blocks for config-driven retrieval. |
| CHANGELOG.md | Documents caching, retrieval backend changes, and startup embedding behavior. |
| .env.dist | Adds new env var placeholders and knobs (embedder key, output retries, startup embed). |
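
The catalog reader that accepts both JSONL and JSON-array input, as exercised by `tests/test_pipeline_catalog_reader.py`, might work along these lines. `read_catalog` is a hypothetical name, and detecting the format by the first non-blank character is an assumed approach rather than the one in `pipeline.py`.

```python
import json


def read_catalog(text: str) -> list:
    """Parse catalog text given either as a JSON array or as one JSON object per line."""
    stripped = text.strip()
    if not stripped:
        return []
    if stripped.startswith("["):
        # JSON-array form: keep only object entries.
        return [item for item in json.loads(stripped) if isinstance(item, dict)]
    # JSONL form: one record per non-blank line.
    return [json.loads(line) for line in stripped.splitlines() if line.strip()]
```

Both formats yield the same list of records, so downstream embedding code does not need to know which one was on disk.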

Comment thread src/ai_agent/agent/tools/repo_info_tool.py Outdated
Comment thread src/ai_agent/utils/image_meta.py
Comment thread src/ai_agent/generator/prompts.py Outdated
Comment thread src/ai_agent/agent/tools/search_tool.py Outdated
Comment thread src/ai_agent/agent/tools/search_alternative_tool.py Outdated
Comment thread CHANGELOG.md
Comment thread src/ai_agent/catalog/sync.py
Comment thread src/ai_agent/agent/tools/deepwiki_tool.py
qchapp and others added 2 commits April 11, 2026 17:33
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 36 out of 36 changed files in this pull request and generated 4 comments.

Comment thread src/ai_agent/ui/app.py
Comment thread src/ai_agent/ui/handlers.py
Comment thread config.yaml
Comment thread src/ai_agent/generator/prompts.py
@qchapp qchapp requested a review from caviri April 11, 2026 19:28
@qchapp qchapp merged commit 98af8ec into develop Apr 15, 2026
9 of 10 checks passed
@qchapp qchapp deleted the feature/speeding branch April 15, 2026 12:11
3 participants