Feature/graphdb#8
Merged
Pull Request Overview
This PR refactors the retriever module architecture by splitting the monolithic embedders.py into focused, single-responsibility modules and adds GraphDB catalog synchronization functionality.
- Splits `retriever/embedders.py` into separate modules: `software_doc.py`, `text_embedder.py`, `reranker.py`, and `vector_index.py` for better maintainability
- Introduces GraphDB integration with automatic catalog syncing via new `catalog/sync.py` module
- Updates `RAGImagingPipeline` initialization to load from persisted index instead of requiring docs parameter
Reviewed Changes
Copilot reviewed 21 out of 22 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| src/ai_agent/retriever/software_doc.py | Extracted SoftwareDoc schema with field validation and format token extraction |
| src/ai_agent/retriever/text_embedder.py | Extracted text embedding interface and BGE-M3 implementation |
| src/ai_agent/retriever/reranker.py | Extracted CrossEncoder reranking logic |
| src/ai_agent/retriever/vector_index.py | Extracted FAISS index management with fingerprinting and persistence |
| src/ai_agent/catalog/sync.py | New module for GraphDB SPARQL queries and catalog synchronization |
| src/ai_agent/cli.py | Extended CLI with sync command and background catalog refresh |
| src/ai_agent/api/pipeline.py | Updated to load index from disk; removed docs parameter from constructor |
| src/ai_agent/ui/app.py | Updated to load docs from FAISS metadata after index initialization |
| tests/full_test.py | Updated imports and monkeypatch targets for new module structure |
| src/ai_agent/utils/full_processing.py | Refactored as reusable function; added logging |
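
The fingerprinting noted for `vector_index.py` is not shown in this overview. As a rough sketch of one way such a check could work, the snippet below hashes the document list and compares it with a stored digest to decide whether a persisted index is stale. The function names (`fingerprint`, `needs_rebuild`, `save_fingerprint`) and the fingerprint file are assumptions for illustration, not the PR's actual API, and the FAISS calls themselves are left out.

```python
import hashlib
import json
from pathlib import Path


def fingerprint(docs):
    """Return a stable SHA-256 hex digest for a list of JSON-serializable docs."""
    # sort_keys makes the digest independent of dict key order;
    # compact separators keep the canonical form free of whitespace noise.
    canonical = json.dumps(
        docs, sort_keys=True, ensure_ascii=False, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_rebuild(docs, fingerprint_path):
    """True when no fingerprint is stored or the stored one differs from docs."""
    path = Path(fingerprint_path)
    if not path.exists():
        return True
    return path.read_text(encoding="utf-8").strip() != fingerprint(docs)


def save_fingerprint(docs, fingerprint_path):
    """Persist the digest next to the index (hypothetical file layout)."""
    Path(fingerprint_path).write_text(fingerprint(docs), encoding="utf-8")
```

With a check like this, the index would only be re-embedded when the catalog content changes; otherwise the persisted copy can be loaded directly.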
Comments suppressed due to low confidence (1)
src/ai_agent/utils/full_processing.py:103
- Test is always true, because of this condition.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
The catalog is now loaded from the GraphDB endpoint instead of a static JSONL catalog.
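
For readers unfamiliar with the flow, here is a minimal sketch of pulling a catalog from a SPARQL endpoint and flattening it into JSONL-style records, using only the standard library. It assumes the endpoint returns the standard SPARQL 1.1 JSON results format. The query, the `schema.org` predicates, and the function names are hypothetical and not taken from `catalog/sync.py`.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical query: the real catalog vocabulary is not shown in this PR.
CATALOG_QUERY = """
SELECT ?name ?description WHERE {
  ?tool <http://schema.org/name> ?name .
  OPTIONAL { ?tool <http://schema.org/description> ?description . }
}
"""


def fetch_bindings(endpoint_url, query=CATALOG_QUERY, timeout=30):
    """POST a SPARQL query (form-encoded) and return the parsed JSON results."""
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        endpoint_url,
        data=data,  # passing data makes urllib issue a POST
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def bindings_to_records(payload):
    """Flatten SPARQL JSON bindings into plain dicts, one per result row."""
    records = []
    for binding in payload["results"]["bindings"]:
        # Unbound OPTIONAL variables are simply absent from the binding.
        records.append({var: cell["value"] for var, cell in binding.items()})
    return records


def to_jsonl(records):
    """Serialize records one JSON object per line, like the old static catalog."""
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)
```

A call such as `bindings_to_records(fetch_bindings(url))` would then yield the records to index, where `url` is the repository's SPARQL endpoint (for GraphDB, typically a path of the form `/repositories/<repository-id>`; the exact URL here is an assumption).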