Merged
194 commits
03100c0
added new repo info tool with test
qchapp Oct 22, 2025
1d8df82
fixed top choices bug
qchapp Oct 22, 2025
2588007
Merge updates and resolve conflicts in feature-ui
qchapp Oct 22, 2025
9be6604
added new repo info tool with test
qchapp Oct 22, 2025
c71fd57
Update tests/test_repo_summary.py
qchapp Oct 22, 2025
cde0d7a
Update src/ai_agent/agent/tools/repo_info_tool.py
qchapp Oct 22, 2025
00bf69e
Update src/ai_agent/agent/tools/repo_info_tool.py
qchapp Oct 22, 2025
c091a75
Update src/ai_agent/agent/tools/repo_info_tool.py
qchapp Oct 22, 2025
5a07ba3
Update src/ai_agent/agent/tools/repo_info_tool.py
qchapp Oct 22, 2025
22000fe
Merge pull request #6 from Imaging-Plaza/pydantic-ai
qchapp Oct 22, 2025
ef53fba
fixed png images preview broken
qchapp Oct 22, 2025
d6adad5
Merge branch 'develop' into feature-ui
qchapp Oct 22, 2025
a206f1f
Update tests/test_repo_summary.py
qchapp Oct 22, 2025
848bd7a
Update tests/test_repo_summary.py
qchapp Oct 22, 2025
4d6feaa
Update src/ai_agent/agent/tools/repo_info_tool.py
qchapp Oct 22, 2025
0000052
Update src/ai_agent/agent/tools/repo_info_tool.py
qchapp Oct 22, 2025
17148e6
Merge pull request #7 from Imaging-Plaza/feature-ui
qchapp Oct 22, 2025
c617ab8
New tool running software on provided image via gradio client
qchapp Oct 22, 2025
9f5763b
Cleaned ui to hide results before query
qchapp Oct 22, 2025
47444d9
integrated graphDB and refactored full catalog builder
qchapp Nov 4, 2025
56fda63
updated env variables for graphdb
qchapp Nov 4, 2025
2bca90a
changes to the catalog
qchapp Nov 5, 2025
e1bf393
Edited small typo in comment
qchapp Nov 10, 2025
e85822f
Removed unused function
qchapp Nov 10, 2025
0b80e08
Unused variable removed
qchapp Nov 10, 2025
c8aa5b8
Unused import
qchapp Nov 10, 2025
c64f157
Unused import
qchapp Nov 10, 2025
bf7b3b5
Indented piece of code wrongly indented
qchapp Nov 10, 2025
3248c73
Merge pull request #8 from Imaging-Plaza/feature/graphdb
qchapp Nov 10, 2025
0f0c61a
Removed unused variable
qchapp Nov 10, 2025
594aa77
Changed indentation
qchapp Nov 10, 2025
7b72e50
Update src/ai_agent/ui/app.py
qchapp Nov 10, 2025
a514262
corrected minor issues for PR
qchapp Nov 10, 2025
40776d3
Merge branch 'feature/runnable-example' of https://github.com/qchapp/…
qchapp Nov 10, 2025
d0d8eb4
fixed minor issues for PR
qchapp Nov 10, 2025
62a9399
Update src/ai_agent/agent/tools/gradio_space_tool.py
qchapp Nov 10, 2025
d816d15
Merge pull request #9 from Imaging-Plaza/feature/runnable-example
qchapp Nov 10, 2025
d8e7f14
fixed bug related to extraction of runnable example link
qchapp Nov 11, 2025
e7b0f95
first implementation of deepwiki mcp server
qchapp Nov 12, 2025
9146a17
fixed some small error during usage of the app
qchapp Nov 12, 2025
35e46ab
simplification of the pipeline and now forcing to use the agent + bet…
qchapp Nov 12, 2025
6b842c0
deletion of the unused functions (ex-vlm used functions)
qchapp Nov 12, 2025
14e8a34
Minor changes to clean the code
qchapp Nov 12, 2025
561eabf
now added repocards fallback
qchapp Nov 19, 2025
f5971b3
Merge pull request #10 from Imaging-Plaza/feature/usage-fix
qchapp Nov 19, 2025
190e534
removed old models code
qchapp Nov 19, 2025
1456bdd
first test using config loaded from yaml file
qchapp Nov 19, 2025
bde7417
made the agent work with epfl rcp model
qchapp Nov 20, 2025
0b6ba1f
first refactor of the UI
qchapp Nov 22, 2025
f120c5c
deleted temperature parameter in config
qchapp Nov 22, 2025
13b54cb
resolved duplicate issue changelog file
qchapp Nov 22, 2025
5e3a3a0
fixed issue in changelog
qchapp Nov 22, 2025
eda7b93
fixed issue in docstring
qchapp Nov 22, 2025
5a79167
add safer yaml loading to prevent for wront yaml syntax
qchapp Nov 22, 2025
6ec6585
Merge pull request #11 from Imaging-Plaza/feature/model-switch
qchapp Nov 22, 2025
97373a2
simplification of the code for deep wiki tool
qchapp Nov 23, 2025
95ee9ae
refactor of ui architecture
qchapp Dec 3, 2025
532af43
ui design change to look more like imaging plaza
qchapp Dec 4, 2025
7b1e282
changed naming from github_api to repocards
qchapp Dec 9, 2025
bb0c727
Merge branch 'develop' into feature/deep-wiki-mcp
qchapp Dec 9, 2025
6a4265d
Update src/ai_agent/agent/tools/deepwiki_tool.py
qchapp Dec 9, 2025
0e9dfde
Update CHANGELOG.md
qchapp Dec 9, 2025
5d51f98
Initial plan
Copilot Dec 9, 2025
14b04c5
Fix _clip return type annotation to Tuple[str, bool]
Copilot Dec 9, 2025
00470b5
Merge pull request #13 from Imaging-Plaza/copilot/sub-pr-12
qchapp Dec 9, 2025
28b3b85
Update src/ai_agent/agent/tools/repo_info_tool.py
qchapp Dec 9, 2025
649942a
Update src/ai_agent/agent/utils.py
qchapp Dec 9, 2025
f9190c1
implemented the changes proposed by copilot
qchapp Dec 9, 2025
f7a5e93
Merge branch 'feature/deep-wiki-mcp' of https://github.com/qchapp/ima…
qchapp Dec 9, 2025
75d8c67
added settings button to change parameters and plot about tool usage
qchapp Dec 9, 2025
22ac4a0
added unit tests with markdown explication
qchapp Jan 22, 2026
dbd3bb7
Merge pull request #12 from Imaging-Plaza/feature/deep-wiki-mcp
qchapp Jan 22, 2026
c8cc648
Added timestamps of each tool call in the visualization
qchapp Jan 22, 2026
1916bd5
Merge develop into feature/interface-refactor
qchapp Jan 22, 2026
6e17783
Update CHANGELOG.md
qchapp Jan 22, 2026
41d8752
Update src/ai_agent/ui/components.py
qchapp Jan 22, 2026
6dd0ac9
Update src/ai_agent/ui/visualizations.py
qchapp Jan 22, 2026
ab63149
implemented copilot proposed changes
qchapp Jan 22, 2026
d3a0973
Initial plan
Copilot Jan 22, 2026
4714e4a
Improve affirmative detection with word boundary matching and context…
Copilot Jan 22, 2026
dc38c43
Refactor affirmative detection: extract constants and pre-compile reg…
Copilot Jan 22, 2026
2cbf548
Improve negation handling with better regex and clearer comments
Copilot Jan 22, 2026
3d4c9f5
Condense comments for better readability
Copilot Jan 22, 2026
54db1ca
Merge pull request #15 from Imaging-Plaza/copilot/sub-pr-14
qchapp Jan 22, 2026
023e432
switch of model during usage now working
qchapp Jan 24, 2026
143852c
added dependencies version in pyproject.toml
qchapp Jan 27, 2026
f95fb36
Merge pull request #14 from Imaging-Plaza/feature/interface-refactor
qchapp Jan 27, 2026
0d4a3cd
query expanded to improve retrieval
qchapp Jan 28, 2026
18f51d5
big update
qchapp Jan 28, 2026
65d1a76
continued rebase
qchapp Jan 28, 2026
74b4e4d
removed useless imports
qchapp Jan 28, 2026
9f66bd1
Initial plan
Copilot Jan 29, 2026
33fc064
Eliminate redundant metadata computation in run_agent
Copilot Jan 29, 2026
2f273d1
Fix: only pass metadata when it matches current files
Copilot Jan 29, 2026
9db4d1e
small comments update
qchapp Jan 29, 2026
e98c671
Merge pull request #17 from Imaging-Plaza/copilot/sub-pr-16
qchapp Jan 29, 2026
a5c8266
deleted useless line
qchapp Jan 29, 2026
ce07acf
Merge branch 'feature/tool-retrieval' of https://github.com/qchapp/im…
qchapp Jan 29, 2026
5b508fb
fixed issues found during test of the interface - now working properly
qchapp Jan 29, 2026
3908ccd
fixed a big issue with the way the image was passed to the model and …
qchapp Jan 29, 2026
fadc984
Update tests/test_epfl_vision.py
qchapp Jan 29, 2026
c953675
Update src/ai_agent/utils/previews.py
qchapp Jan 29, 2026
73dc3f3
Update src/ai_agent/utils/previews.py
qchapp Jan 29, 2026
11a7b44
Update src/ai_agent/utils/previews.py
qchapp Jan 29, 2026
bad59d8
fixed empty image paths error handling
qchapp Jan 29, 2026
1d8adb2
Merge pull request #16 from Imaging-Plaza/feature/tool-retrieval
qchapp Feb 2, 2026
ed72050
hf deployment workflows
qchapp Feb 2, 2026
eb179fe
new deployment
qchapp Feb 3, 2026
c4dcc98
changed endpoint
qchapp Feb 3, 2026
1c39b85
fixed protocol usage for deepwiki mcp
qchapp Feb 4, 2026
83313db
trying external epfl enpoint
qchapp Feb 10, 2026
d0d821b
fixed different issues (see plaza board for more)
qchapp Feb 11, 2026
4f83d28
new readme
qchapp Feb 11, 2026
4db1f38
fixed config usage and logo
qchapp Feb 18, 2026
6a32fb2
reverted commit..
qchapp Feb 18, 2026
7551c64
updated default config and logo display
qchapp Feb 18, 2026
30081b1
Update src/ai_agent/agent/agent.py
qchapp Feb 18, 2026
9f92585
Update src/ai_agent/agent/agent.py
qchapp Feb 18, 2026
829ad0a
implemented copilot propositions
qchapp Feb 18, 2026
36fa242
Merge branch 'feature/minor-fixes' of https://github.com/qchapp/imagi…
qchapp Feb 18, 2026
9488583
Initial plan
Copilot Feb 18, 2026
13b66c6
Initial plan
Copilot Feb 18, 2026
5a3ac9d
Fix quota handling to use valid ConversationStatus.COMPLETE and remov…
Copilot Feb 18, 2026
4b6b130
Optimize catalog lookup to avoid full pipeline initialization
Copilot Feb 18, 2026
be8cb22
Merge pull request #20 from Imaging-Plaza/copilot/sub-pr-19
qchapp Feb 18, 2026
40e9bf3
Merge pull request #21 from Imaging-Plaza/copilot/sub-pr-19-again
qchapp Feb 18, 2026
e7c6056
Merge pull request #19 from Imaging-Plaza/feature/minor-fixes
qchapp Feb 18, 2026
c55c836
getting ready for pr
qchapp Feb 26, 2026
a75b846
autonomous lungs segmentation tool call
qchapp Feb 26, 2026
47efd74
Update src/ai_agent/ui/components.py
qchapp Feb 28, 2026
8ce0c45
Update src/ai_agent/agent/tools/mcp/registry.py
qchapp Feb 28, 2026
ea8ddfe
implemented propositions from copilot
qchapp Feb 28, 2026
63e671f
fixed a limited file size limit
qchapp Feb 28, 2026
5f57285
Update CHANGELOG.md
qchapp Feb 28, 2026
0131d8d
Update CHANGELOG.md
qchapp Feb 28, 2026
030181a
implemented changes proposed by copilot
qchapp Feb 28, 2026
f85cc69
first version of the doc generated by copilot
qchapp Mar 1, 2026
8add652
renamed to imaging-plaza
qchapp Mar 1, 2026
7c5d2bb
trying to deploy again
qchapp Mar 2, 2026
b81ba04
changed documantation slightly
qchapp Mar 5, 2026
74e4f06
improved documentation
qchapp Mar 9, 2026
efebb61
preparing for PR
qchapp Mar 10, 2026
db191e1
Update docs/reference/environment.md
qchapp Mar 10, 2026
a7eac91
Update docs/getting-started/installation.md
qchapp Mar 10, 2026
5a3bad0
Update mkdocs.yml
qchapp Mar 10, 2026
7d008dd
Update docs/architecture/overview.md
qchapp Mar 10, 2026
202b482
Initial plan
Copilot Mar 10, 2026
8fbe307
wrote that the testing area is under development still
qchapp Mar 10, 2026
064323b
Merge branch 'feature/docs' of https://github.com/qchapp/imaging-plaz…
qchapp Mar 10, 2026
6549757
docs: update agent.md to reflect real code paths (fix VLMToolSelector…
Copilot Mar 10, 2026
cdc2ca9
Merge pull request #24 from Imaging-Plaza/copilot/sub-pr-23
qchapp Mar 10, 2026
306758c
changed tests example for agent.md docs
qchapp Mar 10, 2026
03e1428
Update docs/user-guide/chat-interface.md
qchapp Mar 10, 2026
a5e71e1
Update docs/user-guide/file-formats.md
qchapp Mar 10, 2026
fd28970
Update docs/reference/environment.md
qchapp Mar 10, 2026
667f26a
Update docs/architecture/retrieval.md
qchapp Mar 10, 2026
5d26316
Update docs/architecture/overview.md
qchapp Mar 10, 2026
79562d9
Update docs/user-guide/file-formats.md
qchapp Mar 10, 2026
00726b2
fixed some changes proposed by copilot
qchapp Mar 10, 2026
4b5a65b
Merge pull request #23 from Imaging-Plaza/feature/docs
qchapp Mar 10, 2026
0be9738
Merge pull request #22 from Imaging-Plaza/feature/3d-lungs-autocall
qchapp Mar 12, 2026
6dbba79
preparing for PR
qchapp Mar 12, 2026
f0fc57a
preparing for PR
qchapp Mar 12, 2026
aef25a1
small refactor of the tests and files for the PR
qchapp Mar 14, 2026
3094045
Potential fix for pull request finding
qchapp Mar 15, 2026
ccb42c5
Potential fix for pull request finding
qchapp Mar 15, 2026
4768cc1
Potential fix for pull request finding
qchapp Mar 15, 2026
09d0474
Potential fix for pull request finding
qchapp Mar 15, 2026
95b9622
Potential fix for pull request finding
qchapp Mar 15, 2026
c062252
big refactor for the PR: fixed solution proposed by copilot, edited d…
qchapp Mar 16, 2026
05f49f6
Potential fix for pull request finding
qchapp Mar 16, 2026
f68f9e7
Potential fix for pull request finding
qchapp Mar 16, 2026
ef2ff97
Potential fix for pull request finding
qchapp Mar 16, 2026
4f21249
Potential fix for pull request finding
qchapp Mar 16, 2026
5c49f4d
Potential fix for pull request finding
qchapp Mar 16, 2026
8c7bae6
docs(AGENTS.md): Refactor code structure for improved readability and…
caviri Mar 18, 2026
1631455
Merge pull request #26 from Imaging-Plaza/feat/test
caviri Mar 18, 2026
99b18b7
added EPFL reranker/embedder along with some pytest tests
qchapp Mar 21, 2026
bba601f
edited config doc in readme
qchapp Mar 21, 2026
cd32760
parallelized deep wiki and deleted image metadata annotation
qchapp Mar 22, 2026
5967f44
moved function to utils from pipeline
qchapp Mar 24, 2026
b3780d1
speeding a bit more
qchapp Apr 11, 2026
9d69eb3
Potential fix for pull request finding
qchapp Apr 11, 2026
7d81f7f
copilot suggestions
qchapp Apr 11, 2026
34ee2f4
implemented copilot suggestions
qchapp Apr 11, 2026
5a176fc
Potential fix for pull request finding
qchapp Apr 11, 2026
ddc595a
copilot changes again
qchapp Apr 11, 2026
98af8ec
Merge pull request #27 from Imaging-Plaza/feature/speeding
qchapp Apr 15, 2026
c939f84
cleaned versioning for PR
qchapp Apr 15, 2026
c6f730c
removed unused code on workflow
qchapp Apr 15, 2026
979f7d2
new gif in markdown
qchapp Apr 15, 2026
85867be
changed path
qchapp Apr 15, 2026
35d0cdf
changed path
qchapp Apr 15, 2026
169b26b
changed size
qchapp Apr 15, 2026
58 changes: 29 additions & 29 deletions .devcontainer/devcontainer.json
@@ -1,29 +1,29 @@
{
"name": "ai-agent-dev",
"build": {
"dockerfile": "Dockerfile"
},

// This is where your repo will be mounted inside the container
"remoteUser": "vscode",
"workspaceFolder": "/workspaces/${localWorkspaceFolderBasename}",

"customizations": {
"vscode": {
"settings": {
"python.defaultInterpreterPath": "${workspaceFolder}/.venv/bin/python",
"python.envFile": "${workspaceFolder}/.env"
},
"extensions": [
"ms-python.python",
"ms-python.vscode-pylance",
"tamasfe.even-better-toml"
]
}
},

"forwardPorts": [7860],

// Install project in editable mode after the container is built
"postCreateCommand": "rm -rf .venv && uv venv && uv pip install -e . && echo '. $PWD/.venv/bin/activate' >> /home/vscode/.bashrc"
}
{
"name": "ai-agent-dev",
"build": {
"dockerfile": "Dockerfile"
},
// This is where your repo will be mounted inside the container
"remoteUser": "vscode",
"workspaceFolder": "/workspaces/${localWorkspaceFolderBasename}",
"customizations": {
"vscode": {
"settings": {
"python.defaultInterpreterPath": "${workspaceFolder}/.venv/bin/python",
"python.envFile": "${workspaceFolder}/.env"
},
"extensions": [
"ms-python.python",
"ms-python.vscode-pylance",
"tamasfe.even-better-toml"
]
}
},
"forwardPorts": [7860],
// Install project in editable mode after the container is built
"postCreateCommand": "rm -rf .venv && uv venv && uv pip install -e . && echo '. $PWD/.venv/bin/activate' >> /home/vscode/.bashrc"
}
6 changes: 6 additions & 0 deletions .dockerignore
@@ -0,0 +1,6 @@
.git
__pycache__
*.pyc
.env
.env.*
tests
56 changes: 38 additions & 18 deletions .env.dist
@@ -1,18 +1,38 @@
OPENAI_API_KEY=sk-xxxx
# Optional model overrides (defaults work):
OPENAI_MODEL=gpt-4o

# Software catalog
SOFTWARE_CATALOG=path/to/your/catalog.jsonl

# Pipeline configuration
TOP_K=8 # Number of candidates to retrieve
NUM_CHOICES=3 # Number of tools to recommend
USE_AGENT=1 # Use pydantic-ai agent (1) or standard pipeline (0)

# Logging configuration
LOGLEVEL_CONSOLE=WARNING
LOGLEVEL_FILE=INFO
FILE_LOG=1
LOG_DIR=logs
LOG_PROMPTS=0 # write selector prompt snapshots
# API Keys
OPENAI_API_KEY=sk-xxxx
GITHUB_TOKEN=ghp_xxxx

# Additional API keys for alternative models (if using EPFL or other providers)
EPFL_API_KEY=sk-xxxx
EPFL_API_KEY_EMBEDDER=sk-xxxx

# Software catalog
SOFTWARE_CATALOG=path/to/your/catalog.jsonl

# Pipeline configuration
TOP_K=8 # Number of candidates to retrieve
NUM_CHOICES=3 # Number of tools to recommend
USE_AGENT=1 # Use pydantic-ai agent (1) or standard pipeline (0)
AGENT_OUTPUT_RETRIES=3 # Structured output validation retries
EMBED_CATALOG_ON_START=1 # Pre-embed catalog at startup if FAISS is empty

# Logging configuration
LOGLEVEL_CONSOLE=WARNING
LOGLEVEL_FILE=INFO
FILE_LOG=1
LOG_DIR=logs
LOG_PROMPTS=0 # write selector prompt snapshots

# Path to config.yaml
CONFIG_PATH=path/to/custom/config.yaml

# GraphDB
GRAPHDB_GRAPH=
GRAPHDB_URL=
GRAPHDB_USER=
GRAPHDB_PASSWORD=
GRAPHDB_QUERY_FILE=full/path/to/your/query.rq
SYNC_EVERY_HOURS=24 # set 0 (or leave empty) to disable periodic refresh

OUTPUT_JSONLD=path/to/your/catalog.jsonld
OUTPUT_JSONL=path/to/your/catalog.jsonl
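
A minimal sketch of how the pipeline settings above might be read in Python. The helper names `env_int` and `load_pipeline_settings` are hypothetical, and treating the sample values in `.env.dist` (for example `TOP_K=8`) as defaults is an assumption, not something the diff confirms. The sketch also strips a trailing `# comment` defensively, since some loaders pass inline comments through as part of the value.

```python
import os


def env_int(name, default, env=None):
    """Read an integer setting; fall back to `default` when unset, empty, or invalid."""
    env = os.environ if env is None else env
    raw = env.get(name, "")
    # Drop a trailing inline comment such as "8 # Number of candidates".
    value = raw.split("#", 1)[0].strip()
    if not value:
        return default
    try:
        return int(value)
    except ValueError:
        return default


def load_pipeline_settings(env=None):
    """Collect the pipeline settings into a plain dict (hypothetical layout)."""
    return {
        "top_k": env_int("TOP_K", 8, env),
        "num_choices": env_int("NUM_CHOICES", 3, env),
        "use_agent": env_int("USE_AGENT", 1, env) == 1,
        "agent_output_retries": env_int("AGENT_OUTPUT_RETRIES", 3, env),
        # 0 or empty disables the periodic GraphDB refresh.
        "sync_every_hours": env_int("SYNC_EVERY_HOURS", 0, env),
    }
```

Passing a plain dict as `env` keeps the helper testable without touching the real process environment.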
255 changes: 126 additions & 129 deletions .github/copilot-instructions.md
@@ -1,130 +1,127 @@
# AI Agent — Copilot Instructions

This is a **RAG + VLM imaging tool recommender** that helps users find the right imaging software for their images and tasks. Users drop an image, describe their task, and get ranked software recommendations with demo links.

## Architecture Overview

The system follows a two-stage pipeline:

1. **Retrieval Stage** (`retriever/`, `api/pipeline.py`): Fast text search using BGE-M3 embeddings + CrossEncoder reranker. No LLM calls. Returns top-K candidates.

2. **Selection Stage** (`generator/`): Single VLM call (OpenAI GPT-4o/mini) that sees the image + candidates + metadata and returns ranked recommendations with accuracy scores.

### Key Components

- **`api/pipeline.RAGImagingPipeline`**: Main orchestrator. Handles file validation, metadata extraction, retrieval, and VLM selection.
- **`retriever/embedders.py`**: FAISS vector index with BGE-M3 + CrossEncoder reranker for candidate retrieval.
- **`generator/generator.VLMToolSelector`**: Vision-language model that selects best tools from candidates.
- **`utils/image_meta.py`**: Robust metadata extraction for DICOM, NIfTI, TIFF stacks with medical imaging focus.
- **`utils/tags.py`**: Control tags system for query refinement (`[EXCLUDE:tool1|tool2]`, `[NO_RERANK]`, `[REFINE]`).

## Data Flow Patterns

### Input Processing
- Files validated via `utils/file_validator.py` (size limits, format checks)
- Images converted to PNG previews for VLM via `utils/previews.py`
- Metadata extracted preserving original format info (critical for format compatibility matching)
- Format tokens added to retrieval query (e.g. `format:DICOM format:NIfTI`)

### Retrieval Query Construction
```python
# Clean task text + format tokens from uploaded files
query = f"{clean_task} format:{ext_tokens}" # e.g. "segment lungs format:DICOM"
```

### VLM Selection Input
The VLM receives:
- **Text**: User task + candidate table + original file metadata
- **Image**: PNG preview (converted from any format)
- **Metadata**: Original extension, dimensions, file info (crucial for IO compatibility)

## Critical Patterns

### Error Handling
- **Graceful degradation**: If image conversion fails, continue text-only
- **Robust metadata**: All metadata extraction wrapped in try/catch with sensible defaults
- **File validation**: Early validation prevents downstream errors

### Control Tags System
Users can control behavior via tags in their queries:
- `[EXCLUDE:toolname1|toolname2]` - Exclude specific tools from results
- `[NO_RERANK]` - Skip CrossEncoder reranker (faster, less accurate)
- `[REFINE]` - Force clarification turn for alternatives

### Conversation Flow
- **Complete**: Normal success with tool recommendations
- **Needs Clarification**: VLM asks followup questions when task is ambiguous
- **Terminal No-Tool**: No suitable tools found with explanation

## Development Workflows

### Running the App
```bash
# Install with pip using pyproject.toml
pip install -e ".[dev]"

# Configure .env with OPENAI_API_KEY and SOFTWARE_CATALOG path
ai_agent ui # Launches Gradio on port 7860
```

### Testing
- **`tests/full_test.py`**: End-to-end pipeline tests driven by `tests/data/test_data.json`
- Uses test doubles for VLM calls to avoid API costs
- Run with: `pytest tests/`

### Change Documentation
- **`CHANGELOG.md`**: Follow [Keep a Changelog](https://keepachangelog.com/) format
- Use semantic versioning with sections: Added, Changed, Deprecated, Removed, Fixed, Security
- Update CHANGELOG.md for ALL user-facing changes before merging PRs
- Format: `### Added\n- New feature description` under version heading
- Version entries: `## [x.y.z] - YYYY-MM-DD`

### Environment Management
- **uv**: Fast Python package manager used in `tools/image/Dockerfile`
- Creates isolated `.venv` environments for reproducible builds
- Dockerfile uses `uv venv && uv pip install -e .` pattern for container builds

### Logging & Debugging
- Set `LOG_PROMPTS=1` to save VLM prompts + images to `logs/`
- File logs in `logs/app_YYYYMMDD.log` with structured JSON events
- Console/file log levels configurable via `.env`

## Project Conventions

### Schema Patterns
- **Pydantic models** in `generator/schema.py` with robust field validation and aliasing for catalog compatibility
- **Enum-based** conversation states and tool reasons for type safety
- **Field normalization**: Dimensions (2D/3D/4D), modalities (CT/MRI/XR), file formats via validators

### Catalog Integration
- Software catalog in JSONL format following schema.org SoftwareSourceCode structure
- **Runnable examples**: Links to HuggingFace Spaces, notebooks, web demos
- **Supporting data**: Format compatibility info used for matching

### Module Boundaries
- `api/`: Pipeline orchestration, no UI dependencies
- `generator/`: Pure VLM logic, no retrieval dependencies
- `retriever/`: Pure vector search, no generation dependencies
- `utils/`: Shared utilities, no business logic
- `ui/`: Gradio interface only

### Configuration
- Environment-based config via `.env` (API keys, model names, catalog paths)
- Sensible defaults for all settings
- No hardcoded paths or credentials

## Medical Imaging Context

This tool specializes in medical/scientific imaging:
- **Modalities**: CT, MRI, X-ray, Ultrasound, PET, SPECT, Microscopy
- **Formats**: DICOM, NIfTI, TIFF stacks, standard images
- **Dimensions**: 2D images, 3D volumes, 4D timeseries
- **Tasks**: Segmentation, registration, analysis, visualization

The VLM selection considers format compatibility as a primary factor - tools supporting the user's input format are strongly preferred.

## Security Notes
- Only makes external calls to OpenAI VLM API (with user image preview)
- Never uploads user data to third-party tool demos
- Returns links only; user chooses whether to visit demos
# AI Agent — Copilot Instructions

This is a **RAG + VLM imaging tool recommender** that helps users find the right imaging software for their images and tasks. Users drop an image, describe their task, and get ranked software recommendations with demo links.

## Architecture Overview

The system follows a two-stage pipeline:

1. **Retrieval Stage** (`retriever/`, `api/pipeline.py`): Fast text search using BGE-M3 embeddings + CrossEncoder reranker. No LLM calls. Returns top-K candidates.

2. **Selection Stage** (`generator/`): Single VLM call (OpenAI GPT-4o/mini) that sees the image + candidates + metadata and returns ranked recommendations with accuracy scores.

### Key Components

- **`api/pipeline.RAGImagingPipeline`**: Main orchestrator. Handles file validation, metadata extraction, retrieval, and VLM selection.
- **`retriever/text_embedder.py`**, **`retriever/vector_index.py`**, **`retriever/reranker.py`**, **`retriever/software_doc.py`**: Embedding, FAISS indexing, reranking, and catalog schema for retrieval.
- **`agent/agent.py`**: PydanticAI agent that orchestrates tool search, alternatives, and recommendation assembly.
- **`utils/image_meta.py`**: Robust metadata extraction for DICOM, NIfTI, TIFF stacks with medical imaging focus.
- **`utils/tags.py`**: Control tag parsing/stripping utilities (notably `[EXCLUDE:tool1|tool2]`).

## Data Flow Patterns

### Input Processing
- Files validated via `utils/file_validator.py` (size limits, format checks)
- Images converted to PNG previews for VLM via `utils/previews.py`
- Metadata extracted preserving original format info (critical for format compatibility matching)
- Format tokens added to retrieval query (e.g. `format:DICOM format:NIfTI`)

### Retrieval Query Construction
```python
# Clean task text + format tokens from uploaded files
query = f"{clean_task} format:{ext_tokens}" # e.g. "segment lungs format:DICOM"
```
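
As a rough sketch of the query construction above, the following derives one `format:` token per distinct uploaded format and appends them to the cleaned task text. The suffix-to-format mapping and the function names `format_tokens` and `build_query` are assumptions for illustration; the project's actual helper may differ.

```python
from pathlib import Path

# Hypothetical mapping from file suffix to the format label used in the query.
FORMAT_BY_SUFFIX = {
    ".dcm": "DICOM",
    ".nii": "NIfTI",
    ".nii.gz": "NIfTI",
    ".tif": "TIFF",
    ".tiff": "TIFF",
    ".png": "PNG",
}


def format_tokens(paths):
    """Return de-duplicated 'format:X' tokens in upload order."""
    tokens = []
    for p in paths:
        name = Path(p).name.lower()
        # Check longer suffixes first so ".nii.gz" wins over ".gz"-style partial matches.
        for suffix in sorted(FORMAT_BY_SUFFIX, key=len, reverse=True):
            if name.endswith(suffix):
                token = f"format:{FORMAT_BY_SUFFIX[suffix]}"
                if token not in tokens:
                    tokens.append(token)
                break
    return tokens


def build_query(clean_task, paths):
    """Join the cleaned task text with the format tokens of the uploaded files."""
    return " ".join([clean_task.strip(), *format_tokens(paths)]).strip()
```

Emitting a separate token per format keeps the `format:` prefix on every entry when several file types are uploaded together.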

### VLM Selection Input
The VLM receives:
- **Text**: User task + candidate table + original file metadata
- **Image**: PNG preview (converted from any format)
- **Metadata**: Original extension, dimensions, file info (crucial for IO compatibility)

## Critical Patterns

### Error Handling
- **Graceful degradation**: If image conversion fails, continue text-only
- **Robust metadata**: All metadata extraction wrapped in try/except with sensible defaults
- **File validation**: Early validation prevents downstream errors
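
The error-handling patterns above could look roughly like the sketch below. The wrapper names `safe_preview` and `safe_metadata`, the injected `convert` and `extract` callables, and the default metadata keys are all hypothetical; they stand in for whatever `utils/previews.py` and `utils/image_meta.py` actually expose.

```python
import logging

logger = logging.getLogger(__name__)


def safe_preview(path, convert):
    """Return a PNG preview path, or None so the caller can continue text-only."""
    try:
        return convert(path)
    except Exception as exc:  # degrade gracefully instead of aborting the request
        logger.warning("Preview conversion failed for %s: %s", path, exc)
        return None


def safe_metadata(path, extract):
    """Return extracted metadata merged over defaults; never raise."""
    defaults = {"path": str(path), "ext": None, "ndim": None, "shape": None}
    try:
        meta = extract(path) or {}
    except Exception as exc:
        logger.warning("Metadata extraction failed for %s: %s", path, exc)
        meta = {}
    return {**defaults, **meta}
```

Merging over a fixed set of defaults means downstream code can rely on every key being present, even when extraction fails.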

### Control Tags System
Users can control behavior via tags in their queries:
- `[EXCLUDE:toolname1|toolname2]` - Exclude specific tools from results
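
A minimal sketch of parsing and stripping the `[EXCLUDE:...]` tag, assuming case-insensitive matching and lower-cased tool names. The function name `parse_exclusions` and these normalization choices are illustrative assumptions, not the behavior of `utils/tags.py`.

```python
import re

# Matches "[EXCLUDE:name1|name2]" anywhere in the query text.
EXCLUDE_RE = re.compile(r"\[EXCLUDE:([^\]]*)\]", re.IGNORECASE)


def parse_exclusions(query):
    """Return (query without tags, list of excluded tool names)."""
    excluded = []
    for match in EXCLUDE_RE.finditer(query):
        for name in match.group(1).split("|"):
            name = name.strip().lower()
            if name and name not in excluded:
                excluded.append(name)
    # Remove the tags and collapse any leftover whitespace.
    clean = " ".join(EXCLUDE_RE.sub(" ", query).split())
    return clean, excluded
```

For example, `parse_exclusions("segment lungs [EXCLUDE:ToolA|toolB]")` returns `("segment lungs", ["toola", "toolb"])`.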

### Conversation Flow
- **Complete**: Normal success with tool recommendations
- **Needs Clarification**: VLM asks follow-up questions when task is ambiguous
- **Terminal No-Tool**: No suitable tools found with explanation
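
The three outcomes above map naturally onto an enum. The sketch below assumes a `ConversationStatus` enum; only `COMPLETE` appears by name in this pull request's commit messages, so the other member names, the string values, and the `classify` helper are hypothetical.

```python
from enum import Enum


class ConversationStatus(str, Enum):
    COMPLETE = "complete"
    NEEDS_CLARIFICATION = "needs_clarification"  # hypothetical member name
    TERMINAL_NO_TOOL = "terminal_no_tool"        # hypothetical member name


def classify(choices, question=None):
    """Pick a status from the selected tools and an optional follow-up question."""
    if question:
        return ConversationStatus.NEEDS_CLARIFICATION
    if choices:
        return ConversationStatus.COMPLETE
    return ConversationStatus.TERMINAL_NO_TOOL
```

Subclassing `str` keeps the members JSON-serializable by value, which is convenient when the status is returned to a UI layer.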

## Development Workflows

### Running the App
```bash
# Install with pip using pyproject.toml
pip install -e ".[dev]"

# Configure .env with OPENAI_API_KEY and SOFTWARE_CATALOG path
ai_agent chat # Launches Gradio chat UI
```

### Testing
- Run targeted tests in `tests/` (e.g., retrieval, agent tools, repo info)
- Run with: `pytest tests/`

### Change Documentation
- **`CHANGELOG.md`**: Follow [Keep a Changelog](https://keepachangelog.com/) format
- Use semantic versioning with sections: Added, Changed, Deprecated, Removed, Fixed, Security
- Update CHANGELOG.md for ALL user-facing changes before merging PRs
- Format: `### Added\n- New feature description` under version heading
- Version entries: `## [x.y.z] - YYYY-MM-DD`

### Environment Management
- **uv**: Fast Python package manager used in `tools/image/Dockerfile`
- Creates isolated `.venv` environments for reproducible builds
- Dockerfile uses `uv venv && uv pip install -e .` pattern for container builds

### Logging & Debugging
- Set `LOG_PROMPTS=1` to save VLM prompts + images to `logs/`
- File logs in `logs/app_YYYYMMDD.log` with structured JSON events
- Console/file log levels configurable via `.env`

## Project Conventions

### Schema Patterns
- **Pydantic models** in `generator/schema.py` with robust field validation and aliasing for catalog compatibility
- **Enum-based** conversation states and tool reasons for type safety
- **Field normalization**: Dimensions (2D/3D/4D), modalities (CT/MRI/XR), file formats via validators

### Catalog Integration
- Software catalog in JSONL format following schema.org SoftwareSourceCode structure
- **Runnable examples**: Links to HuggingFace Spaces, notebooks, web demos
- **Supporting data**: Format compatibility info used for matching

### Module Boundaries
- `api/`: Pipeline orchestration, no UI dependencies
- `generator/`: Pure VLM logic, no retrieval dependencies
- `retriever/`: Pure vector search, no generation dependencies
- `utils/`: Shared utilities, no business logic
- `ui/`: Gradio interface only

### Configuration
- Environment-based config via `.env` (API keys, model names, catalog paths)
- Sensible defaults for all settings
- No hardcoded paths or credentials

## Medical Imaging Context

This tool specializes in medical/scientific imaging:
- **Modalities**: CT, MRI, X-ray, Ultrasound, PET, SPECT, Microscopy
- **Formats**: DICOM, NIfTI, TIFF stacks, standard images
- **Dimensions**: 2D images, 3D volumes, 4D timeseries
- **Tasks**: Segmentation, registration, analysis, visualization

The VLM selection considers format compatibility as a primary factor - tools supporting the user's input format are strongly preferred.

## Security Notes
- Only makes external calls to OpenAI VLM API (with user image preview)
- Never uploads user data to third-party tool demos
- Returns links only; user chooses whether to visit demos
- Prompt logging is optional and local-only