17 changes: 17 additions & 0 deletions .devcontainer/Dockerfile
@@ -0,0 +1,17 @@
FROM ghcr.io/astral-sh/uv:python3.12-bookworm

# Install just and other system dependencies
RUN apt-get update && apt-get install -y \
sudo \
curl \
&& curl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash -s -- --to /usr/local/bin \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*

# Create a non-root user with the UID/GID that VS Code typically uses (1000:1000)
RUN useradd -ms /bin/bash -u 1000 vscode \
&& echo "vscode ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers

USER vscode
WORKDIR /workspaces
29 changes: 29 additions & 0 deletions .devcontainer/devcontainer.json
@@ -0,0 +1,29 @@
{
"name": "ai-agent-dev",
"build": {
"dockerfile": "Dockerfile"
},

"remoteUser": "vscode",
// This is where your repo will be mounted inside the container
"workspaceFolder": "/workspaces/${localWorkspaceFolderBasename}",

"customizations": {
"vscode": {
"settings": {
"python.defaultInterpreterPath": "${workspaceFolder}/.venv/bin/python",
"python.envFile": "${workspaceFolder}/.env"
},
"extensions": [
"ms-python.python",
"ms-python.vscode-pylance",
"tamasfe.even-better-toml"
]
}
},

"forwardPorts": [7860],

// Install project in editable mode after the container is built
"postCreateCommand": "rm -rf .venv && uv venv && uv pip install -e . && echo '. $PWD/.venv/bin/activate' >> /home/vscode/.bashrc"
}
130 changes: 130 additions & 0 deletions .github/copilot-instructions.md
@@ -0,0 +1,130 @@
# AI Agent — Copilot Instructions

This is a **RAG + VLM imaging tool recommender** that helps users find the right imaging software for their images and tasks. Users drop an image, describe their task, and get ranked software recommendations with demo links.

## Architecture Overview

The system follows a two-stage pipeline:

1. **Retrieval Stage** (`retriever/`, `api/pipeline.py`): Fast text search using BGE-M3 embeddings + CrossEncoder reranker. No LLM calls. Returns top-K candidates.

2. **Selection Stage** (`generator/`): Single VLM call (OpenAI GPT-4o/mini) that sees the image + candidates + metadata and returns ranked recommendations with accuracy scores.
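
The two stages above can be sketched as a toy pipeline. This is only an illustration of the control flow: word overlap stands in for the BGE-M3 + CrossEncoder retrieval, the `select` callable stands in for the VLM call, and the names `retrieve`, `recommend`, and the `description` field are assumptions, not the project's API.

```python
from typing import Callable, Dict, List

Tool = Dict[str, str]


def retrieve(query: str, catalog: List[Tool], top_k: int = 3) -> List[Tool]:
    """Stage 1 stand-in: rank tools by word overlap with the query."""
    query_words = set(query.lower().split())
    scored = [
        (len(query_words & set(tool["description"].lower().split())), tool)
        for tool in catalog
    ]
    # Drop tools with no overlap, then sort by score (stable for ties).
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:top_k]]


def recommend(
    query: str,
    catalog: List[Tool],
    select: Callable[[str, List[Tool]], List[Tool]],
    top_k: int = 3,
) -> List[Tool]:
    """Stage 2: pass the shortlist to one selector call; skip it if empty."""
    candidates = retrieve(query, catalog, top_k)
    return select(query, candidates) if candidates else []
```

Keeping the selector as a plain callable makes it easy to swap in a test double, which matches the testing approach described later in this file.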

### Key Components

- **`api/pipeline.RAGImagingPipeline`**: Main orchestrator. Handles file validation, metadata extraction, retrieval, and VLM selection.
- **`retriever/embedders.py`**: FAISS vector index with BGE-M3 + CrossEncoder reranker for candidate retrieval.
- **`generator/generator.VLMToolSelector`**: Vision-language model that selects best tools from candidates.
- **`utils/image_meta.py`**: Robust metadata extraction for DICOM, NIfTI, TIFF stacks with medical imaging focus.
- **`utils/tags.py`**: Control tags system for query refinement (`[EXCLUDE:tool1|tool2]`, `[NO_RERANK]`, `[REFINE]`).

## Data Flow Patterns

### Input Processing
- Files validated via `utils/file_validator.py` (size limits, format checks)
- Images converted to PNG previews for VLM via `utils/previews.py`
- Metadata extracted preserving original format info (critical for format compatibility matching)
- Format tokens added to retrieval query (e.g. `format:DICOM format:NIfTI`)

### Retrieval Query Construction
```python
# Clean task text + format tokens from uploaded files
query = f"{clean_task} format:{ext_tokens}" # e.g. "segment lungs format:DICOM"
```
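
A minimal sketch of how such a query might be assembled when several files are uploaded. The extension-to-format table and the names `format_token` and `build_retrieval_query` are assumptions made for illustration; the real mapping lives in the project code.

```python
from typing import List, Optional

# Hypothetical mapping from file extensions to format tokens (assumption).
EXT_TO_FORMAT = {
    ".dcm": "DICOM",
    ".nii": "NIfTI",
    ".nii.gz": "NIfTI",
    ".tif": "TIFF",
    ".tiff": "TIFF",
    ".png": "PNG",
}


def format_token(filename: str) -> Optional[str]:
    """Return the format name for a filename, or None if it is unknown."""
    name = filename.lower()
    # Check longer suffixes first so ".nii.gz" is matched as a whole.
    for ext in sorted(EXT_TO_FORMAT, key=len, reverse=True):
        if name.endswith(ext):
            return EXT_TO_FORMAT[ext]
    return None


def build_retrieval_query(clean_task: str, filenames: List[str]) -> str:
    """Append one `format:X` token per distinct format, in first-seen order."""
    formats: List[str] = []
    for filename in filenames:
        fmt = format_token(filename)
        if fmt and fmt not in formats:
            formats.append(fmt)
    tokens = " ".join(f"format:{fmt}" for fmt in formats)
    return f"{clean_task} {tokens}".strip()
```

Deduplicating the tokens keeps the query short when a user uploads many files of the same format.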

### VLM Selection Input
The VLM receives:
- **Text**: User task + candidate table + original file metadata
- **Image**: PNG preview (converted from any format)
- **Metadata**: Original extension, dimensions, file info (crucial for IO compatibility)

## Critical Patterns

### Error Handling
- **Graceful degradation**: If image conversion fails, continue text-only
- **Robust metadata**: All metadata extraction wrapped in try/except with sensible defaults
- **File validation**: Early validation prevents downstream errors
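
The "sensible defaults" pattern could look roughly like the sketch below. The function name `safe_extract`, the default field names, and the `error` key are illustrative assumptions, not the contents of `utils/image_meta.py`.

```python
from typing import Any, Callable, Dict

# Defaults returned when extraction fails (field names are illustrative).
DEFAULT_META: Dict[str, Any] = {"extension": "", "shape": None, "ndim": None}


def safe_extract(
    path: str, extractor: Callable[[str], Dict[str, Any]]
) -> Dict[str, Any]:
    """Run `extractor` and fall back to defaults instead of raising."""
    meta = dict(DEFAULT_META)
    # The extension is recoverable from the name even if the reader fails.
    meta["extension"] = path.rsplit(".", 1)[-1].lower() if "." in path else ""
    try:
        meta.update(extractor(path))
    except Exception as exc:  # deliberately broad: degrade, do not crash
        meta["error"] = str(exc)
    return meta
```

Because the caller always receives a dictionary with the same keys, downstream stages can continue text-only when a file cannot be read.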

### Control Tags System
Users can control behavior via tags in their queries:
- `[EXCLUDE:toolname1|toolname2]` - Exclude specific tools from results
- `[NO_RERANK]` - Skip CrossEncoder reranker (faster, less accurate)
- `[REFINE]` - Force clarification turn for alternatives
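
A hedged sketch of how these tags might be parsed out of a query. The tag names come from the list above; the `ControlTags` dataclass and `parse_control_tags` function are hypothetical and may differ from what `utils/tags.py` does.

```python
import re
from dataclasses import dataclass, field
from typing import List, Tuple

# Matches [EXCLUDE:a|b], [NO_RERANK], and [REFINE].
TAG_RE = re.compile(r"\[(EXCLUDE|NO_RERANK|REFINE)(?::([^\]]*))?\]")


@dataclass
class ControlTags:
    exclude: List[str] = field(default_factory=list)
    no_rerank: bool = False
    refine: bool = False


def parse_control_tags(query: str) -> Tuple[str, ControlTags]:
    """Return the query with tags stripped, plus the parsed tag values."""
    tags = ControlTags()
    for name, arg in TAG_RE.findall(query):
        if name == "EXCLUDE":
            tags.exclude += [t.strip() for t in arg.split("|") if t.strip()]
        elif name == "NO_RERANK":
            tags.no_rerank = True
        elif name == "REFINE":
            tags.refine = True
    # Remove the tags and collapse the whitespace they leave behind.
    clean = " ".join(TAG_RE.sub(" ", query).split())
    return clean, tags
```

Returning the cleaned text separately keeps control tags out of the retrieval query.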

### Conversation Flow
- **Complete**: Normal success with tool recommendations
- **Needs Clarification**: VLM asks follow-up questions when the task is ambiguous
- **Terminal No-Tool**: No suitable tools found with explanation

## Development Workflows

### Running the App
```bash
# Install with pip using pyproject.toml
pip install -e ".[dev]"

# Configure .env with OPENAI_API_KEY and SOFTWARE_CATALOG path
ai_agent ui # Launches Gradio on port 7860
```

### Testing
- **`tests/full_test.py`**: End-to-end pipeline tests driven by `tests/data/test_data.json`
- Uses test doubles for VLM calls to avoid API costs
- Run with: `pytest tests/`

### Change Documentation
- **`CHANGELOG.md`**: Follow [Keep a Changelog](https://keepachangelog.com/) format
- Use semantic versioning with sections: Added, Changed, Deprecated, Removed, Fixed, Security
- Update CHANGELOG.md for ALL user-facing changes before merging PRs
- Format: `### Added\n- New feature description` under version heading
- Version entries: `## [x.y.z] - YYYY-MM-DD`
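
Since the release workflow pulls the notes for a version out of `CHANGELOG.md`, the heading format matters. The sketch below shows one way to extract a `## [x.y.z]` section in Python; the function name is an assumption, and the workflow itself uses a shell script rather than this code.

```python
import re


def extract_changelog_section(changelog: str, version: str) -> str:
    """Return the `## [version]` section, up to the next `## [` heading."""
    # Escape the version so "." is matched literally, not as a wildcard.
    heading = re.compile(r"^## \[" + re.escape(version) + r"\]")
    section = []
    found = False
    for line in changelog.splitlines():
        if line.startswith("## ["):
            if found:
                break  # the next version heading ends the section
            found = bool(heading.match(line))
        if found:
            section.append(line)
    return "\n".join(section).strip()
```

An empty return value signals that no entry exists for the version, which a caller can use to fall back to a default release message.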

### Environment Management
- **uv**: Fast Python package manager used in `tools/image/Dockerfile`
- Creates isolated `.venv` environments for reproducible builds
- Dockerfile uses `uv venv && uv pip install -e .` pattern for container builds

### Logging & Debugging
- Set `LOG_PROMPTS=1` to save VLM prompts + images to `logs/`
- File logs in `logs/app_YYYYMMDD.log` with structured JSON events
- Console/file log levels configurable via `.env`

## Project Conventions

### Schema Patterns
- **Pydantic models** in `generator/schema.py` with robust field validation and aliasing for catalog compatibility
- **Enum-based** conversation states and tool reasons for type safety
- **Field normalization**: Dimensions (2D/3D/4D), modalities (CT/MRI/XR), file formats via validators
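
A rough illustration of the enum and normalization ideas, written with the standard library only rather than Pydantic validators. The enum values, the alias table, and the function names are assumptions for illustration; the actual models are in `generator/schema.py`.

```python
from enum import Enum
from typing import Optional


class ConversationStatus(str, Enum):
    # Names follow the three conversation outcomes; string values are assumed.
    COMPLETE = "complete"
    NEEDS_CLARIFICATION = "needs_clarification"
    TERMINAL_NO_TOOL = "terminal_no_tool"


# Illustrative alias table for modality normalization (assumption).
MODALITY_ALIASES = {
    "ct": "CT",
    "mri": "MRI",
    "mr": "MRI",
    "xr": "XR",
    "x-ray": "XR",
    "xray": "XR",
}


def normalize_dimension(value: object) -> Optional[str]:
    """Map 2, "2d", "3-D", etc. to "2D"/"3D"/"4D"; return None otherwise."""
    digits = "".join(ch for ch in str(value) if ch.isdigit())
    return f"{digits}D" if digits in {"2", "3", "4"} else None


def normalize_modality(value: str) -> Optional[str]:
    """Map free-form modality strings to canonical codes, or None if unknown."""
    return MODALITY_ALIASES.get(value.strip().lower())
```

In a Pydantic model, functions like these would typically be called from field validators so that catalog entries with inconsistent spellings end up with the same canonical values.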

### Catalog Integration
- Software catalog in JSONL format following schema.org SoftwareSourceCode structure
- **Runnable examples**: Links to HuggingFace Spaces, notebooks, web demos
- **Supporting data**: Format compatibility info used for matching

### Module Boundaries
- `api/`: Pipeline orchestration, no UI dependencies
- `generator/`: Pure VLM logic, no retrieval dependencies
- `retriever/`: Pure vector search, no generation dependencies
- `utils/`: Shared utilities, no business logic
- `ui/`: Gradio interface only

### Configuration
- Environment-based config via `.env` (API keys, model names, catalog paths)
- Sensible defaults for all settings
- No hardcoded paths or credentials

## Medical Imaging Context

This tool specializes in medical/scientific imaging:
- **Modalities**: CT, MRI, X-ray, Ultrasound, PET, SPECT, Microscopy
- **Formats**: DICOM, NIfTI, TIFF stacks, standard images
- **Dimensions**: 2D images, 3D volumes, 4D timeseries
- **Tasks**: Segmentation, registration, analysis, visualization

The VLM selection considers format compatibility as a primary factor - tools supporting the user's input format are strongly preferred.

## Security Notes
- Only makes external calls to OpenAI VLM API (with user image preview)
- Never uploads user data to third-party tool demos
- Returns links only; user chooses whether to visit demos
- Prompt logging is optional and local-only
167 changes: 167 additions & 0 deletions .github/workflows/publish_image_in_GHCR.yaml
@@ -0,0 +1,167 @@
name: Build and Publish Docker Images

on:
push:
branches: [ "main", "develop" ]
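pull_request:
branches: [ "main", "develop" ]
# "closed" and "ready_for_review" are not default activity types; they are
# listed so cleanup-pr-image can run and drafts marked ready get built
types: [ opened, synchronize, reopened, ready_for_review, closed ]

jobs:
build-and-publish:
runs-on: ubuntu-latest
permissions:
contents: write # needed to create the release
packages: write # needed to publish the image

# Skip building images for draft PRs and for PR close events
if: |
github.event_name == 'push' ||
(github.event_name == 'pull_request' &&
github.event.action != 'closed' &&
github.event.pull_request.draft == false)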

steps:
- name: Checkout repository
uses: actions/checkout@v4

- name: Extract version from pyproject.toml
id: project_version
run: |
VERSION=$(grep -m1 '^version = ' pyproject.toml | sed -E 's/version = "([^"]+)"/\1/')
echo "version=${VERSION}" >> $GITHUB_OUTPUT

- name: Extract changelog section for version
id: changelog
if: github.ref == 'refs/heads/main' && github.event_name == 'push'
run: |
VERSION="${{ steps.project_version.outputs.version }}"

# Extract the section for this version from CHANGELOG.md
# This awk script finds the section between [VERSION] and the next [VERSION] or end of file
CHANGELOG_SECTION=$(awk -v version="[$VERSION]" '
BEGIN { found=0; content="" }
$0 ~ "^## \\[" {
if (found) exit
if (index($0, version)) { # literal match; "[x.y.z]" used as a regex would be a bracket expression
found=1
content = $0 "\n"
next
}
}
found { content = content $0 "\n" }
END { print content }
' CHANGELOG.md)

# If no section found, use a default message
if [ -z "$CHANGELOG_SECTION" ]; then
CHANGELOG_SECTION=$(printf '## Release v%s\n\nNo changelog entry found for this version.' "$VERSION")
fi

# Save to file and output
echo "$CHANGELOG_SECTION" > release_notes.md
echo "changelog_file=release_notes.md" >> $GITHUB_OUTPUT

# Also output as multiline string for debugging
{
echo 'content<<EOF'
echo "$CHANGELOG_SECTION"
echo 'EOF'
} >> $GITHUB_OUTPUT

- name: Log in to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}

- name: Extract metadata for Docker
id: meta
uses: docker/metadata-action@v5
with:
images: ghcr.io/${{ github.repository }}
tags: |
# For main branch: latest and version tags
type=raw,value=latest,enable={{is_default_branch}}
type=raw,value=${{ steps.project_version.outputs.version }},enable={{is_default_branch}}
# For develop branch: develop tag
type=raw,value=develop,enable=${{ github.ref == 'refs/heads/develop' }}
# For PRs only: pr-{number} tag
type=ref,event=pr,prefix=pr-
labels: |
org.opencontainers.image.title=${{ github.repository }}
org.opencontainers.image.description=${{ github.event.repository.description }}
org.opencontainers.image.url=${{ github.event.repository.html_url }}
org.opencontainers.image.source=${{ github.event.repository.clone_url }}
org.opencontainers.image.revision=${{ github.sha }}
org.opencontainers.image.licenses=${{ github.event.repository.license.spdx_id }}
# Add cleanup hint for PR images
io.github.pr-image=${{ github.event_name == 'pull_request' && 'true' || 'false' }}

- name: Build and push Docker image
uses: docker/build-push-action@v5
with:
context: .
file: tools/image/Dockerfile
platforms: linux/amd64
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}

- name: Create GitHub Release
# Only create releases for main branch pushes
if: github.ref == 'refs/heads/main' && github.event_name == 'push'
uses: softprops/action-gh-release@v2
with:
tag_name: v${{ steps.project_version.outputs.version }}
name: Release v${{ steps.project_version.outputs.version }}
body_path: ${{ steps.changelog.outputs.changelog_file }}
fail_on_unmatched_files: false

# Clean up PR images when PR is closed
cleanup-pr-image:
runs-on: ubuntu-latest
if: github.event_name == 'pull_request' && github.event.action == 'closed'
permissions:
packages: write

steps:
- name: Delete PR image
uses: actions/github-script@v7
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
const owner = context.repo.owner;
const repo = context.repo.repo;
const packageName = repo; // GHCR package names do not include the owner prefix
const prNumber = context.payload.pull_request.number;
const prTag = `pr-${prNumber}`;

try {
// Get all package versions
const { data: versions } = await github.rest.packages.getAllPackageVersionsForPackageOwnedByOrg({
package_type: 'container',
package_name: packageName,
org: owner,
per_page: 100
});

// Find the PR image version
const prVersion = versions.find(version =>
version.metadata.container.tags.includes(prTag)
);

if (prVersion) {
console.log(`Deleting PR image: ${prTag} (version ID: ${prVersion.id})`);
await github.rest.packages.deletePackageVersionForOrg({
package_type: 'container',
package_name: packageName,
org: owner,
package_version_id: prVersion.id
});
console.log(`Successfully deleted PR image: ${prTag}`);
} else {
console.log(`No image found for PR: ${prTag}`);
}
} catch (error) {
console.log(`Error cleaning up PR image (this is normal if no image was built): ${error.message}`);
}
8 changes: 8 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
# Changelog

All notable changes to this project will be documented in this file.

## [0.1.0] - 2025-09-30

### Added
- Chat functionality