A Clinical Decision Support System and Conversational Agent powered by Google Vertex AI (Gemini 2.0 Flash) and RAG over the NICE NG12 guidelines.
- Risk Assessment: Evaluates patient symptoms against NG12 guidelines to determine referral urgency.
- Evidence-Based: Uses a RAG pipeline to retrieve and cite specific sections of the NG12 PDF.
- Conversational Interface: Chat with the guidelines to ask follow-up questions.
- Modular Architecture: FastAPI backend, ChromaDB vector store, and a clean static HTML/JS frontend.
- Python 3.11+
- Google Cloud Project with Vertex AI enabled.
- Valid `GOOGLE_APPLICATION_CREDENTIALS` (or `gcloud auth application-default login`).
- Environment Setup (Windows): A virtual environment is included with the project. Activate it, or run commands via its path:

  ```
  .\venv\Scripts\activate
  ```
- Google Cloud Auth: Authenticate with your specific Google Cloud project:

  ```
  gcloud auth application-default login --project <YOUR_PROJECT_ID>
  ```
- Data Ingestion: Run the ingestion script using the virtual environment's Python:

  ```
  .\venv\Scripts\python -m app.services.ingestion_service
  ```

  Note: Ensure your project has the Vertex AI API enabled.
- Run the Application:

  ```
  .\venv\Scripts\uvicorn app.main:app --reload
  ```

- Access the UI: Open http://localhost:8000 in your browser.
```
docker build -t ng12-assessor .
docker run -p 8080:8080 -e GOOGLE_APPLICATION_CREDENTIALS=/path/to/creds.json ng12-assessor
```

- `app/api`: FastAPI routes.
- `app/services`: Business logic (Agent, RAG, Patient data).
- `app/data`: Local storage for PDF and Vector DB.
- `app/static`: Frontend HTML/JS.
- Choice: Switched from Gemini 1.5 Pro to Gemini 2.0 Flash.
- Reason: 2.0 Flash offers extremely low latency and a massive context window (1M+ tokens), making it ideal for interactive chat and processing large guidelines.
- Tradeoff: Slightly less "deep reasoning" capability than the Ultra/Opus-class models, but for guideline retrieval, low latency and a large context window matter more than raw reasoning depth.
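For illustration, a minimal sketch of initialising the model through the Vertex AI SDK; the project ID, region, and prompt below are placeholders, not values from this repo:

```python
# Sketch: initialise Gemini 2.0 Flash via the Vertex AI SDK.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-gcp-project", location="us-central1")  # placeholders
model = GenerativeModel("gemini-2.0-flash")

response = model.generate_content("Summarise the NG12 referral criteria for lung cancer.")
print(response.text)
```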
- Choice: Built on FastAPI with `uvicorn`.
- Reason: LLM and RAG operations are I/O bound. FastAPI's native `async`/`await` support allows handling multiple concurrent chat requests without blocking, unlike Flask.
- Tradeoff: Slightly more boilerplate than Flask, but essential for scalable AI apps.
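As a sketch of the pattern (the route path, request model, and `answer_with_rag` helper are illustrative, not the repo's actual code):

```python
# Sketch: an async chat endpoint that awaits I/O-bound RAG/LLM work.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    message: str

async def answer_with_rag(message: str) -> str:
    # Placeholder for the real retrieval + generation pipeline.
    return f"(stubbed answer for: {message})"

@app.post("/chat")
async def chat(req: ChatRequest):
    # Awaiting here frees the event loop for other requests while
    # Vertex AI / ChromaDB calls are in flight.
    answer = await answer_with_rag(req.message)
    return {"answer": answer}
```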
- Choice: Used ChromaDB with local file persistence.
- Reason: "Batteries-included" solution that requires no external infrastructure or API keys (unlike Pinecone), making the project easy to clone and run.
- Tradeoff: Not suitable for production scaling to millions of documents. For production, we would migrate to Vertex AI Vector Search.
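A minimal sketch of the local persistence setup (the storage path and collection name are assumptions):

```python
# Sketch: file-persisted vector store with ChromaDB, no external services.
import chromadb

client = chromadb.PersistentClient(path="app/data/chroma")  # assumed path
collection = client.get_or_create_collection(name="ng12_guidelines")

# Store guideline chunks with pre-computed embeddings.
collection.add(
    ids=["chunk-0"],
    documents=["Refer adults using a suspected cancer pathway referral if ..."],
    embeddings=[[0.01] * 768],  # 768-dim vector from text-embedding-004
)

# Retrieve the most similar chunks for a query embedding.
results = collection.query(query_embeddings=[[0.01] * 768], n_results=3)
```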
- Choice: Implemented a "Condense Question" step where the LLM rewrites user queries based on history (e.g., "And for lung?" -> "What are the referral criteria for lung cancer?").
- Reason: Essential for multi-turn chat. Without it, RAG fails on follow-up questions that lack explicit keywords.
- Tradeoff: Adds a small latency overhead (one extra LLM call per turn), but drastically improves answer quality.
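A condense-question step could look like the following sketch (the prompt wording and function are assumptions):

```python
# Sketch: rewrite a follow-up question into a standalone query before retrieval.
from vertexai.generative_models import GenerativeModel

CONDENSE_PROMPT = """Given the chat history and a follow-up question, rewrite the
follow-up as a standalone question that contains all the context it needs.

Chat history:
{history}

Follow-up question: {question}

Standalone question:"""

def condense_question(model: GenerativeModel, history: str, question: str) -> str:
    # One extra, cheap LLM call per turn; the rewritten query feeds the vector search.
    prompt = CONDENSE_PROMPT.format(history=history, question=question)
    return model.generate_content(prompt).text.strip()

# e.g. history about colorectal criteria + "And for lung?"
# -> "What are the NG12 referral criteria for lung cancer?"
```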
- Choice: Implemented manual batching (100 items/batch).
- Reason: Vertex AI Embedding API has a hard limit of 250 instances per request.
- Tradeoff: Slightly more code complexity in exchange for requests that reliably stay within the API limit.
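The batching itself is straightforward; a generic sketch (the helper name and `embed_fn` callback are illustrative):

```python
# Sketch: embed texts in batches of 100 to stay under the 250-instance request cap.
from typing import Callable

BATCH_SIZE = 100  # comfortable margin below the API limit

def embed_in_batches(
    texts: list[str],
    embed_fn: Callable[[list[str]], list[list[float]]],  # e.g. a Vertex AI embedding call
) -> list[list[float]]:
    vectors: list[list[float]] = []
    for start in range(0, len(texts), BATCH_SIZE):
        vectors.extend(embed_fn(texts[start:start + BATCH_SIZE]))
    return vectors
```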
- Choice: Used `text-embedding-004`.
- Reason: Latest stable embedding model, offering improved semantic representations compared to the older `gecko` models.
- Tradeoff: Specific regional availability (`us-central1`), requiring explicit location configuration.
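For example, loading the embedding model with an explicit region (project and region values are placeholders):

```python
# Sketch: text-embedding-004 initialised with an explicit Vertex AI location.
import vertexai
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="your-gcp-project", location="us-central1")  # placeholders
embedding_model = TextEmbeddingModel.from_pretrained("text-embedding-004")

[embedding] = embedding_model.get_embeddings(["unexplained haemoptysis in adults aged 40 and over"])
print(len(embedding.values))  # 768-dimensional vector
```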
- Choice: Split PDF into 500 character chunks (with 200 overlap).
- Reason: Smaller chunks provide more precise context retrieval for specific medical criteria, reducing noise in the LLM prompt.
- Tradeoff: Risk of splitting a long sentence or list across chunks, handled partially by the 200-character overlap.
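A minimal character-based chunker matching these parameters (a sketch; the project may use a library text splitter instead):

```python
# Sketch: split text into 500-character chunks with 200 characters of overlap.
CHUNK_SIZE = 500
CHUNK_OVERLAP = 200

def chunk_text(text: str, size: int = CHUNK_SIZE, overlap: int = CHUNK_OVERLAP) -> list[str]:
    step = size - overlap  # advance 300 characters per chunk
    return [text[i:i + size] for i in range(0, len(text), step)]

# Each chunk shares its last 200 characters with the start of the next one,
# so criteria split across a boundary still appear intact in at least one chunk.
```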