OmarJabri7/LexMed
🩺 LexMed – AI-Powered Clinical Assistant (PoC)

LexMed is a proof-of-concept clinical AI assistant that enables doctors to query patient imaging data using natural language.

The system ingests DICOM studies (from TCIA or local files), summarizes them into structured text, embeds them into a Chroma vector database, and uses a Large Language Model (LLM) to provide grounded answers.
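The summarization step can be sketched as follows. This is a minimal illustration, not the project's actual code: the `summarize_study` helper and the metadata field names are assumptions modelled on standard DICOM tags.

```python
def summarize_study(meta: dict) -> str:
    """Turn extracted DICOM metadata (illustrative field names) into one
    structured text line suitable for embedding and retrieval."""
    return (
        f"Patient {meta.get('PatientID', 'unknown')}: "
        f"{meta.get('Modality', '?')} study "
        f"'{meta.get('StudyDescription', 'no description')}' "
        f"on {meta.get('StudyDate', 'unknown date')}"
    )

summary = summarize_study({
    "PatientID": "TCGA-05-4245",
    "Modality": "MR",
    "StudyDescription": "Breast MRI with contrast",
    "StudyDate": "2021-03-10",
})
print(summary)
```

One line per study keeps each vector-store document small and self-describing, so retrieved chunks carry their own patient context into the LLM prompt.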

⚠️ Disclaimer: This project is for research and demonstration purposes only. It is not a medical device and must not be used for clinical decision-making.


✨ Key Features

  • 📥 DICOM ingestion from TCIA via tcia_utils.
  • 🔎 Metadata extraction with pydicom.
  • 🧠 Embeddings using OpenAI’s text-embedding-3-large.
  • 📚 Vector search with ChromaDB (local or cloud-hosted).
  • 💬 Question answering with ChatOpenAI (RAG pipeline).
  • ⚡ FastAPI backend with clean REST endpoints.
  • ☁️ Cloud-ready – deploy with Docker and persistent volumes.

🗂️ Project Layout

lex_med/
├── utils.py          # DICOM download, summarization, and metadata extraction
├── orchestrator.py   # Orchestrates embeddings, vector DB, and LLM
├── model_utils.py    # LLM builder and schema models
├── main.py           # FastAPI app entrypoint with endpoints

🚀 Getting Started

1. Clone the repo

git clone https://github.com/your-username/lexmed.git
cd lexmed

2. Install dependencies

Using Poetry:

poetry install

Or with pip:

pip install -r requirements.txt

Minimal dependencies:

chromadb
langchain-openai
langchain-core
pydantic
pydicom
tcia-utils
fastapi
starlette
uvicorn
tqdm
python-dotenv

3. Environment variables

Create a .env file in the project root:

OPENAI_API_KEY=sk-xxxx
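A defensive way to read the key at startup is sketched below; the `get_openai_key` helper is illustrative (not part of the repo), and `python-dotenv` is treated as optional via a fallback to the plain process environment.

```python
import os


def get_openai_key() -> str:
    """Load OPENAI_API_KEY from a .env file (when python-dotenv is
    installed) or from the process environment, failing fast if missing."""
    try:
        from dotenv import load_dotenv  # optional dependency
        load_dotenv()  # does not override variables already set
    except ImportError:
        pass  # fall back to the plain environment
    key = os.getenv("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; create a .env file first")
    return key
```

Failing fast here gives a clear startup error instead of an opaque authentication failure on the first embedding call.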

4. Run the API

uvicorn main:app --reload --host 0.0.0.0 --port 8000

🔌 API Endpoints

| Method | Endpoint       | Description                                     |
|--------|----------------|-------------------------------------------------|
| GET    | `/collections` | List available TCIA collections                 |
| POST   | `/patients`    | Download & embed patient data from a collection |
| POST   | `/ask`         | Ask a question, get an LLM-based answer         |
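The two POST bodies map naturally onto Pydantic models of the kind `model_utils.py` would define; the names `CollectionRequest` and `AskRequest` here are illustrative, not necessarily the project's actual schema classes.

```python
from pydantic import BaseModel


class CollectionRequest(BaseModel):
    """Body for POST /patients (illustrative name)."""
    name: str


class AskRequest(BaseModel):
    """Body for POST /ask (illustrative name)."""
    question: str


req = AskRequest(question="What MRI study was performed for patient TCGA-05-4245?")
print(req.question)
```

FastAPI validates incoming JSON against these models automatically, returning a 422 with field-level errors for malformed requests.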

🧪 Example Usage

1. Add patients from TCIA

curl -X POST http://localhost:8000/patients \
  -H "Content-Type: application/json" \
  -d '{"name": "TCGA-BRCA"}'

2. Ask a question

curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "What MRI study was performed for patient TCGA-05-4245?"}'

Response:

{
  "answer": "The MRI study for patient TCGA-05-4245 was described as 'Breast MRI with contrast' performed on 2021-03-10."
}
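The same call can be made from Python with only the standard library. The `build_ask_request` helper is illustrative, and actually sending the request assumes the server from step 4 is running locally.

```python
import json
import urllib.request


def build_ask_request(question: str,
                      base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST /ask request; send it with urllib.request.urlopen."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To send (requires a running server):
# with urllib.request.urlopen(build_ask_request("What MRI study ...?")) as resp:
#     print(json.load(resp)["answer"])
```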

🛠️ Tech Stack

  • Python 3.11+
  • FastAPI + Uvicorn
  • LangChain (langchain-openai, langchain-core)
  • ChromaDB
  • pydicom, tcia_utils
  • tqdm, python-dotenv

📌 Next Steps

  • Add authentication for patient data security (HIPAA/GDPR compliance).
  • Improve retrieval with patient-level metadata filters.
  • Add evaluation metrics and tests.
  • Containerize with Docker.
  • Optional: voice interface (ElevenLabs) or MCP compatibility.

👤 Author: Your Name
📧 Contact: your.email@example.com
🔗 GitHub: yourusername
