LexMed is a proof-of-concept clinical AI assistant that enables doctors to query patient imaging data using natural language.
The system ingests DICOM studies (from TCIA or local files), summarizes them into structured text, embeds them into a Chroma vector database, and uses a Large Language Model (LLM) to provide grounded answers.
- 📥 DICOM ingestion from TCIA via `tcia_utils`.
- 🔎 Metadata extraction with `pydicom`.
- 🧠 Embeddings using OpenAI’s `text-embedding-3-large`.
- 📚 Vector search with ChromaDB (local or cloud-hosted).
- 💬 Question answering with `ChatOpenAI` (RAG pipeline).
- ⚡ FastAPI backend with clean REST endpoints.
- ☁️ Cloud-ready – deploy with Docker and persistent volumes.
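
The metadata extraction and summarization step can be sketched roughly as follows. This is an illustration only, not the project's actual code: the `SUMMARY_FIELDS` selection and both function names are hypothetical, and the real summary format may differ. `pydicom` is imported lazily so the text formatting can be tried without it.

```python
from typing import Mapping

# Hypothetical choice of standard DICOM keywords to include in the summary.
SUMMARY_FIELDS = (
    "PatientID",
    "StudyDate",
    "Modality",
    "BodyPartExamined",
    "StudyDescription",
)


def summarize_metadata(meta: Mapping[str, object]) -> str:
    """Render selected DICOM attributes as one 'Keyword: value' line each."""
    lines = []
    for field in SUMMARY_FIELDS:
        value = meta.get(field)
        if value in (None, ""):
            continue  # skip attributes the study does not carry
        lines.append(f"{field}: {value}")
    return "\n".join(lines)


def read_metadata(path: str) -> dict:
    """Read header attributes from a DICOM file without loading pixel data."""
    import pydicom  # lazy import: only needed when reading real files

    ds = pydicom.dcmread(path, stop_before_pixels=True)
    return {field: ds.get(field) for field in SUMMARY_FIELDS}


example = {"PatientID": "TCGA-05-4245", "Modality": "MR", "StudyDescription": ""}
print(summarize_metadata(example))
```

The resulting text is what would be embedded and stored, so keeping it short and consistently keyed should make retrieval more predictable.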
lex_med/
├── utils.py # DICOM download, summarization, and metadata extraction
├── orchestrator.py # Orchestrates embeddings, vector DB, and LLM
├── model_utils.py # LLM builder and schema models
├── main.py # FastAPI app entrypoint with endpoints
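
To make the retrieve-then-answer flow that `orchestrator.py` coordinates more concrete, here is a library-free sketch. It is an assumption-laden stand-in: in the project, ChromaDB performs the similarity search and `ChatOpenAI` produces the answer, whereas this example ranks hand-written vectors with cosine similarity and only builds the prompt text. The names `cosine`, `top_k`, and `build_prompt` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    docs: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[str]:
    """Return the texts of the k documents most similar to the query vector."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, contexts: Sequence[str]) -> str:
    """Assemble a grounded prompt: retrieved summaries first, question last."""
    joined = "\n---\n".join(contexts)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{joined}\n\nQuestion: {question}"
    )


# Toy 2-dimensional vectors stand in for real embeddings.
docs = [
    ("MR study summary", [1.0, 0.0]),
    ("CT study summary", [0.0, 1.0]),
    ("Mixed summary", [1.0, 1.0]),
]
contexts = top_k([1.0, 0.0], docs, k=2)
print(build_prompt("Which MRI study was performed?", contexts))
```

Instructing the model to rely only on the retrieved context is what keeps answers grounded in the ingested studies rather than in the model's general knowledge.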
git clone https://github.com/your-username/lexmed.git
cd lexmed

Using Poetry:

poetry install

Or with pip:

pip install -r requirements.txt

Minimal dependencies:
chromadb
langchain-openai
langchain-core
pydantic
pydicom
tcia-utils
fastapi
starlette
uvicorn
tqdm
python-dotenv
Create a .env file in the project root:
OPENAI_API_KEY=sk-xxxx

uvicorn main:app --reload --host 0.0.0.0 --port 8000

| Method | Endpoint | Description |
|---|---|---|
| GET | /collections | List available TCIA collections |
| POST | /patients | Download & embed patient data from a collection |
| POST | /ask | Ask a question, get an LLM-based answer |
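
The two POST endpoints in the table can also be called from Python. The sketch below uses only the standard library; `BASE_URL` assumes the local `uvicorn` command from this README, and the helper names `build_request` and `post_json` are hypothetical. The payload keys (`name`, `question`) follow the curl examples in this README.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: server started locally on port 8000


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for one of the API endpoints."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_request(path, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `post_json("/patients", {"name": "TCGA-BRCA"})` would trigger ingestion for a collection, and `post_json("/ask", {"question": "..."})` would return the answer payload. Ingestion may take a while for large collections, so a longer `timeout` may be needed there.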
curl -X POST http://localhost:8000/patients -H "Content-Type: application/json" -d '{"name": "TCGA-BRCA"}'

curl -X POST http://localhost:8000/ask -H "Content-Type: application/json" -d '{"question": "What MRI study was performed for patient TCGA-05-4245?"}'

Response:

{
  "answer": "The MRI study for patient TCGA-05-4245 was described as 'Breast MRI with contrast' performed on 2021-03-10."
}

- Python 3.11+
- FastAPI + Uvicorn
- LangChain (`langchain-openai`, `langchain-core`)
- ChromaDB
- pydicom, tcia_utils
- tqdm, python-dotenv
- Add authentication for patient data security (HIPAA/GDPR compliance).
- Improve retrieval with patient-level metadata filters.
- Add evaluation metrics and tests.
- Containerize with Docker.
- Optional: voice interface (ElevenLabs) or MCP compatibility.
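
As one possible direction for the patient-level metadata filters mentioned above, ChromaDB queries accept a `where` argument that restricts results by stored metadata. The sketch below is a suggestion, not existing project code: the metadata keys `patient_id` and `modality` are hypothetical and would have to be written at ingestion time, and `collection` is assumed to be a ChromaDB collection whose embedding function matches the one used for ingestion (otherwise `query_embeddings` would be passed instead of `query_texts`).

```python
from typing import Optional


def patient_filter(patient_id: str, modality: Optional[str] = None) -> dict:
    """Build a Chroma-style `where` filter (metadata keys are hypothetical)."""
    clauses = [{"patient_id": {"$eq": patient_id}}]
    if modality:
        clauses.append({"modality": {"$eq": modality}})
    # Chroma expects a single top-level operator, so several clauses go under $and.
    return clauses[0] if len(clauses) == 1 else {"$and": clauses}


def query_for_patient(collection, question: str, patient_id: str, k: int = 4):
    """Run a similarity search restricted to one patient's documents."""
    return collection.query(
        query_texts=[question],
        n_results=k,
        where=patient_filter(patient_id),
    )


print(patient_filter("TCGA-05-4245", modality="MR"))
```

Filtering before similarity ranking should prevent summaries from other patients from being retrieved for a patient-specific question.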
👤 Author: Your Name
📧 Contact: your.email@example.com
🔗 GitHub: yourusername