hutaobo/pathology-evidence-ai

Pathology Evidence AI

Pathology Evidence AI is a local-first pathology evidence workspace. It helps you turn your own licensed pathology books or notes into AI-assisted answers with citation-backed evidence, a browser UI, a local API, and a CLI.

This repository is public, but it is not open source in the OSI sense. It is released under the PolyForm Noncommercial 1.0.0 license, so commercial use is not allowed.

Start Here

If you only read one section, read this one.

Clone the repo and run:

git clone https://github.com/hutaobo/pathology-evidence-ai.git
cd pathology-evidence-ai
bash scripts/deploy_portable.sh --allow-no-openai

Then open:

http://127.0.0.1:8000/ui

What you should have after that:

  • a working browser UI
  • a local FastAPI server
  • PostgreSQL with pgvector
  • a safe empty-library state if you have not added PDFs yet
  • local CLI commands such as pathology-client health

This path is designed to work even when:

  • you have no pathology PDFs yet
  • you do not have an OpenAI API key yet
  • you are cloning the repo onto a new machine for the first time

Generate a Private Workspace

If you do not want to work directly in this public repository, you can use it as a generator for a new private local project.

Example:

bash scripts/init_private_workspace.sh ~/Projects/my-private-pathology --name "My Private Pathology"

That command creates a new workspace folder with:

  • its own .env
  • its own config/library.local.yaml
  • a namespaced Docker Compose project name
  • isolated API and PostgreSQL host ports
  • namespaced CLI wrapper names such as pathology-client-my-private-pathology
  • an optional fresh git repository

Then you switch into the generated folder and deploy it there:

cd ~/Projects/my-private-pathology
bash scripts/deploy_portable.sh --allow-no-openai

This is the closest way to get a new local project that behaves like the original private pathology-ai setup without carrying over the public git history or shared runtime settings.
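
The namespaced wrapper name in the list above appears to be the base command plus a slug of the workspace name. The following is a minimal sketch of that mapping, not the script's actual logic: the function name wrapper_name and the slug rule (lowercase, non-alphanumeric runs collapsed to one hyphen) are assumptions chosen to reproduce the example.

```python
import re


def wrapper_name(workspace_name: str, base: str = "pathology-client") -> str:
    """Build a namespaced CLI wrapper name from a human-readable workspace name.

    Hypothetical helper: the real init script may derive the name differently.
    """
    # Lowercase, then collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", workspace_name.lower()).strip("-")
    if not slug:
        raise ValueError("workspace name must contain at least one letter or digit")
    return f"{base}-{slug}"


print(wrapper_name("My Private Pathology"))  # pathology-client-my-private-pathology
```

Under this assumed rule, two workspaces whose names differ only in punctuation or case would collide, so distinct names are the safer choice.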

Five-Minute Onboarding

1. Prerequisites

You need:

  • Python 3.11+
  • Docker
  • Docker Compose

Quick check:

python3 --version
docker --version
docker compose version

On macOS, if Docker is installed but the daemon is not running, the deploy script will try to use Colima when available.
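
If you prefer to script the quick check, here is a small Python sketch of the same idea. It is not part of this repository; the helper names python_ok and missing_tools are illustrative. It only checks that the docker executable is on PATH, because Docker Compose is invoked as a docker subcommand and cannot be located that way.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # minimum version stated in the prerequisites


def python_ok(version=None) -> bool:
    """Return True when the given (or running) Python meets the minimum version."""
    current = tuple(sys.version_info[:2]) if version is None else tuple(version[:2])
    return current >= MIN_PYTHON


def missing_tools(tools=("docker",), which=shutil.which) -> list:
    """Return the executables from `tools` that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def report() -> str:
    """Summarize the local state in one line (call this manually)."""
    missing = missing_tools()
    python_state = "ok" if python_ok() else "too old"
    tools_state = "none missing" if not missing else "missing: " + ", ".join(missing)
    return f"python {python_state}; tools {tools_state}"
```

Calling print(report()) gives a one-line summary. Running docker compose version is still needed to confirm that the Compose plugin itself is installed.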

2. Run the one-command setup

bash scripts/deploy_portable.sh --allow-no-openai

What this script does:

  • creates .env from .env.example if needed
  • bootstraps .venv
  • starts PostgreSQL and the API container
  • builds a local library index, which stays empty until you add PDFs
  • installs the local pathology-client command
  • installs the optional OpenClaw integration
  • opens the browser UI unless you pass --skip-browser

3. Verify that it worked

Browser:

http://127.0.0.1:8000/ui

CLI:

pathology-client health

Expected result:

{
  "ok": true,
  "status": "live"
}
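
To verify the health output from a script instead of by eye, a check along these lines may help. It is a sketch that assumes pathology-client health prints a JSON object with the two fields shown above; the function name is_healthy is illustrative. Because the exact serialization of ok is an assumption, the check accepts both a JSON boolean and the string "true".

```python
import json


def is_healthy(raw: str) -> bool:
    """Return True when `raw` is a JSON object reporting ok and a live status."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return False  # not JSON at all, e.g. an error message
    if not isinstance(payload, dict):
        return False
    ok = payload.get("ok")
    # Accept a JSON boolean or the string "true"; the exact form is an assumption.
    ok_flag = ok is True or ok == "true"
    return ok_flag and payload.get("status") == "live"
```

One possible use is to capture the CLI output with subprocess.run(["pathology-client", "health"], capture_output=True, text=True) and pass its stdout to is_healthy.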

What You Can Do Immediately

Even without an OpenAI key, you can:

  • start the full local UI
  • upload PDFs
  • sync the local library
  • inspect library state
  • run retrieval previews instead of model-synthesized answers
  • create a synthetic demo library with fake educational PDFs

Useful commands:

pathology-client health
pathology-client ui
pathology-ingest create-demo-library
pathology-client library
pathology-client search --query "ductal carcinoma in situ"
pathology-client sync-library

Try a Demo Library

If you do not have pathology PDFs ready yet, generate a tiny synthetic demo corpus:

pathology-ingest create-demo-library
pathology-client search --query "mucinous carcinoma of the prostate"

The demo PDFs are fake educational notes created by this project. They are only meant to prove that ingestion, search, citations, and page previews are working.

Add Your Own PDFs

No corpus is bundled with this repository.

You must add your own lawfully obtained PDFs by one of these methods:

Option A: Copy PDFs into pathologybook/

cp /path/to/your/*.pdf pathologybook/
pathology-client sync-library
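
The shell glob above matches only lowercase .pdf names. If your source folder mixes .pdf and .PDF, a short Python sketch like this one copies both without overwriting files already in the library folder. The function name copy_pdfs is illustrative and not part of this repository; run pathology-client sync-library afterwards as before.

```python
import shutil
from pathlib import Path


def copy_pdfs(source_dir: Path, library_dir: Path) -> list:
    """Copy PDF files from source_dir into library_dir and return the new paths."""
    library_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for candidate in sorted(source_dir.iterdir()):
        # Match the extension case-insensitively and skip directories.
        if not candidate.is_file() or candidate.suffix.lower() != ".pdf":
            continue
        target = library_dir / candidate.name
        if target.exists():
            continue  # never overwrite a file that is already in the library
        shutil.copy2(candidate, target)
        copied.append(target)
    return copied
```

For example, copy_pdfs(Path("/path/to/your"), Path("pathologybook")) returns the list of files that were actually added.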

Option B: Upload from the browser UI

Open:

http://127.0.0.1:8000/ui

Then use the library panel to upload PDFs and trigger a sync.

Option C: Use a local config file

cp config/library.example.yaml config/library.local.yaml

Example:

library_name: my-pathology-library
pdf_root: pathologybook
documents:
  - specialty: breast
    path: Breast_Atlas.pdf
  - specialty: genitourinary
    path: GU_Review.pdf
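
Before syncing, it can be useful to confirm that every file listed in the config exists. The sketch below works on the config as an already-parsed dictionary with the same keys as the example. It assumes each document path is relative to pdf_root, which the example suggests but this README does not state. The helper names are illustrative.

```python
from pathlib import Path


def resolve_documents(config: dict, base_dir: Path = Path(".")) -> list:
    """Return (specialty, path) pairs, with each path joined onto pdf_root."""
    root = base_dir / config["pdf_root"]
    return [
        (entry["specialty"], root / entry["path"])
        for entry in config.get("documents", [])
    ]


def missing_documents(config: dict, base_dir: Path = Path(".")) -> list:
    """Return the configured paths that do not exist as files on disk."""
    return [path for _, path in resolve_documents(config, base_dir) if not path.is_file()]


# Same content as the YAML example, written as a Python dictionary.
example = {
    "library_name": "my-pathology-library",
    "pdf_root": "pathologybook",
    "documents": [
        {"specialty": "breast", "path": "Breast_Atlas.pdf"},
        {"specialty": "genitourinary", "path": "GU_Review.pdf"},
    ],
}
```

Calling missing_documents(example) from the repository root lists any configured PDF that has not been copied into pathologybook/ yet.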

Enable Model-Backed Answers

OpenAI-backed embeddings and /ask are optional.

If you do nothing, the project still works in retrieval-preview mode.

If you want model-backed answers:

  1. set OPENAI_API_KEY
  2. restart the deployment or rerun the sync

Simplest path:

export OPENAI_API_KEY=your_key_here
bash scripts/deploy_portable.sh

If you already deployed without a key:

export OPENAI_API_KEY=your_key_here
bash scripts/deploy_portable.sh --skip-browser

After that you can ask:

pathology-client ask --question "What are the key features of mucinous carcinoma of the prostate?"
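
The choice between the two modes comes down to whether OPENAI_API_KEY is set. The sketch below illustrates that decision only; it is not the project's code, and the function name answer_mode and the two mode labels are assumptions made for this example.

```python
import os


def answer_mode(env=None) -> str:
    """Return which mode a deployment would be expected to run in.

    Hypothetical labels: "model-backed" when a key is present,
    "retrieval-preview" otherwise.
    """
    env = os.environ if env is None else env
    # Treat a blank or whitespace-only value the same as an unset key.
    key = env.get("OPENAI_API_KEY", "").strip()
    return "model-backed" if key else "retrieval-preview"
```

Calling answer_mode() in the shell where you start the deployment shows whether the key is visible to that process, which helps when an export was done in a different terminal.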

Common Commands

make doctor
make test
make ui
make portable-up
make portable-bundle

Direct script entrypoints:

bash scripts/doctor.sh
bash scripts/bootstrap.sh
bash scripts/init_private_workspace.sh ~/Projects/my-private-pathology --name "My Private Pathology"
bash scripts/deploy_portable.sh --allow-no-openai
bash scripts/package_portable.sh

Troubleshooting

Check your environment

bash scripts/doctor.sh

The browser UI does not open

Open it manually:

http://127.0.0.1:8000/ui

pathology-client is not found

Reinstall the local wrappers:

bash scripts/install_openclaw_integration.sh

No answers are generated

That is expected when any of these is true:

  • your library is empty
  • your retrieval hits are too weak
  • OPENAI_API_KEY is not configured

In that case:

  • upload or copy more PDFs
  • run pathology-client sync-library
  • verify OPENAI_API_KEY if you want synthesized answers

What This Repository Contains

  • local PDF ingestion and chunking
  • pgvector-backed retrieval
  • a browser UI
  • a local CLI client
  • upload and sync flows for pathology PDFs
  • evidence guardrails against ungrounded medical answers
  • optional OpenClaw integration

It does not contain:

  • pathology textbooks
  • patient data
  • PHI
  • a clinical-grade decision engine
  • commercial-use rights

Project Layout

apps/
  pathology_api/      FastAPI service and browser UI
  pathology_client/   Local CLI client
  ingest_worker/      PDF analysis and chunk building CLI
  train_worker/       Training placeholders
  eval_worker/        Evaluation placeholders
src/pathology_ai/     Shared library code
config/               Local library config templates
data/                 Generated manifests, chunks, evals, and training outputs
pathologybook/        User-supplied PDFs only
openclaw/             OpenClaw-facing agent and workspace docs
scripts/              Bootstrap, deployment, and helper scripts
tests/                Unit and API tests

Safety and Intended Use

  • Review all outputs before using them in education, reporting, or research.
  • Treat citations as aids, not as final authority.
  • Do not use this as a substitute for expert pathology review.
  • Do not commit copyrighted textbooks, extracted page text, page previews, embeddings, PHI, or secrets.

Contributing

See CONTRIBUTING.md.

Security

See SECURITY.md.
