This document adds a parallel local deployment path for
pathology_review_backend = "pathology_ai_api" while preserving the existing
openai workflow option.
- pathology_review_backend = "openai" remains valid and unchanged.
- pathology_review_backend = "pathology_ai_api" still points to an HTTP service at pathology_ai_api_base_url.
- Cluster cell-type annotation can optionally use the same local service with cluster_annotation_backend = "pathology_ai_api" and cluster_annotation_llm_base_url.
- The public spatho workflow JSON schema does not need PDC-specific fields; the local annotation knobs are regular portable workflow fields.
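As a rough illustration of the fields named above, the following sketch assembles the workflow settings that select the local service. Only the four field names and the "pathology_ai_api" value come from this document; the helper name, the example URL, and the choice to point both base URLs at the same address are assumptions.

```python
import json


def local_backend_fields(base_url: str) -> dict:
    """Return workflow fields that select the local pathology-ai service.

    Hypothetical helper: the field names come from this document, but the
    idea of grouping them in one function is only for illustration.
    """
    return {
        "pathology_review_backend": "pathology_ai_api",
        "pathology_ai_api_base_url": base_url,
        # Optional: reuse the same service for cluster cell-type annotation.
        # Assumption: both base URLs point at the same pathology-ai address.
        "cluster_annotation_backend": "pathology_ai_api",
        "cluster_annotation_llm_base_url": base_url,
    }


def render_fields(base_url: str) -> str:
    """Serialize the fields as JSON for merging into a workflow file."""
    return json.dumps(local_backend_fields(base_url), indent=2)
```

Calling render_fields("http://localhost:8000") yields a JSON fragment that could be merged into an existing workflow file; switching back to the openai option only requires changing the backend values.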
The PDC-oriented stack consists of:
- pathology-ai: the lightweight HTTP orchestration layer in this repo
- vllm: an OpenAI-compatible local LLM endpoint
- embedder: a TEI-compatible Python service for BAAI/bge-m3
- reranker: a TEI-compatible Python service for BAAI/bge-reranker-v2-m3
- qdrant: local vector storage for chunk retrieval
Default values:
- LLM_MODEL=openai/gpt-oss-120b
- EMBED_MODEL=BAAI/bge-m3
- RERANK_MODEL=BAAI/bge-reranker-v2-m3
- VECTOR_DB=qdrant
- DEFAULT_TOP_K=6
- STRICT_JSON=true
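A minimal sketch of how a service could read these settings, falling back to the defaults listed above. The StackSettings class and load_settings function are hypothetical names; how pathology-ai actually parses its environment is not shown in this document.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class StackSettings:
    llm_model: str
    embed_model: str
    rerank_model: str
    vector_db: str
    default_top_k: int
    strict_json: bool


def load_settings(env=None) -> StackSettings:
    """Read settings from an environment mapping, using the documented defaults."""
    env = os.environ if env is None else env
    return StackSettings(
        llm_model=env.get("LLM_MODEL", "openai/gpt-oss-120b"),
        embed_model=env.get("EMBED_MODEL", "BAAI/bge-m3"),
        rerank_model=env.get("RERANK_MODEL", "BAAI/bge-reranker-v2-m3"),
        vector_db=env.get("VECTOR_DB", "qdrant"),
        default_top_k=int(env.get("DEFAULT_TOP_K", "6")),
        # Assumption: only the literal string "true" (any case) enables strict JSON.
        strict_json=env.get("STRICT_JSON", "true").strip().lower() == "true",
    )
```

Passing an explicit mapping, for example load_settings({"LLM_MODEL": "some/other-model"}), makes the override behaviour easy to check without touching the real environment.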
Use this path on Dardel GPU nodes. PDC login nodes do not provide Docker
Compose, and the Hugging Face TEI cpu-1.9 image is amd64-only. The PDC path
therefore uses Slurm plus Apptainer sandboxes and replaces TEI with small Python
HTTP services that expose the same /embed, /rerank, and /health endpoints
used by pathology-ai.
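To make the idea of a small TEI-compatible replacement concrete, here is a standard-library sketch that exposes /embed, /rerank, and /health. It is not the service shipped in this repo. The request shapes follow the TEI convention as an assumption ({"inputs": ...} for /embed, {"query": ..., "texts": [...]} for /rerank), the model calls are replaced by placeholder functions, and the /health body and the port are arbitrary choices.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def embed(texts):
    """Placeholder vectors; a real service would run BAAI/bge-m3 here."""
    return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


def rerank(query, texts):
    """Placeholder word-overlap score; a real service would run the reranker model."""
    q = set(query.lower().split())
    scored = [
        {"index": i, "score": float(len(q & set(t.lower().split())))}
        for i, t in enumerate(texts)
    ]
    return sorted(scored, key=lambda r: r["score"], reverse=True)


def route(method, path, payload):
    """Map a request to (status, body) without touching the network."""
    if method == "GET" and path == "/health":
        return 200, {"ok": True}  # assumed body; callers may only check the status
    if method == "POST" and path == "/embed":
        inputs = payload["inputs"]
        texts = [inputs] if isinstance(inputs, str) else list(inputs)
        return 200, embed(texts)
    if method == "POST" and path == "/rerank":
        return 200, rerank(payload["query"], payload["texts"])
    return 404, {"error": "not found"}


class Handler(BaseHTTPRequestHandler):
    def _reply(self, status, body):
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        self._reply(*route("GET", self.path, None))

    def do_POST(self):
        length = int(self.headers.get("Content-Length", "0"))
        payload = json.loads(self.rfile.read(length) or b"{}")
        self._reply(*route("POST", self.path, payload))


def serve(port=8080):
    """Start the sketch server; the port is an arbitrary choice."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Keeping the routing in a plain function separates the HTTP plumbing from the model calls, so the placeholders can be swapped for real inference code without changing the endpoint contract.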
Defaults:
- Current Dardel Slurm account: naiss2026-4-680-gpu
- Current Dardel Slurm partition: gpu
- Runtime root: /cfs/klemming/projects/supr/naiss2023-23-563/pathology-ai
- vLLM GPUs: CUDA_VISIBLE_DEVICES=0,1
- embedder GPU: CUDA_VISIBLE_DEVICES=2
- reranker GPU: CUDA_VISIBLE_DEVICES=3
The prepare script auto-detects the runtime image family:
- x86_64 Dardel gpu nodes: ROCm, vllm/vllm-openai-rocm:latest, Apptainer --rocm
- aarch64 Grace Hopper nodes: CUDA, vllm/vllm-openai:latest, Apptainer --nv
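The prepare script itself is a shell script and its detection logic is not reproduced here. The mapping it applies can be sketched in Python as follows; the image names and Apptainer flags are taken from the list above, while the function and dictionary names are hypothetical.

```python
import platform

# Values come from the list above; the mapping structure is illustrative.
RUNTIME_FAMILIES = {
    "x86_64": {
        "stack": "ROCm",
        "image": "vllm/vllm-openai-rocm:latest",
        "apptainer_flag": "--rocm",
    },
    "aarch64": {
        "stack": "CUDA",
        "image": "vllm/vllm-openai:latest",
        "apptainer_flag": "--nv",
    },
}


def detect_runtime_family(machine=None) -> dict:
    """Pick the runtime family for a machine architecture string.

    With no argument, the architecture of the current host is used.
    """
    machine = platform.machine() if machine is None else machine
    try:
        return RUNTIME_FAMILIES[machine]
    except KeyError:
        raise ValueError(f"unsupported architecture: {machine!r}") from None
```

Because the choice depends on the host architecture, running the preparation step on the wrong login node would select the wrong image family, which is why the build instructions name a specific login node for each path.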
Prepare the environment file from the repo root:
cp deploy/pathology_ai/pathology-ai.gpugh.env.example deploy/pathology_ai/pathology-ai.gpugh.env
If a Hugging Face token is needed for model downloads, add it outside git, for example in your shell before submitting:
export HF_TOKEN=...
Build the Apptainer sandboxes. On current Dardel gpu, run this on the normal
login node so it builds x86_64 ROCm sandboxes:
ssh dardel.pdc.kth.se
cd /cfs/klemming/home/h/hutaobo/Agentic-Spatial-Pathologist
bash deploy/pathology_ai/pdc_prepare_gh200.sh
If a gpugh partition is available and you want the GH200/CUDA path instead,
run the same command from ssh logingh.
The prepare script creates:
/cfs/klemming/projects/supr/naiss2023-23-563/pathology-ai/images/vllm-openai-rocm-latest
/cfs/klemming/projects/supr/naiss2023-23-563/pathology-ai/images/qdrant-latest
/cfs/klemming/projects/supr/naiss2023-23-563/pathology-ai/runtime.env
Submit the service job:
sbatch deploy/pathology_ai/pathology-ai.gpugh.sbatch
Check the allocated node and logs:
squeue -u "$USER" -n pathology-ai
tail -f /cfs/klemming/projects/supr/naiss2023-23-563/pathology-ai/logs/<job-id>/pathology-ai.log
Verify health from PDC:
curl http://<allocated-node>:8000/health
Successful readiness means the response has:
{
  "service": "pathology-ai",
  "ready": true
}
and all four components under components have "ok": true.
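The readiness rule above can be checked programmatically. The sketch below assumes that components is a JSON object mapping each component name to an object with an "ok" field; the exact layout of the health response beyond the fields shown above is not documented here, and the function names are hypothetical.

```python
import json
import urllib.request


def failing_components(health: dict) -> list:
    """Return the sorted names of components whose "ok" flag is not true."""
    components = health.get("components", {})
    return sorted(
        name for name, status in components.items() if not status.get("ok", False)
    )


def is_ready(health: dict) -> bool:
    """True when the service reports ready and no component is failing."""
    return health.get("ready") is True and not failing_components(health)


def fetch_health(base_url: str, timeout: float = 10.0) -> dict:
    """GET <base_url>/health and decode the JSON body."""
    url = f"{base_url.rstrip('/')}/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

A caller could poll fetch_health with the allocated node's address until is_ready returns true, and print failing_components to decide which log file to inspect.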
Use this path only on machines that support Docker Compose and GPU containers. It is kept for non-PDC local hosts and does not replace the PDC GH200 path.
From the repo root:
cp deploy/pathology_ai/pathology-ai.env.example deploy/pathology_ai/pathology-ai.env
docker compose -f deploy/pathology_ai/docker-compose.pdc.yml up --build
The pathology-ai service will be available at:
http://localhost:8000
The service intentionally keeps the contract simple:
- GET /health
- POST /documents/upsert
- POST /review
- POST /reviews/structure
- POST /reviews/case
Compatibility aliases are also available under /v1/....
Single-document form for POST /documents/upsert:
{
  "document_id": "who-lung-2021",
  "title": "WHO Thoracic Tumours",
  "text": "Long reference text...",
  "source": "who",
  "metadata": {
    "edition": "2021"
  }
}
Batch form:
{
  "documents": [
    {
      "document_id": "who-lung-2021",
      "title": "WHO Thoracic Tumours",
      "text": "Long reference text..."
    }
  ]
}
Example request body for a structure review (POST /reviews/structure):
{
  "question": "What pathology interpretation best matches this structure?",
  "document_ids": ["who-lung-2021"],
  "answer_language": "en",
  "top_k": 6,
  "entity_name": "Tumor-rich structure 4",
  "evidence": {
    "markers": ["EPCAM", "KRT19", "MUC1"],
    "notes": "Polygon-linked H&E region shows gland-forming epithelium."
  }
}
For a case review (POST /reviews/case), the request body has the same shape as the structure request, but the question and evidence represent whole-case context.
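The request bodies above can be sent with any HTTP client. The following standard-library sketch builds and posts them; the helper names are hypothetical, the timeout is an arbitrary choice, and the response format is not documented in this section, so the result is returned as decoded JSON without further interpretation.

```python
import json
import urllib.request


def build_request(base_url: str, path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(base_url: str, path: str, payload: dict, timeout: float = 120.0):
    """Send the request and decode the JSON response body."""
    request = build_request(base_url, path, payload)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


def review_payload(question, document_ids, entity_name, evidence,
                   top_k=6, answer_language="en") -> dict:
    """Assemble a body with the same fields as the structure example above."""
    return {
        "question": question,
        "document_ids": list(document_ids),
        "answer_language": answer_language,
        "top_k": top_k,
        "entity_name": entity_name,
        "evidence": evidence,
    }
```

A typical sequence would first call post_json(base_url, "/documents/upsert", document) to index the reference text, then post a review_payload to "/reviews/structure" or "/reviews/case" with the matching document_ids.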
- If docker or docker compose is missing on PDC, use the GH200 Slurm path.
- If sbatch --test-only fails with an invalid partition, inspect sinfo -s and override with sbatch -A <account> -p <partition> ....
- If curl /health returns ready=false, inspect the component errors and the matching log file in $PDC_PATHOLOGY_AI_ROOT/logs/<job-id>/.
- If model downloads fail with an authorization error, set HF_TOKEN before running the prepare or Slurm job.
- If the runtime storage fills up, set PDC_PATHOLOGY_AI_ROOT to another project path before running both prepare and sbatch.
If you want to keep the same architecture but stop using gpt-oss, change
LLM_MODEL and the vLLM model argument in the environment file or Slurm job.
The pathology-ai interface and spatho workflow contract stay unchanged.