This section is for: Bioinformaticians, computational biologists, translational researchers, and cancer researchers who want to use the precision-medicine-mcp platform for multi-modal data analysis.
This platform doesn't just do existing analyses faster — it enables multi-modal synthesis that manual workflows struggle to achieve:
| Previously Difficult | Now Possible |
|---|---|
| Spatial transcriptomics analyzed separately from proteomics | Correlate spatial gene expression with protein phosphorylation in a single query |
| Weeks to cross-reference genomic variants with pathway enrichment across modalities | Integrate VCF variants, RNA, protein, and spatial data into concordant pathway activation maps |
| Cell deconvolution disconnected from patient clinical history | Link spatial cell-type composition to treatment response using FHIR clinical data |
| Manual literature review for treatment options | Automated cross-referencing with PubMed, bioRxiv, ClinicalTrials.gov, and TCGA cohorts |
| Complex bioinformatics tools require coding expertise | Access all tools via natural language — ask questions, get results |
All analyses use peer-reviewed statistical methods (Mann-Whitney U, Fisher's exact, Moran's I, Stouffer's Z, ComBat). All results require clinician APPROVE/REVISE/REJECT before clinical use.
graph LR
A["📁 Load Data<br/>5 modalities"] --> B["🔬 Analyze<br/>6 workflows"] --> C["🔗 Integrate<br/>cross-modal"] --> D["📊 Publish<br/>reproducible results"]
style A fill:#e1f5ff,stroke:#0066cc,stroke-width:2px
style B fill:#fff3cd,stroke:#ffc107,stroke-width:2px
style C fill:#e8f5e9,stroke:#4caf50,stroke-width:2px
style D fill:#f3e5f5,stroke:#9c27b0,stroke-width:2px
- Load: Clinical, Genomic, Multi-omics, Spatial, Imaging
- Analyze: Differential expression, pathway enrichment, cell deconvolution, batch correction, spatial autocorrelation, multi-omics integration
- Integrate: RNA + Protein + Spatial + Clinical in one analysis
- Publish: Visualizations, statistical tests, treatment targets, reproducible methods
graph LR
A[📁 Multi-Modal Data] --> B[🔧 Analysis Workflows]
B --> C[📊 Insights]
A1[Clinical<br/>FHIR] --> A
A2[Genomics<br/>VCF] --> A
A3[RNA/Protein<br/>Phospho] --> A
A4[Spatial<br/>Visium] --> A
A5[Imaging<br/>H&E/MxIF] --> A
B --> B1[Diff Expression]
B --> B2[Pathway Enrichment]
B --> B3[Cell Deconvolution]
B --> B4[Batch Correction]
B --> B5[Multi-omics Integration]
C --> C1[Treatment Targets]
C --> C2[Resistance Mechanisms]
C --> C3[Biomarkers]
style A fill:#e1f5ff,stroke:#0066cc,stroke-width:2px
style B fill:#fff3cd,stroke:#ffc107,stroke-width:2px
style C fill:#d4edda,stroke:#28a745,stroke-width:2px
Three synthetic patient datasets are available:
| Patient | Condition | Key Data | Use Case |
|---|---|---|---|
| PAT001-OVC-2025 | Ovarian (HGSOC) Stage IV | BRCA1, TP53, PIK3CA; 900 Visium spots, 7 tissue regions | Advanced refractory cancer |
| PAT002-BC-2026 | Breast (IDC) Stage IIA | BRCA2, PIK3CA, ER+/PR+; ESR1/PGR spatially clustered (Moran's I = 0.42–0.45) | Adjuvant therapy surveillance |
| PAT003-CVD-2026 | Preventive cardiovascular | 67F post-menopausal; cardiometabolic biomarker panel; Helix Tier 1 (negative); Reynolds 14.3%, Framingham 12.0%, ASCVD 10.3% | Risk reclassification, gap analysis |
PatientOne (PAT001) Example Datasets:
| Modality | Demonstration Mode | Production Mode |
|---|---|---|
| Clinical | FHIR resources (demographics, conditions, medications, CA-125) | Real Epic FHIR (HIPAA-compliant) |
| Genomics | VCF: TP53, PIK3CA, PTEN, BRCA1 variants | Whole exome sequencing (WES) |
| Multi-omics | 15 samples, 38 KB matrices | 15 samples, 2.7 GB raw (15-20 MB processed) |
| Spatial | 900 spots × 31 genes (315 KB) | 3,000-5,000 spots × 18,000-30,000 genes (100-500 MB) |
| Imaging | Synthetic H&E, MxIF demo data (4.1 MB) | Full resolution slides (500 MB - 2 GB) |
PatientTwo (PAT002) adds pre/post treatment comparison, BRCA2 germline testing, and ER/PR/HER2 receptor workflows. See PAT002 README.
Synthetic Data: 100% synthetic, no patient privacy concerns, safe for publication
v17 update (April 2026): PAT003 (preventive CVD) added as a third synthetic dataset. New server: mcp-cardiometabolic (Reynolds/Framingham/ASCVD risk scoring, biomarker panels, Lp(a) assessment, ACC/AHA statin decision logic). Three open research questions added (OPEN_QUESTIONS.md #11–13).
Goal: Comprehensive multi-modal analysis of Stage IV ovarian cancer case
- Set up environment → Installation Guide (10 min)
- Run PatientOne workflow → PatientOne Guide (25-35 min)
- Review results → outputs vary by server (visualizations, statistical summaries, PDF reports)
What you'll analyze:
- Clinical: Stage IV HGSOC, platinum-resistant
- Genomic: TP53 mutation, BRCA1 variant
- Multi-omics: RNA + Protein + Phospho (Stouffer meta-analysis)
- Spatial: Visium spatial transcriptomics with pathway enrichment
- Imaging: H&E slides, cell segmentation
Cost: ~$87 (compute + API tokens)
Goal: Deep dive into spatial, multi-omics, or genomic analysis
Spatial Transcriptomics:
- Server: mcp-spatialtools (95% real, 14 tools)
- Quick start: Spatial Quick Start (15 min)
- Capabilities: STAR alignment, ComBat batch correction, pathway enrichment, Moran's I
- Example: "Perform spatial pathway enrichment on PatientOne tumor regions"
Multi-Omics Integration:
- Server: mcp-multiomics (95% real, 10 tools)
- Examples: Multi-omics README (10 min)
- Capabilities: HAllA integration, Stouffer meta-analysis, upstream regulators
- Example: "Integrate RNA, protein, and phospho data using Stouffer's method"
Genomic Variants:
- Server: mcp-fgbio (95% real, 4 tools)
- Examples: fgbio README (10 min)
- Capabilities: VCF validation, variant annotation, reference genome management
- Example: "Identify pathogenic variants in PatientOne VCF file"
Goal: Build reproducible analysis pipelines for your research
- Understand architecture → System Overview (30 min)
- Study server capabilities → Server Status Matrix (15 min)
- Design workflow → Chain tools via natural language prompts
- Test with synthetic data → Use DRY_RUN mode ($0.32/analysis)
- Scale to real data → Switch to production mode
Example workflows:
- Tumor microenvironment characterization (Spatial + Imaging + Deconvolution)
- Drug resistance mechanisms (Multi-omics + Pathway enrichment + Variant analysis)
- Biomarker discovery (Cohort analysis + Differential expression + Validation)
Method: Mann-Whitney U test + Benjamini-Hochberg FDR correction
Prompt:
Identify differentially expressed genes between PatientOne tumor and normal samples
using mcp-spatialtools, with FDR < 0.05 threshold.
Output:
- List of significant genes (q-value < 0.05)
- Log2 fold changes
- Visualization (volcano plot, heatmap)
Statistical rigor:
- Non-parametric test (no normality assumption)
- Multiple testing correction (FDR control)
- Effect size (log2 FC) reported
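The statistics behind this workflow can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the mcp-spatialtools implementation: `mann_whitney_p` and `benjamini_hochberg` are hypothetical helper names, the gene values are made up, and the Mann-Whitney p-value uses the normal approximation without tie or continuity correction, which is only reasonable for moderately large groups.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U p-value (normal approximation, illustrative only)."""
    n, m = len(x), len(y)
    # U counts pairs where the x value exceeds the y value; ties count one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n * m / 2
    sd_u = sqrt(n * m * (n + m + 1) / 12)
    z = (u - mean_u) / sd_u
    return 2 * (1 - NormalDist().cdf(abs(z)))


def benjamini_hochberg(pvals):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values never decrease with rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


# Hypothetical expression values for two genes (tumor vs. normal spots).
tumor = {"GENE_A": [5.1, 6.2, 7.3, 8.4, 9.0], "GENE_B": [2.0, 2.1, 1.9, 2.2, 2.0]}
normal = {"GENE_A": [1.0, 2.1, 3.2, 4.3, 4.9], "GENE_B": [2.1, 2.0, 2.2, 1.9, 2.0]}

genes = sorted(tumor)
pvals = [mann_whitney_p(tumor[g], normal[g]) for g in genes]
for gene, q in zip(genes, benjamini_hochberg(pvals)):
    print(f"{gene}: q = {q:.3f}")
```

A gene would be called significant when its q-value falls below the 0.05 threshold used in the prompt above.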
Method: Fisher's exact test on 44 curated pathways (KEGG, Hallmark, GO_BP, Drug_Resistance)
Prompt:
Perform pathway enrichment analysis on PatientOne spatial transcriptomics data
focusing on cancer-related pathways.
Output:
- Enriched pathways (p < 0.05)
- Gene lists per pathway
- Overlap statistics
Pathways included:
- KEGG: PI3K/AKT, MAPK, p53, Cell cycle
- Hallmark: EMT, Hypoxia, Angiogenesis
- GO_BP: DNA repair, Apoptosis
- Drug_Resistance: Platinum, PARP inhibitor
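For readers who want to see what the enrichment test computes, here is a minimal sketch of a one-sided Fisher's exact test written as a hypergeometric tail. The function name `fisher_enrichment_p` and all counts are hypothetical; how the platform defines the gene background is not stated here and is treated as an input.

```python
from math import comb


def fisher_enrichment_p(hits, query_size, pathway_size, background_size):
    """One-sided Fisher's exact p-value: P(overlap >= hits) under the hypergeometric null."""
    max_hits = min(query_size, pathway_size)
    total = comb(background_size, query_size)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, query_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: 8 of 20 query genes fall in a 40-gene pathway,
# drawn from a background of 1,000 measured genes.
p = fisher_enrichment_p(hits=8, query_size=20, pathway_size=40, background_size=1000)
print(f"enrichment p-value = {p:.2e}")
```

With 44 pathways tested, the resulting p-values would normally be adjusted for multiple testing before interpretation.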
Method: Moran's I for spatially variable genes
Prompt:
Identify spatially variable genes in PatientOne tumor regions using Moran's I
spatial autocorrelation test.
Output:
- Genes with significant spatial patterns (p < 0.05)
- Moran's I statistic per gene
- Spatial expression maps
Use cases:
- Tumor-normal boundary identification
- Microenvironment heterogeneity mapping
- Immune infiltration patterns
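The Moran's I statistic itself is short enough to sketch directly. The example below assumes binary neighbor weights and uses a hypothetical five-spot strip as the spatial graph; a real Visium analysis would build the graph from spot coordinates (for example with k nearest neighbors) and would add a permutation or analytical test to obtain p-values, which this sketch omits.

```python
def morans_i(values, neighbors):
    """Moran's I with binary weights; neighbors[i] lists the indices adjacent to spot i."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of cross-products over every directed neighbor pair (w_ij = 1).
    cross = sum(dev[i] * dev[j] for i, adj in enumerate(neighbors) for j in adj)
    total_weight = sum(len(adj) for adj in neighbors)
    return (n / total_weight) * cross / sum(d * d for d in dev)


# Hypothetical 1-D strip of five spots, each adjacent to its immediate neighbors.
chain = [[1], [0, 2], [1, 3], [2, 4], [3]]
gradient = [1.0, 2.0, 3.0, 4.0, 5.0]  # smooth gradient across the strip
print(f"Moran's I = {morans_i(gradient, chain):.2f}")
```

Values near +1 indicate that similar expression levels cluster together, values near 0 indicate no spatial structure, and negative values indicate a checkerboard-like pattern.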
Method: Signature-based scoring (tumor, fibroblasts, immune cells, hypoxic regions)
Prompt:
Perform cell type deconvolution on PatientOne spatial data to quantify
tumor cell fraction and immune infiltration.
Output:
- Cell type proportions per spot
- Spatial distribution maps
- Cell type co-localization analysis
Signatures:
- Tumor: Epithelial markers (EPCAM, KRT7)
- Fibroblasts: Stromal markers (FAP, COL1A1)
- Immune: T-cells (CD3D, CD8A), Macrophages (CD68)
- Hypoxic: HIF1A, CA9, VEGFA
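Signature-based scoring can be illustrated with the marker genes listed above. This is a deliberately simple sketch, assuming per-spot expression is already normalized: it averages each signature and rescales the scores to sum to 1. The function name `score_spot` and the spot values are hypothetical, and the rescaled scores are relative signals rather than calibrated cell fractions.

```python
SIGNATURES = {
    "tumor": ["EPCAM", "KRT7"],
    "fibroblasts": ["FAP", "COL1A1"],
    "immune": ["CD3D", "CD8A", "CD68"],
    "hypoxic": ["HIF1A", "CA9", "VEGFA"],
}


def score_spot(expression):
    """Mean signature expression per cell type, rescaled so the scores sum to 1."""
    raw = {
        cell_type: sum(expression.get(g, 0.0) for g in genes) / len(genes)
        for cell_type, genes in SIGNATURES.items()
    }
    total = sum(raw.values())
    if total == 0:
        # No marker expressed in this spot: report zeros instead of dividing by zero.
        return {cell_type: 0.0 for cell_type in raw}
    return {cell_type: score / total for cell_type, score in raw.items()}


# Hypothetical normalized expression for a single Visium spot; absent genes count as 0.
spot = {"EPCAM": 6.0, "KRT7": 4.0, "FAP": 1.0, "COL1A1": 1.0,
        "CD3D": 3.0, "CD8A": 3.0, "HIF1A": 6.0}
for cell_type, fraction in score_spot(spot).items():
    print(f"{cell_type}: {fraction:.2f}")
```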
Method: ComBat (Empirical Bayes) for removing technical variation
Prompt:
Apply ComBat batch correction to PatientOne spatial data to remove
technical variation between tissue sections.
Output:
- Corrected expression matrix
- PCA plots (before/after)
- Batch effect assessment
When to use:
- Multiple tissue sections
- Multi-site studies
- Technical replicates
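To build intuition for what batch correction does, the sketch below applies plain per-batch mean-centering to a single gene. This is a simpler stand-in, not ComBat: ComBat also models per-batch scale and shrinks the batch estimates with empirical Bayes across genes. The function name `center_batches` and the values are hypothetical.

```python
from statistics import mean


def center_batches(values, batches):
    """Shift each batch of one gene's values so its mean matches the grand mean."""
    grand_mean = mean(values)
    batch_means = {
        b: mean([v for v, label in zip(values, batches) if label == b])
        for b in set(batches)
    }
    return [v - batch_means[b] + grand_mean for v, b in zip(values, batches)]


# Hypothetical expression of one gene across two tissue sections.
expression = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
section = ["S1", "S1", "S1", "S2", "S2", "S2"]
print(center_batches(expression, section))
```

Any correction of this kind removes every difference between batches, so it should only be applied when batch is not confounded with the biology under study.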
Method: HAllA association analysis + Stouffer meta-analysis
Prompt:
Integrate PatientOne RNA, protein, and phospho data using Stouffer's method
to identify concordant pathway activations.
Output:
- Combined p-values across modalities
- Concordant pathway list
- Cross-modal correlation analysis
Statistical approach:
- Stouffer's Z-score method
- FDR correction across modalities
- Directionality preserved
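A direction-aware version of Stouffer's method can be sketched with the standard library alone. The block below assumes unweighted combination of two-sided p-values, each paired with the sign of its effect; whether mcp-multiomics weights modalities is not stated here. The function name `stouffer_signed` and the example p-values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_NORMAL = NormalDist()


def stouffer_signed(pvals, directions):
    """Combine two-sided p-values with effect directions (+1 or -1) into one Z and p.

    Each p-value must be greater than 0 and at most 1.
    """
    # Convert each p-value to a signed Z-score so opposing effects cancel.
    z_scores = [d * _NORMAL.inv_cdf(1 - p / 2) for p, d in zip(pvals, directions)]
    z = sum(z_scores) / sqrt(len(z_scores))
    return z, 2 * (1 - _NORMAL.cdf(abs(z)))


# Hypothetical gene measured in RNA, protein, and phospho data, all upregulated.
z, p = stouffer_signed([0.04, 0.03, 0.20], [+1, +1, +1])
print(f"combined Z = {z:.2f}, combined p = {p:.4f}")
```

Because the sign is carried into the Z-scores, a gene that is up in RNA but down in protein yields a combined Z near zero instead of a spuriously small p-value.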
Six external MCP servers complement the custom servers for real-world data access:
| Data Need | External Server | Tools | Type |
|---|---|---|---|
| Real TCGA/cancer genomics | cBioPortal | 12 | Community (self-hosted) |
| Literature search | PubMed | 5 | Anthropic connector |
| Preprint search | bioRxiv & medRxiv | 9 | Anthropic connector |
| Clinical trial matching | ClinicalTrials.gov | 6 | Anthropic connector |
| Nextflow pipelines | Seqera | 7 | Anthropic connector |
| ML models/datasets | Hugging Face | 7 | Community (self-hosted) |
Setup & details: Connect External MCP Servers
Mock ↔ Real alternatives: mcp-mocktcga returns synthetic data — use cBioPortal for real TCGA queries. mcp-mockepic returns synthetic EHR — use mcp-epic for real FHIR access.
Full server details: See Platform Overview for the complete server status matrix.
Quick Summary: Most servers production-ready.
📋 See Server Registry for the complete status matrix, tool-by-tool details, test coverage, and DRY_RUN mode behavior.
Current: Most servers production-ready — see Server Registry for details
Next (3-6 months):
- mcp-mocktcga → mcp-tcga: Wire up real GDC API for TCGA cohort data
Future (6-12 months):
- New servers: Metabolomics, radiomics, single-cell
Per-analysis cost ranges from about $0.32 in DRY_RUN mode to roughly $87 in compute and API tokens for the full PatientOne workflow. This represents a modeled cost reduction relative to traditional methods, pending clinical validation.
Full cost analysis: See Cost Analysis and Value Proposition for detailed breakdowns.
The platform enables tumor microenvironment characterization, drug resistance mechanism discovery, biomarker validation, patient stratification workflows, and — via the cardiometabolic server — CVD risk reclassification research and gap-test prioritisation studies. All workflows use reproducible methods with automated tracking of tool versions, parameters, and data provenance. See OPEN_QUESTIONS.md for 13 specific research questions the platform is positioned to help answer (questions 11–13 address preventive cardiovascular use cases added in v17).
Objective: Map spatial organization of tumor, stromal, and immune compartments
Data requirements:
- Spatial transcriptomics (Visium or similar)
- H&E histology (optional, for validation)
- Clinical annotations
Workflow:
1. Load spatial data → mcp-spatialtools.get_spatial_data_for_patient()
2. Cell type deconvolution → mcp-spatialtools.deconvolve_cell_types()
3. Spatial autocorrelation → mcp-spatialtools.calculate_spatial_autocorrelation()
4. Pathway enrichment by region → mcp-spatialtools.perform_pathway_enrichment()
5. Visualization → mcp-spatialtools.generate_spatial_heatmap()
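The five steps above are normally chained through natural language prompts, but the ordering can also be written down as data. The sketch below is illustrative only: `run_workflow`, `call_tool`, and the `patient_id` argument are assumptions made for this example, and the real argument schema of each tool is not specified here. A stand-in `fake_call_tool` is used so the block runs without any server.

```python
from typing import Any, Callable

# Steps from the workflow above as (server, tool) pairs.
TME_WORKFLOW = [
    ("mcp-spatialtools", "get_spatial_data_for_patient"),
    ("mcp-spatialtools", "deconvolve_cell_types"),
    ("mcp-spatialtools", "calculate_spatial_autocorrelation"),
    ("mcp-spatialtools", "perform_pathway_enrichment"),
    ("mcp-spatialtools", "generate_spatial_heatmap"),
]


def run_workflow(steps, call_tool: Callable[[str, str, dict], Any], patient_id: str):
    """Run each step in order and collect results keyed by tool name.

    call_tool is a hypothetical adapter: (server, tool, arguments) -> result.
    """
    results = {}
    for server, tool in steps:
        results[tool] = call_tool(server, tool, {"patient_id": patient_id})
    return results


def fake_call_tool(server, tool, arguments):
    """Stand-in that echoes the request instead of contacting a server."""
    return f"{server}.{tool}({arguments['patient_id']})"


for tool, result in run_workflow(TME_WORKFLOW, fake_call_tool, "PAT001-OVC-2025").items():
    print(tool, "->", result)
```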
Publications enabled:
- Spatial heterogeneity studies
- Immune infiltration patterns
- Tumor-stroma interactions
- Treatment response prediction
Objective: Identify pathways and genes associated with treatment resistance
Data requirements:
- Multi-omics (RNA, protein, phospho)
- Clinical treatment history
- Genomic variants (optional)
Workflow:
1. Load multi-omics → mcp-multiomics.integrate_omics_data()
2. Stratify by response → Use clinical data to group responders vs. non-responders
3. Association analysis → mcp-multiomics.run_halla_analysis()
4. Pathway analysis → mcp-multiomics.predict_upstream_regulators()
5. Validate with genomics → mcp-fgbio.query_gene_annotations()
Publications enabled:
- Resistance biomarker discovery
- Mechanism-of-action studies
- Combination therapy rationale
- Clinical trial stratification
Objective: Identify prognostic or predictive biomarkers for clinical outcomes
Data requirements:
- Discovery cohort (multi-modal data)
- Validation cohort (independent dataset)
- Clinical outcomes (survival, response)
Workflow:
Discovery:
1. Feature selection → mcp-multiomics.run_halla_analysis()
2. Pathway analysis → mcp-multiomics.predict_upstream_regulators()
3. Candidate biomarkers → Top genes/pathways
Validation:
4. Load validation cohort → mcp-mocktcga.query_tcga_cohorts()
5. Test biomarkers → Statistical validation
6. Clinical correlation → Link to outcomes
Publications enabled:
- Biomarker validation studies
- Prognostic signature development
- Clinical utility assessment
- Regulatory submission support
Objective: Identify molecular subtypes for precision treatment
Data requirements:
- Cohort of 50-200 patients
- Multi-modal data (clinical, genomic, multi-omics)
- Treatment and outcome data
Workflow:
1. Load cohort data → mcp-multiomics.integrate_omics_data()
2. Dimensionality reduction → PCA, UMAP
3. Clustering → Identify subtypes
4. Characterize subtypes → Pathway enrichment per cluster
5. Clinical association → Link subtypes to outcomes
Publications enabled:
- Molecular subtype discovery
- Precision treatment stratification
- Clinical trial design
- Companion diagnostic development
All analyses include:
- Tool versions (server commits, library versions)
- Parameters used (thresholds, methods, corrections)
- Random seeds (where applicable)
- Data provenance (file paths, checksums)
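A provenance record covering these four items could look like the sketch below. The field names, the `provenance_record` helper, and the parameter values are assumptions for illustration; the platform's actual record format is not documented in this section.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def provenance_record(tool, version, params, seed, input_paths):
    """Bundle tool version, parameters, seed, and input checksums into one dict."""
    return {
        "tool": tool,
        "version": version,
        "parameters": params,
        "random_seed": seed,
        "inputs": {str(p): sha256_of(p) for p in input_paths},
    }


# Hypothetical input file created on the fly so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    matrix = Path(tmp) / "expression.csv"
    matrix.write_text("gene,spot1\nEPCAM,6.0\n")
    record = provenance_record(
        "mcp-spatialtools", "0.3.0", {"fdr_alpha": 0.05, "k_neighbors": 6}, 42, [matrix]
    )
    print(json.dumps(record, indent=2))
```

Storing the checksum alongside the file path lets a later reader confirm that a rerun used byte-identical inputs.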
Example methods section:
Spatial pathway enrichment was performed using mcp-spatialtools
(version 0.3.0, commit abc123) with Fisher's exact test on 44
curated pathways (KEGG, Hallmark, GO_BP, Drug_Resistance).
FDR correction was applied using the Benjamini-Hochberg method
with α = 0.05. Spatial graphs were constructed using k=6
nearest neighbors.
All analyses use peer-reviewed statistical methods:
- Differential expression: Mann-Whitney U (non-parametric)
- Multiple testing: Benjamini-Hochberg FDR
- Pathway enrichment: Fisher's exact test
- Spatial autocorrelation: Moran's I
- Meta-analysis: Stouffer's Z-score method
- Batch correction: ComBat (Empirical Bayes)
Synthetic data (PatientOne):
- Fully available in repository
- 100% synthetic, no patient identifiers
- Safe for publication and sharing
- DOI: [To be assigned upon publication]
Real patient data:
- Not included in repository
- Comply with institutional IRB requirements
- HIPAA de-identification built-in (mcp-epic)
- Follow FAIR principles (Findable, Accessible, Interoperable, Reusable)
"What tools are available for spatial transcriptomics analysis?"
"Load PatientOne spatial data and summarize the dataset (number of spots, genes, tissue regions)."
"Identify the top 10 most variable genes in PatientOne spatial data."
"Perform differential expression analysis comparing PatientOne tumor vs. normal regions,
using Mann-Whitney U test with FDR < 0.05."
"Run pathway enrichment on upregulated genes in PatientOne tumor, focusing on
cancer-related KEGG pathways."
"Integrate PatientOne RNA and protein data using Stouffer meta-analysis to identify
concordant pathway activations."
"Apply ComBat batch correction to PatientOne spatial data to remove technical variation
between tissue sections."
"Identify spatially variable genes in PatientOne tumor using Moran's I spatial
autocorrelation (p < 0.05)."
"Perform comprehensive multi-modal analysis for PatientOne:
1. Load clinical data (demographics, diagnoses, medications)
2. Analyze genomic variants (TP53, BRCA1 status)
3. Integrate multi-omics (RNA, protein, phospho)
4. Analyze spatial transcriptomics (pathway enrichment)
5. Synthesize results into treatment recommendations"
"Compare PatientOne's spatial pathway enrichment profile to TCGA ovarian cancer cohort
to identify shared and unique pathway activations."
"Perform cell type deconvolution on PatientOne spatial data and correlate immune
infiltration with spatial pathway enrichment scores."
A: Yes! The platform is designed for research use. Synthetic PatientOne data is 100% safe to publish. For real patient data, ensure IRB approval and follow institutional guidelines.
A: All methods are peer-reviewed and documented:
- Differential expression: Mann-Whitney U
- Multiple testing: Benjamini-Hochberg FDR
- Pathway enrichment: Fisher's exact test
- Spatial autocorrelation: Moran's I
- Meta-analysis: Stouffer's Z-score
- Batch correction: ComBat
A: Citation information will be provided upon publication. For now, reference the GitHub repository and specific tool versions used.
A: Yes! See Extending Servers for how to add custom analysis tools. Current pathway databases: KEGG, Hallmark, GO_BP, Drug_Resistance.
A:
- Clinical: FHIR JSON
- Genomics: VCF, BAM, FASTQ
- Multi-omics: CSV matrices (samples × features)
- Spatial: 10x Visium format, Seurat objects
- Imaging: TIFF, PNG
A: Platform automatically tracks:
- Tool versions (server commits)
- Parameters used
- Data provenance (file paths, checksums)
- Random seeds