Copy-paste prompts for integrating RNA, protein, and phosphoproteomics data.
Servers used: mcp-multiomics (10 tools), mcp-fgbio (4 tools)
Objective: Load RNA, protein, and phospho data for a patient
Load multi-omics data for PatientOne (PAT001-OVC-2025):
- RNA: /data/patient-data/PAT001-OVC-2025/multiomics/rna_counts.csv
- Protein: /data/patient-data/PAT001-OVC-2025/multiomics/protein_abundance.csv
- Phospho: /data/patient-data/PAT001-OVC-2025/multiomics/phospho_abundance.csv
Summarize each dataset:
- Number of samples
- Number of features (genes/proteins)
- Sample groups (tumor vs. normal)
- Data quality metrics
Expected Output:
- RNA: 15 samples, 200 genes
- Protein: 15 samples, 150 proteins
- Phospho: 15 samples, 120 phosphosites
- Groups: 10 tumor, 5 normal samples
- Quality: All datasets normalized and QC-passed
Customize:
- Replace file paths with your data location
- Add specific QC checks: "Check for missing values, outliers"
Objective: Check that multi-omics files are properly formatted
Validate the format of PatientOne multi-omics files:
For each file (RNA, protein, phospho), check:
- CSV format with headers
- Sample IDs match across modalities
- Numeric values (no text in data columns)
- No excessive missing data (< 20% NAs)
- Appropriate value ranges (log2 for RNA, raw for protein/phospho)
Report any formatting issues or inconsistencies.
Expected Output:
- ✓ All files: Valid CSV format
- ✓ Sample IDs: Consistent across modalities (15 samples)
- ✓ Data types: All numeric
- ✓ Missing data: <5% per modality
- ⚠ Note: RNA data is log2-transformed, protein/phospho are raw abundances
Customize:
- Adjust missing data threshold
- Add specific validation rules
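The checks in this prompt can be sketched in plain Python to show what is being asked of the server. This is a minimal sketch, not the mcp-multiomics implementation: the CSV layout (feature IDs in the first column, sample IDs in the header), the helper names `read_matrix` and `validate`, and the missing-value tokens are all assumptions.

```python
import csv
import io


def read_matrix(text):
    """Parse CSV text into (sample_ids, {feature: [cell strings]}).

    Assumed layout (not confirmed by the source): the first column holds
    the feature ID and the remaining header cells are sample IDs.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    return header[1:], {row[0]: row[1:] for row in body}


def validate(modalities, max_missing=0.20):
    """Return a list of human-readable issues; an empty list means OK.

    modalities: {name: (sample_ids, {feature: [cell strings]})}
    """
    issues = []
    # 1. Sample IDs must match across modalities.
    id_sets = {name: set(ids) for name, (ids, _) in modalities.items()}
    reference = set.union(*id_sets.values())
    for name, ids in id_sets.items():
        missing_ids = sorted(reference - ids)
        if missing_ids:
            issues.append(f"{name}: missing samples {missing_ids}")
    # 2. Cells must be numeric, and missing data must stay under the threshold.
    for name, (_, features) in modalities.items():
        cells = [v for values in features.values() for v in values]
        n_missing = 0
        for v in cells:
            if v.strip() in ("", "NA", "NaN"):
                n_missing += 1
                continue
            try:
                float(v)
            except ValueError:
                issues.append(f"{name}: non-numeric value {v!r}")
        if cells and n_missing / len(cells) > max_missing:
            issues.append(f"{name}: {n_missing / len(cells):.0%} missing")
    return issues
```

Passing `max_missing=0.05` would mirror the stricter threshold suggested under Customize.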
Objective: Find genes differentially expressed between conditions
Perform differential expression analysis on PatientOne RNA data:
- Compare: Tumor vs. Normal samples
- Method: Mann-Whitney U test (non-parametric)
- Correction: Benjamini-Hochberg FDR
- Threshold: FDR < 0.05, |log2FC| > 1
Return:
- Top 20 upregulated genes (ranked by fold change)
- Top 20 downregulated genes
- Total significant genes
- Volcano plot visualization
Expected Output:
- Significant genes: 45 (23 up, 22 down)
- Top upregulated: TP53, MYC, VEGFA, ...
- Top downregulated: BRCA1, PTEN, ...
- Volcano plot: [visualization]
Customize:
- Change comparison: "Compare: Pre-treatment vs. Post-treatment"
- Adjust thresholds: "FDR < 0.01, |log2FC| > 1.5"
- Change method: "Method: Welch's t-test"
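To make the correction and threshold steps concrete, here is a small sketch of Benjamini-Hochberg adjustment followed by the FDR and fold-change filter. It is illustrative only: the function names and the `(gene, log2fc, pvalue)` tuple layout are assumptions, and the server may compute this differently (for example with `statsmodels.stats.multitest.multipletests`).

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values (q-values) in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum
    # of p * n / rank over all ranks at or above its own.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def significant_genes(results, fdr=0.05, min_abs_log2fc=1.0):
    """Filter (gene, log2fc, pvalue) rows by q-value and |log2FC|.

    Returns (gene, log2fc, qvalue) for rows passing both thresholds.
    """
    qvalues = benjamini_hochberg([p for _, _, p in results])
    return [
        (gene, log2fc, q)
        for (gene, log2fc, _), q in zip(results, qvalues)
        if q < fdr and abs(log2fc) > min_abs_log2fc
    ]
```

Taking the absolute value of the log2 fold change is what lets a single threshold capture both upregulated and downregulated genes.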
Objective: Find proteins differentially abundant between conditions
Perform differential abundance analysis on PatientOne protein data:
- Compare: Tumor vs. Normal samples
- Method: Mann-Whitney U test
- Correction: Benjamini-Hochberg FDR
- Threshold: FDR < 0.05, fold change > 1.5x in either direction
Return:
- Top 15 upregulated proteins
- Top 15 downregulated proteins
- Overlap with RNA findings (correlation)
Expected Output:
- Significant proteins: 28 (14 up, 14 down)
- Top upregulated: TP53, MYC, EGFR, ...
- RNA-protein concordance: 67%
Customize:
- Adjust fold change: "fold change > 2x (more stringent)"
- Add protein families: "Focus on kinases and transcription factors"
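For readers unfamiliar with the Mann-Whitney U test named in these prompts, the following is a dependency-free sketch using the normal approximation with tie and continuity corrections. It is an assumption-laden illustration: with groups as small as 10 vs. 5 an exact test is often preferable, and `scipy.stats.mannwhitneyu` is the usual production choice. The function names here are hypothetical.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U statistic for x, p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    counts = {}
    for r in ranks:
        counts[r] = counts.get(r, 0) + 1
    tie_term = sum(t**3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0  # all values identical: no evidence of a shift
    mean_u = n1 * n2 / 2
    # Continuity correction of 0.5, floored at zero.
    z = max((abs(u1 - mean_u) - 0.5) / math.sqrt(variance), 0.0)
    return u1, min(2 * NormalDist().cdf(-z), 1.0)
```

Running this once per protein (tumor values as `x`, normal values as `y`) yields the raw p-values that the FDR correction then adjusts.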
Objective: Identify differentially phosphorylated sites
Analyze phosphorylation changes in PatientOne:
- Compare: Tumor vs. Normal samples
- Method: Mann-Whitney U test
- Correction: Benjamini-Hochberg FDR < 0.05
- Threshold: Fold change > 2x (phosphorylation often has larger changes)
Return:
- Top 10 hyperphosphorylated sites (increased phosphorylation)
- Top 10 hypophosphorylated sites (decreased phosphorylation)
- Kinase enrichment analysis (which kinases might be responsible)
Expected Output:
- Significant phosphosites: 18 (9 hyper, 9 hypo)
- Top hyper: AKT1_S473, mTOR_S2448, ERK1_T202, ...
- Enriched kinases: AKT, mTOR, MAPK pathway kinases
Customize:
- Add kinase prediction: "Use kinase-substrate databases (PhosphoSitePlus)"
- Focus on specific pathways: "Focus on PI3K/AKT/mTOR pathway"
Objective: Combine p-values across RNA, protein, and phospho
Perform Stouffer meta-analysis on PatientOne multi-omics data:
- Modalities: RNA, Protein, Phospho
- Method: Stouffer's Z-score method (weighted by sample size)
- Correction: Benjamini-Hochberg FDR < 0.05
- Direction: Report concordant changes (same direction across modalities)
Return:
- Top 20 genes/proteins with concordant activation
- Top 20 with concordant repression
- Combined p-values and effect sizes
Expected Output:
- Concordant activated: 12 genes (significant in ≥2 modalities, same direction)
  - TP53: RNA ↑, Protein ↑ (combined p=1.2e-5)
  - MYC: RNA ↑, Protein ↑, Phospho ↑ (combined p=3.4e-6)
- Concordant repressed: 10 genes
  - BRCA1: RNA ↓, Protein ↓ (combined p=8.7e-4)
Customize:
- Require all 3 modalities: "Only report genes significant in all 3 modalities"
- Weight by effect size: "Weight by fold change, not just p-value"
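Stouffer's method converts each modality's p-value into a z-score, sums the weighted z-scores, and rescales by the root of the summed squared weights. The sketch below is one way to do that with signed z-scores so that opposite directions cancel; the function names and the `(pvalue, direction)` input layout are assumptions, and p-values must lie in (0, 1].

```python
import math
from statistics import NormalDist

_NORMAL = NormalDist()


def signed_z(pvalue, direction):
    """Convert a two-sided p-value plus a direction (+1 up, -1 down) to a z-score."""
    return direction * -_NORMAL.inv_cdf(pvalue / 2)


def stouffer(evidence, weights=None):
    """Combine per-modality evidence with Stouffer's weighted Z method.

    evidence: list of (two_sided_pvalue, direction) pairs, one per modality.
    weights:  one weight per modality (the square root of the sample size
              is a common choice); defaults to equal weights.
    Returns (combined_z, combined_two_sided_pvalue).
    """
    if weights is None:
        weights = [1.0] * len(evidence)
    z_scores = [signed_z(p, d) for p, d in evidence]
    combined_z = sum(w * z for w, z in zip(weights, z_scores)) / math.sqrt(
        sum(w * w for w in weights)
    )
    return combined_z, 2 * _NORMAL.cdf(-abs(combined_z))
```

For example, `stouffer([(0.05, +1), (0.05, +1)])` gives a combined p-value well below 0.05, while `stouffer([(0.05, +1), (0.05, -1)])` returns z = 0 and p = 1, which is why discordant modalities produce no combined signal.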
Objective: Find associations between RNA and protein abundances
Run HAllA (Hierarchical All-against-All association testing) on PatientOne data:
- Dataset 1: RNA expression
- Dataset 2: Protein abundance
- Method: Spearman correlation
- FDR: 0.1 (HAllA is exploratory)
Return:
- Top 10 RNA-protein associations
- Correlation coefficients and p-values
- Biological interpretation (known interactions vs. novel)
Expected Output:
- Significant associations: 23 RNA-protein pairs
- Top association: TP53_RNA ↔ TP53_Protein (r=0.82, p=1.2e-4)
- Novel finding: MYC_RNA ↔ AKT1_Protein (r=0.71, p=3.4e-3)
Customize:
- Add phospho: "3-way analysis: RNA, Protein, Phospho"
- Filter by pathway: "Focus on DNA repair pathway genes"
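The pairwise measure requested here, Spearman correlation, is simply the Pearson correlation of rank-transformed values. The sketch below ranks all RNA-protein pairs by absolute correlation. It is not HAllA itself, which additionally clusters features hierarchically and controls the FDR across blocks; the function names and the `{feature: values}` input layout are assumptions.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed values."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / math.sqrt(var_x * var_y)


def top_associations(rna, protein, k=10):
    """Rank all RNA-protein pairs by |rho|.

    rna, protein: {feature_name: [value per sample]}, same sample order.
    """
    pairs = [
        (r_name, p_name, spearman(r_vals, p_vals))
        for r_name, r_vals in rna.items()
        for p_name, p_vals in protein.items()
    ]
    pairs = [p for p in pairs if not math.isnan(p[2])]
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)[:k]
```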
Objective: Find pathways activated across multiple modalities
Perform pathway enrichment on PatientOne integrated multi-omics results:
- Input: Genes significant in Stouffer meta-analysis (FDR < 0.05)
- Databases: KEGG, Hallmark, GO_BP
- Method: Fisher's exact test
- Correction: Benjamini-Hochberg FDR
- Threshold: FDR < 0.05
Return:
- Top 10 enriched pathways
- Genes per pathway
- Overlap statistics (how many genes, expected vs. observed)
Expected Output:
- Enriched pathways (FDR < 0.05):
  - PI3K/AKT/mTOR signaling (p=8.2e-5, 8 genes)
  - DNA damage response (p=1.3e-4, 6 genes)
  - Cell cycle checkpoints (p=4.7e-4, 5 genes)
  - MAPK signaling (p=6.1e-3, 4 genes)
Customize:
- Add custom pathways: "Include drug resistance pathways"
- Focus on specific databases: "KEGG pathways only"
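The over-representation test behind this prompt can be written as the upper tail of a hypergeometric distribution, which is what a one-sided Fisher's exact test computes. The sketch below also reports the expected overlap, matching the "expected vs. observed" item in the prompt. Function names, the choice of background, and the input layout are assumptions.

```python
from math import comb


def enrichment_pvalue(n_background, n_pathway, n_hits, n_overlap):
    """One-sided Fisher's exact test (hypergeometric upper tail).

    Probability of seeing at least n_overlap pathway genes among n_hits
    genes drawn from a background of n_background genes.
    """
    total = comb(n_background, n_hits)
    upper = min(n_pathway, n_hits)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def enrich(hits, pathways, background):
    """Return (pathway, overlapping genes, expected overlap, p) sorted by p.

    hits:       significant genes (e.g., from the meta-analysis)
    pathways:   {pathway_name: iterable of member genes}
    background: all genes that were tested
    """
    background = set(background)
    hits = set(hits) & background
    results = []
    for name, members in pathways.items():
        members = set(members) & background
        overlap = hits & members
        expected = len(hits) * len(members) / len(background)
        p = enrichment_pvalue(len(background), len(members), len(hits), len(overlap))
        results.append((name, sorted(overlap), expected, p))
    return sorted(results, key=lambda r: r[3])
```

The background should be the set of genes actually measured, not the whole genome; with panels of 120 to 200 features this choice changes the p-values substantially. The raw p-values would still need Benjamini-Hochberg adjustment across pathways.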
Objective: Predict kinases responsible for phosphorylation changes
Identify upstream kinases driving PatientOne phosphorylation changes:
- Input: Differentially phosphorylated sites (FDR < 0.05)
- Method: Kinase-substrate enrichment analysis
- Databases: PhosphoSitePlus, Kinase Library
- Threshold: Enrichment p < 0.01
Return:
- Top 5 predicted active kinases
- Target phosphosites per kinase
- Evidence level (known substrates vs. predicted)
Expected Output:
- Predicted active kinases:
  - AKT1 (p=1.2e-4, 5 known substrates hyperphosphorylated)
  - mTOR (p=3.4e-4, 3 known substrates)
  - ERK1/2 (p=6.7e-3, 2 known substrates)
Customize:
- Add kinase inhibitors: "For each kinase, suggest FDA-approved inhibitors"
- Filter by pathway: "Focus on PI3K/AKT/mTOR pathway kinases"
Objective: Infer transcription factor activity from gene expression
Infer transcription factor (TF) activity from PatientOne RNA data:
- Input: Differentially expressed genes (FDR < 0.05)
- Method: TF target enrichment analysis
- Databases: ENCODE, ChEA
- Threshold: Enrichment p < 0.01
Return:
- Top 5 predicted active TFs
- Target genes per TF
- Concordance with protein changes (if TF protein is measured)
Expected Output:
- Predicted active TFs:
  - TP53 (p=2.1e-5, 12 target genes upregulated)
  - MYC (p=4.3e-4, 8 target genes upregulated)
  - HIF1A (p=8.6e-3, 5 target genes upregulated)
- Concordance check:
  - TP53: Protein mutated (activity loss expected, but targets still upregulated - paradox to investigate)
  - MYC: Protein ↑ (concordant with activity prediction)
Customize:
- Add chromatin data: "Integrate with ATAC-seq to confirm open chromatin"
- Focus on druggable TFs: "Prioritize TFs with known inhibitors"
Objective: Visualize multi-omics data together
Create a heatmap showing PatientOne multi-omics data:
- Rows: Top 50 genes from Stouffer meta-analysis
- Columns: Samples (tumor vs. normal) × Modalities (RNA, Protein, Phospho)
- Color scale: Z-scores (standardized within each modality)
- Clustering: Hierarchical clustering of genes and samples
Save as: /results/patientone_multiomics_heatmap.png
Expected Output:
- Heatmap showing concordant changes across modalities
- Tumor samples cluster separately from normal
- Gene clusters: Activated pathways, Repressed pathways
Customize:
- Filter by pathway: "Only show PI3K/AKT/mTOR pathway genes"
- Adjust clustering: "Use k-means with k=3 clusters"
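The color scale in this prompt depends on standardizing each modality separately, since RNA, protein, and phospho values sit on different scales. A minimal sketch of that step follows. It assumes "standardized within each modality" means z-scoring each feature across samples within its own modality, which is one common reading; the function names are hypothetical.

```python
from statistics import mean, stdev


def zscore_rows(matrix):
    """Standardize each feature (row) to mean 0 and sample SD 1 across samples.

    matrix: {feature: [value per sample]}; constant rows become all zeros.
    """
    scaled = {}
    for feature, values in matrix.items():
        mu, sd = mean(values), stdev(values)
        scaled[feature] = [0.0 if sd == 0 else (v - mu) / sd for v in values]
    return scaled


def heatmap_rows(modalities, genes):
    """Build one heatmap row per gene by concatenating z-scored modality blocks.

    modalities: {modality_name: {gene: [value per sample]}}
    Only genes measured in every modality are kept, so all rows have the
    same length (samples x modalities).
    """
    scaled = {name: zscore_rows(m) for name, m in modalities.items()}
    rows = {}
    for gene in genes:
        if all(gene in block for block in scaled.values()):
            rows[gene] = [v for name in modalities for v in scaled[name][gene]]
    return rows
```

The resulting rows can be passed to any plotting library with hierarchical clustering support.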
Objective: Visualize RNA-protein correlation
Create scatter plot comparing RNA vs. Protein for PatientOne:
- X-axis: RNA log2 fold change (tumor vs. normal)
- Y-axis: Protein log2 fold change (tumor vs. normal)
- Points: All measured genes/proteins
- Highlight: Significant in both (FDR < 0.05)
- Add: Correlation coefficient (Spearman r)
Save as: /results/patientone_rna_protein_correlation.png
Expected Output:
- Correlation: r = 0.67 (moderate positive correlation)
- Concordant genes: 45 (both RNA and protein significant, same direction)
- Discordant genes: 12 (significant in one modality only)
Customize:
- Add labels: "Label top 10 genes with largest discordance"
- Filter by pathway: "Only show DNA repair genes"
Objective: Find genes with protein changes not explained by RNA
Identify post-transcriptional regulation in PatientOne:
- Compare: RNA vs. Protein fold changes
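- Threshold: RNA fold change < 1.5x AND protein fold change > 2x, in either direction (i.e., |log2 RNA FC| < 0.58 and |log2 protein FC| > 1; protein change without RNA change)
- Mechanism: Possible post-transcriptional or post-translational regulation (translation efficiency, protein stability, PTMs)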
Return:
- Genes with protein-only changes
- Phosphorylation status (if phospho data available)
- Predicted regulation mechanisms
Expected Output:
- Post-translationally regulated genes: 8
  - AKT1: RNA FC=1.2 (NS), Protein FC=2.3 (p=0.003), Phospho ↑↑ (S473)
    - Mechanism: Phosphorylation-mediated activation
  - PTEN: RNA FC=0.9 (NS), Protein FC=0.4 (p=0.008)
    - Mechanism: Protein degradation (ubiquitination?)
Customize:
- Adjust thresholds: "Protein FC > 3 for stronger effect"
- Add validation: "Check protein half-life databases"
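Because fold changes are ratios, a protein at 0.4x is as strongly changed as one at 2.5x, so the threshold is easiest to apply on the log2 scale. The sketch below shows that filter. The function name and input layout are assumptions, the AKT1 and PTEN values are the illustrative ones from the expected output above, and `GENE_X` is a made-up control.

```python
import math


def protein_only_changes(fold_changes, max_rna_fc=1.5, min_protein_fc=2.0):
    """Flag genes whose protein changes while RNA stays roughly flat.

    fold_changes: {gene: (rna_fc, protein_fc)} as tumor/normal ratios (> 0).
    Returns {gene: "up" or "down"} describing the protein direction.
    """
    rna_limit = math.log2(max_rna_fc)
    protein_limit = math.log2(min_protein_fc)
    flagged = {}
    for gene, (rna_fc, protein_fc) in fold_changes.items():
        rna_l2, prot_l2 = math.log2(rna_fc), math.log2(protein_fc)
        # Symmetric test on the log scale covers both directions.
        if abs(rna_l2) < rna_limit and abs(prot_l2) > protein_limit:
            flagged[gene] = "up" if prot_l2 > 0 else "down"
    return flagged


example = {
    "AKT1": (1.2, 2.3),    # values from the expected output above
    "PTEN": (0.9, 0.4),    # values from the expected output above
    "GENE_X": (2.5, 3.0),  # hypothetical: RNA also changes, so not flagged
}
print(protein_only_changes(example))
```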
Objective: Analyze multi-omics changes over time (if longitudinal data are available)
If PatientOne has longitudinal samples (pre-treatment, on-treatment, post-treatment):
Analyze temporal changes across modalities:
- Time points: Pre, Week 4, Week 12, Progression
- Modalities: RNA, Protein, Phospho
- Method: Linear mixed models or spline fitting
- Focus: Genes with dynamic changes
Return:
- Early response genes (change by Week 4)
- Late resistance genes (change at progression)
- Temporal patterns (sustained activation, transient, etc.)
Expected Output:
- Early response (Week 4):
  - Apoptosis genes ↑ (RNA + Protein)
  - DNA repair genes ↓
- Resistance (Week 12):
  - PI3K/AKT/mTOR ↑↑ (RNA, Protein, Phospho)
  - Drug efflux pumps ↑ (RNA, Protein)
Customize:
- Add treatment information: "Correlate changes with drug exposure"
- Predict resistance: "Identify early markers of eventual resistance"
Objective: Check for technical batch effects across modalities
Assess batch effects in PatientOne multi-omics data:
- Check for: Sample processing batch, run date, instrument
- Method: PCA, visualize PC1 vs. PC2 colored by batch
- Quantify: Variance explained by batch vs. biology
If batch effects are detected, recommend a correction method (e.g., ComBat or limma's removeBatchEffect).
Expected Output:
- Batch effects detected: Minimal (< 5% variance)
- PCA: PC1 separates tumor/normal (biology), batch has minor effect
- Recommendation: No batch correction needed
Customize:
- Apply correction: "Apply ComBat batch correction to all modalities"
- Assess severity: "Calculate silhouette coefficient for batch separation"
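One simple way to quantify "variance explained by batch vs. biology" is eta squared: the between-group sum of squares divided by the total sum of squares, computed once with batch labels and once with tumor/normal labels. The sketch below assumes the input is a single vector of per-sample scores, such as PC1 coordinates from a PCA run elsewhere; the function name is hypothetical.

```python
def variance_explained(values, labels):
    """Fraction of variance in `values` explained by group `labels` (eta squared).

    values: one number per sample (e.g., PC1 scores or one feature)
    labels: one group label per sample (e.g., batch or tumor/normal)
    """
    grand_mean = sum(values) / len(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0  # no variance to explain
    groups = {}
    for v, label in zip(values, labels):
        groups.setdefault(label, []).append(v)
    between_ss = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return between_ss / total_ss
```

Comparing `variance_explained(pc1, batch)` with `variance_explained(pc1, condition)` gives a rough sense of which factor dominates the leading component. When batch and condition are confounded the two numbers cannot be separated this way.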
Objective: Understand patterns of missing data
Analyze missing data patterns in PatientOne multi-omics:
- RNA: % missing per gene, per sample
- Protein: % missing (common in proteomics)
- Phospho: % missing (often higher)
Identify:
- Genes with excessive missingness (> 30%)
- Samples with excessive missingness (> 20%)
- Whether missingness is random (MCAR) or systematic (MNAR)
Recommend imputation strategy if needed.
Expected Output:
- RNA: 2% missing (minimal)
- Protein: 15% missing (typical for proteomics)
- Phospho: 22% missing (low-abundance phosphosites)
- Pattern: MCAR (missing completely at random)
- Recommendation: Use k-nearest neighbors imputation for protein/phospho
Customize:
- Impute: "Impute missing values using KNN with k=5"
- Filter: "Remove features with >50% missing data"
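The per-feature and per-sample missingness figures requested here, and the filter under Customize, reduce to simple counting. The following sketch assumes missing values are represented as `None` and that each feature maps to one value per sample; the function names are hypothetical. Imputation itself is left out, although scikit-learn's `KNNImputer` is one established option for the KNN step.

```python
def missing_fractions(matrix, sample_ids):
    """Compute missingness per feature and per sample.

    matrix:     {feature: [value or None per sample]}
    sample_ids: sample names in column order
    Returns (per_feature, per_sample) dicts of fractions in [0, 1].
    """
    per_feature = {
        feature: sum(v is None for v in values) / len(values)
        for feature, values in matrix.items()
    }
    per_sample = {
        sample: sum(values[j] is None for values in matrix.values()) / len(matrix)
        for j, sample in enumerate(sample_ids)
    }
    return per_feature, per_sample


def drop_sparse_features(matrix, max_missing=0.5):
    """Keep only features whose missing fraction is at most max_missing."""
    return {
        feature: values
        for feature, values in matrix.items()
        if sum(v is None for v in values) / len(values) <= max_missing
    }
```

Whether missingness tracks low abundance (suggesting MNAR) can then be checked by comparing each feature's missing fraction with its mean observed intensity.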
Objective: Identify treatment opportunities from integrated data
From PatientOne multi-omics analysis, identify druggable targets:
- Input: Activated pathways (PI3K/AKT/mTOR, MAPK)
- Criteria:
* Significant across ≥2 modalities
* Fold change > 2
* Known drug targets (FDA-approved or clinical trials)
Return:
- Ranked list of drug targets
- Available drugs (approved, trial, preclinical)
- Evidence level (NCCN, clinical guidelines)
- Expected efficacy based on multi-omics signature
Expected Output:
- mTOR (activated across RNA, Protein, Phospho)
  - Drug: Everolimus (FDA-approved)
  - Evidence: Level 2 (NCCN Category 2A for PI3K/mTOR activation)
  - Expected response: 30-40% based on signature
- AKT1 (hyperphosphorylated at S473)
  - Drug: Capivasertib (FDA-approved with fulvestrant for HR-positive, HER2-negative breast cancer; investigational in other tumor types)
  - Evidence: Phase III trials in AKT-activated cancers
  - Expected response: 25-35%
Customize:
- Combination therapy: "Suggest drug combinations for activated pathways"
- Clinical trials: Use the ClinicalTrials.gov connector to search for trials targeting activated pathways, and PubMed for published efficacy data. See Connect External MCP.
"Sample IDs don't match across modalities"
- Check: Sample naming conventions (e.g., "Sample_01" vs. "Sample01")
- Solution: Standardize IDs before loading
- Prompt: "Rename samples to match: Sample_01 → Sample01"
"Too many significant genes (> 1000)"
- Issue: Threshold too lenient or large effect sizes
- Solution: Increase stringency (FDR < 0.01) or add fold change filter
- Prompt: "Re-analyze with FDR < 0.01 and |log2FC| > 1.5"
"No concordance between RNA and protein"
- Issue: Post-translational regulation, technical noise, or batch effects
- Solution: Check batch effects, assess protein data quality
- Prompt: "Assess batch effects and protein data quality for PatientOne"
"Stouffer meta-analysis finds nothing"
- Issue: Effects in opposite directions across modalities
- Solution: Check individual modalities first; discordance may reflect real biology (e.g., PTM regulation)
- Prompt: "Compare RNA vs. Protein direction for top DE genes"
Related Prompts:
- Clinical-Genomic Prompts - Link genomic variants to multi-omics
- Spatial Prompts - Add spatial context to omics
- Complete Workflows - Integrate everything
Last Updated: 2026-01-14