Comprehensive gene enrichment and pathway analysis skill for the ToolUniverse ecosystem.
Perform publication-ready gene enrichment analysis using Over-Representation Analysis (ORA) and Gene Set Enrichment Analysis (GSEA). Integrates local computation via gseapy with ToolUniverse APIs (PANTHER, STRING, Reactome) for cross-validated results across 225+ gene set libraries.
| Capability | Method | Tool |
|---|---|---|
| GO Enrichment (BP, MF, CC) | ORA / GSEA | gseapy, PANTHER, STRING |
| KEGG Pathway Enrichment | ORA / GSEA | gseapy, STRING |
| Reactome Pathway Enrichment | ORA | gseapy, ReactomeAnalysis API |
| WikiPathways Enrichment | ORA | gseapy |
| MSigDB Hallmark Enrichment | ORA / GSEA | gseapy |
| Multiple Testing Correction | BH, Bonferroni, BY, Holm | statsmodels |
| Gene ID Conversion | Symbol/Ensembl/Entrez/UniProt | MyGene, STRING |
| Cross-Validation | Multi-source consensus | gseapy + PANTHER + STRING |
| Network Context | PPI enrichment | STRING |
| Multi-Organism | human, mouse, rat, fly, worm, yeast, zebrafish | All tools |
import gseapy
# Standard GO enrichment (ORA)
result = gseapy.enrichr(
gene_list=["TP53", "BRCA1", "EGFR", "MYC", "AKT1", "PTEN"],
gene_sets='GO_Biological_Process_2021',
organism='human',
outdir=None,
no_plot=True,
)
sig = result.results[result.results['Adjusted P-value'] < 0.05]
print(sig[['Term', 'Adjusted P-value', 'Overlap']].head(5))- 225+ Enrichr libraries via gseapy (GO, KEGG, Reactome, WikiPathways, MSigDB, DisGeNET, CellMarker, GTEx, ChEA, ENCODE, and more)
- PANTHER over-representation analysis (curated GO, PANTHER Pathways, Reactome)
- STRING functional enrichment (GO, KEGG, Reactome, WikiPathways, COMPARTMENTS, DISEASES, HPO)
- Reactome Analysis Service pathway overrepresentation
Human (9606), Mouse (10090), Rat (10116), Drosophila (7227), C. elegans (6239), Yeast (4932), Zebrafish (7955)
| File | Description |
|---|---|
SKILL.md |
Complete skill specification with 8-phase workflow, tool reference, response formats |
QUICK_START.md |
10 use case examples with copy-paste code |
test_gene_enrichment.py |
50-test comprehensive test suite (100% pass rate) |
README.md |
This file |
Total: 50 | Passed: 50 | Failed: 0 | Time: 63.4s
Pass Rate: 50/50 (100.0%)
| Phase | Tests | Description |
|---|---|---|
| gseapy ORA | 9 | GO BP/MF/CC, KEGG, Reactome, Hallmark, WikiPathways, multi-library, structure |
| GSEA | 3 | Preranked GO/KEGG, NES direction |
| PANTHER | 4 | GO BP/MF/CC, PANTHER Pathways |
| STRING | 3 | Functional enrichment, KEGG, PPI enrichment |
| Reactome API | 2 | Pathway enrichment, cross-validation |
| ID Conversion | 3 | MyGene batch, STRING map, enrichment after conversion |
| Multiple Testing | 3 | BH, Bonferroni, method comparison |
| Cross-Validation | 2 | GO BP consensus, KEGG consensus |
| Gene Lists | 4 | Immune, signaling, small (3), large (43) |
| Term Lookup | 2 | Specific GO term, specific KEGG pathway |
| GO Details | 2 | Term details, gene annotations |
| Pathway Details | 3 | Reactome, KEGG, WikiPathways |
| Organisms | 2 | Mouse gseapy, PANTHER mouse |
| Library Discovery | 2 | List libraries, version verification |
| BixBench | 4 | Adjusted p-val, most enriched, GSEA metabolic, enrichGO equivalent |
| Integration | 2 | Full pipeline, comparative enrichment |
This skill is designed to answer BixBench-style enrichment questions:
-
"What is the adjusted p-val for [GO term] from enrichGO enrichment analysis?"
- Use
gseapy.enrichr()with GO library, search results for specific term
- Use
-
"Using gseapy and [library], which pathway is most significantly enriched?"
- Use
gseapy.enrichr()with specified library, sort by Adjusted P-value
- Use
-
"In a KEGG pathway enrichment analysis, what is the p-value for the most enriched pathway?"
- Use
gseapy.enrichr()with KEGG_2021_Human, report top result
- Use
-
"What GO Biological Processes are enriched (BH adjusted p < 0.05)?"
- Use
gseapy.enrichr(), filter by Adjusted P-value < 0.05
- Use
-
"Compare enrichment using Bonferroni vs BH correction"
- Use
statsmodels.stats.multitest.multipletests()with both methods
- Use
- Required: gseapy, pandas, scipy, statsmodels
- ToolUniverse: PANTHER, STRING, Reactome, MyGene, GO, QuickGO, WikiPathways, KEGG tools
- Optional: numpy (for GSEA ranked lists)
pip install gseapy pandas scipy statsmodels