name	bio-pathway-go-enrichment
description	Gene Ontology over-representation analysis using clusterProfiler enrichGO. Use when identifying biological functions enriched in a gene list from differential expression or other analyses. Supports all three ontologies (BP, MF, CC), multiple ID types, and customizable statistical thresholds.
tool_type	r
primary_tool	clusterProfiler

Version Compatibility

Reference examples tested with: R stats (base), clusterProfiler 4.10+

Before using code patterns, verify installed versions match. If versions differ:

R: packageVersion('<pkg>') then ?function_name to verify parameters

If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.

GO Over-Representation Analysis

Core Pattern

Goal: Identify enriched Gene Ontology terms in a gene list from differential expression or similar analyses.

Approach: Test for over-representation of GO terms using the hypergeometric test via clusterProfiler enrichGO.

"Run GO enrichment on my gene list" → Test whether biological process, molecular function, or cellular component terms are over-represented among significant genes.

library(clusterProfiler)
library(org.Hs.eg.db)  # Human - change for other organisms

ego <- enrichGO(
    gene = gene_list,           # Character vector of gene IDs
    OrgDb = org.Hs.eg.db,       # Organism annotation database
    keyType = 'ENTREZID',       # ID type: ENSEMBL, SYMBOL, ENTREZID, etc.
    ont = 'BP',                 # BP, MF, CC, or ALL
    pAdjustMethod = 'BH',       # p-value adjustment method
    pvalueCutoff = 0.05,
    qvalueCutoff = 0.2
)

Prepare Gene List from DE Results

Goal: Extract significant gene IDs from differential expression results and convert to the format required by enrichGO.

Approach: Filter DE results by adjusted p-value and fold change, then convert gene symbols to Entrez IDs using bitr.

library(dplyr)

de_results <- read.csv('de_results.csv')

sig_genes <- de_results %>%
    filter(padj < 0.05, abs(log2FoldChange) > 1) %>%
    pull(gene_id)

# If using gene symbols, convert to Entrez IDs
gene_ids <- bitr(sig_genes, fromType = 'SYMBOL', toType = 'ENTREZID', OrgDb = org.Hs.eg.db)
gene_list <- gene_ids$ENTREZID

ID Conversion with bitr

Goal: Convert between gene identifier types (Ensembl, Symbol, Entrez) for compatibility with enrichment tools.

Approach: Use clusterProfiler bitr to map between ID types using organism annotation databases.

# Check available key types
keytypes(org.Hs.eg.db)

# Convert between ID types
converted <- bitr(genes, fromType = 'ENSEMBL', toType = 'ENTREZID', OrgDb = org.Hs.eg.db)

# Multiple output types
converted <- bitr(genes, fromType = 'SYMBOL', toType = c('ENTREZID', 'ENSEMBL'), OrgDb = org.Hs.eg.db)

With Background Universe

Goal: Improve enrichment specificity by restricting the background to genes actually tested in the experiment.

Approach: Pass all expressed genes (not just significant ones) as the universe parameter to enrichGO.

# Use all expressed genes as background (recommended)
all_genes <- de_results$gene_id
universe_ids <- bitr(all_genes, fromType = 'SYMBOL', toType = 'ENTREZID', OrgDb = org.Hs.eg.db)

ego <- enrichGO(
    gene = gene_list,
    universe = universe_ids$ENTREZID,  # Background gene set
    OrgDb = org.Hs.eg.db,
    keyType = 'ENTREZID',
    ont = 'BP',
    pAdjustMethod = 'BH',
    pvalueCutoff = 0.05
)

All Three Ontologies

# Run all ontologies at once
ego_all <- enrichGO(
    gene = gene_list,
    OrgDb = org.Hs.eg.db,
    keyType = 'ENTREZID',
    ont = 'ALL',  # BP, MF, and CC combined
    pAdjustMethod = 'BH',
    pvalueCutoff = 0.05
)

# Results include ONTOLOGY column
head(as.data.frame(ego_all))

Make Results Readable

# Convert Entrez IDs to gene symbols in results
ego_readable <- setReadable(ego, OrgDb = org.Hs.eg.db, keyType = 'ENTREZID')

# Or use readable = TRUE directly (only works with ENTREZID input)
ego <- enrichGO(
    gene = gene_list,
    OrgDb = org.Hs.eg.db,
    keyType = 'ENTREZID',
    ont = 'BP',
    readable = TRUE  # Converts to symbols
)

Extract and Export Results

# View top results
head(ego)

# Convert to data frame
results_df <- as.data.frame(ego)

# Key columns: ID, Description, GeneRatio, BgRatio, pvalue, p.adjust, qvalue, geneID, Count

# Export to CSV
write.csv(results_df, 'go_enrichment_results.csv', row.names = FALSE)

# Filter for specific criteria
sig_terms <- results_df[results_df$p.adjust < 0.01 & results_df$Count >= 5, ]

Simplify Redundant Terms

Goal: Remove highly similar GO terms to reduce redundancy in enrichment results.

Approach: Cluster GO terms by semantic similarity and retain representative terms using the simplify function.

# Remove redundant GO terms (keeps representative terms)
ego_simplified <- simplify(ego, cutoff = 0.7, by = 'p.adjust', select_fun = min)

Different Organisms

# Mouse
library(org.Mm.eg.db)
ego_mouse <- enrichGO(gene = genes, OrgDb = org.Mm.eg.db, ont = 'BP')

# Zebrafish
library(org.Dr.eg.db)
ego_zfish <- enrichGO(gene = genes, OrgDb = org.Dr.eg.db, ont = 'BP')

# Yeast
library(org.Sc.sgd.db)
ego_yeast <- enrichGO(gene = genes, OrgDb = org.Sc.sgd.db, ont = 'BP', keyType = 'ORF')

Group GO Terms by Ancestor

Goal: Classify genes by broad GO slim categories for a high-level functional overview.

Approach: Use groupGO to assign genes to GO terms at a specific hierarchy level.

# Classify genes by GO slim categories
ggo <- groupGO(
    gene = gene_list,
    OrgDb = org.Hs.eg.db,
    ont = 'BP',
    level = 3,  # GO hierarchy level
    readable = TRUE
)

Key Parameters

Parameter	Default	Description
gene	required	Vector of gene IDs
OrgDb	required	Organism database
keyType	ENTREZID	Input ID type
ont	BP	BP, MF, CC, or ALL
pvalueCutoff	0.05	P-value threshold
qvalueCutoff	0.2	Q-value (FDR) threshold
pAdjustMethod	BH	BH, bonferroni, etc.
universe	NULL	Background genes
minGSSize	10	Min genes per term
maxGSSize	500	Max genes per term
readable	FALSE	Convert to symbols

Related Skills

kegg-pathways - KEGG pathway enrichment
gsea - Gene Set Enrichment Analysis for GO
enrichment-visualization - Visualize enrichment results
differential-expression - Generate input gene lists

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Version Compatibility

GO Over-Representation Analysis

Core Pattern

Prepare Gene List from DE Results

ID Conversion with bitr

With Background Universe

All Three Ontologies

Make Results Readable

Extract and Export Results

Simplify Redundant Terms

Different Organisms

Group GO Terms by Ancestor

Key Parameters

Related Skills

FilesExpand file tree

SKILL.md

Latest commit

History

SKILL.md

File metadata and controls

Version Compatibility

GO Over-Representation Analysis

Core Pattern

Prepare Gene List from DE Results

ID Conversion with bitr

With Background Universe

All Three Ontologies

Make Results Readable

Extract and Export Results

Simplify Redundant Terms

Different Organisms

Group GO Terms by Ancestor

Key Parameters

Related Skills