This workflow builds genome-scale metabolic models (GEMs) from genomic data and uses them to predict cellular phenotypes. It covers automated reconstruction, iterative curation, flux balance analysis, and applications like gene essentiality prediction and context-specific modeling.
pip install cobra carveme memote escher pandas numpy matplotlib
conda install -c bioconda diamond

Optional solvers (faster for large models):

conda install -c gurobi gurobi
conda install -c ibmdecisionoptimization cplex

Tell your AI agent what you want to do:
- "Build a metabolic model from my genome annotation"
- "Run FBA to predict growth rate on glucose minimal medium"
- "Find essential genes in my metabolic model"
- "Gap-fill my model to grow on M9 medium"
- "Create a genome-scale metabolic model from my E. coli protein FASTA"
- "Reconstruct a GEM for this Pseudomonas genome"
- "Run memote QC on my metabolic model"
- "Check for blocked reactions and dead-end metabolites"
- "Predict growth rate and flux distribution on glucose"
- "Run FVA to find flux ranges for glycolysis reactions"
- "What's the maximum theoretical ethanol yield?"
- "Predict essential genes in my model"
- "Find synthetic lethal gene pairs"
- "Build a liver-specific model using my RNA-seq data"
- "Constrain the model to match measured uptake rates"
| Input | Format | Description |
|---|---|---|
| Protein sequences | FASTA | Annotated proteome for reconstruction |
| Existing model | SBML/JSON | For analysis or curation |
| Media composition | Dict/TSV | Exchange reaction bounds |
| Expression data (optional) | TSV | Gene-level TPM for context models |
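As a rough illustration of the "Media composition" input, the sketch below parses a two-column TSV into a dict and applies it through COBRApy's `model.medium` setter. The TSV layout (no header, exchange ID then maximum uptake rate) and the BiGG-style exchange IDs are assumptions made for this example, not a format the workflow prescribes.

```python
import csv
import io


def parse_medium_tsv(text):
    """Parse 'exchange_id<TAB>max_uptake' rows into a dict.

    Assumed layout: no header row, blank lines and lines starting
    with '#' are skipped, uptake rates are positive numbers.
    """
    medium = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        medium[row[0].strip()] = float(row[1])
    return medium


def apply_medium(model, medium):
    """Apply a medium dict to a cobra.Model.

    Assigning to model.medium sets the uptake bound of each listed
    exchange and closes uptake through every exchange not listed.
    """
    known = {rxn.id for rxn in model.exchanges}
    missing = sorted(set(medium) - known)
    if missing:
        raise KeyError(f"exchange reactions not in model: {missing}")
    model.medium = medium


# Hypothetical glucose minimal medium using BiGG-style IDs.
EXAMPLE_TSV = "# exchange\tmax uptake\nEX_glc__D_e\t10\nEX_o2_e\t20\n"
example_medium = parse_medium_tsv(EXAMPLE_TSV)
```

Checking the IDs before assignment gives a clearer error when the medium file and the model use different namespaces.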
- Reconstruction: Generates draft model from protein sequences using CarveMe
- Validation: Runs memote QC to identify model issues
- Curation: Gap-fills for growth, fixes dead-ends, adds missing GPRs
- FBA Analysis: Predicts optimal growth and flux distribution
- Applications: Gene essentiality, context-specific models, yield prediction
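The first two steps above could be driven from Python roughly as follows. This is a minimal sketch: the helper names are hypothetical, the file names are placeholders, and the CarveMe (`carve`) and memote flags should be checked against the versions you have installed.

```python
import subprocess


def carve_command(proteome_fasta, output_sbml, gapfill_medium=None):
    """Build the CarveMe command for a draft reconstruction."""
    cmd = ["carve", proteome_fasta, "-o", output_sbml]
    if gapfill_medium:
        # Ask CarveMe to gap-fill for growth on a named medium, e.g. "M9".
        cmd += ["--gapfill", gapfill_medium]
    return cmd


def memote_command(model_sbml, report_html):
    """Build the memote command for a one-off (snapshot) QC report."""
    return ["memote", "report", "snapshot",
            "--filename", report_html, model_sbml]


def run(cmd):
    """Run one pipeline step and raise if the tool exits with an error."""
    subprocess.run(cmd, check=True)


def load_model(sbml_path):
    """Load the reconstructed model for curation and FBA."""
    import cobra  # imported here so the command builders work without it
    return cobra.io.read_sbml_model(sbml_path)
```

A typical sequence would be `run(carve_command("genome.faa", "model.xml", "M9"))`, then `run(memote_command("model.xml", "report.html"))`, then `load_model("model.xml")`.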
| Parameter | Default | Description |
|---|---|---|
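| Memote score threshold | 50% | Minimum total score for a usable model |
| Growth threshold | 0.01 h^-1 | Minimum growth rate counted as viable |
| FVA optimality fraction | 0.9 | Flux ranges are computed while keeping at least 90% of maximum growth |
| Essentiality threshold | 10% of wild type | Knockouts growing below this fraction are called essential |
| Expression percentile | 25th | Expression cutoff for context-specific models |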
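One way these defaults might be wired into COBRApy calls is sketched below. The `Thresholds` class and function names are hypothetical. Reading the expression percentile as "genes below it count as lowly expressed" is an assumption, since the table only calls it a cutoff.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass(frozen=True)
class Thresholds:
    """Defaults copied from the parameter table."""
    min_growth: float = 0.01             # h^-1
    fva_fraction: float = 0.9            # fraction of optimal growth
    essentiality_fraction: float = 0.10  # fraction of wild-type growth
    expression_percentile: int = 25      # context-model cutoff


def is_essential(knockout_growth, wild_type_growth, t=Thresholds()):
    """Essential if knockout growth falls below 10% of wild type."""
    return knockout_growth < t.essentiality_fraction * wild_type_growth


def predict_essential_genes(model, t=Thresholds()):
    """Knock out each gene of a cobra.Model in turn and re-run FBA."""
    wild_type = model.slim_optimize(error_value=0.0)
    if wild_type < t.min_growth:
        raise ValueError("wild-type model does not grow")
    essential = []
    for gene in model.genes:
        with model:  # the knockout is reverted when the block exits
            gene.knock_out()
            growth = model.slim_optimize(error_value=0.0)
        if is_essential(growth, wild_type, t):
            essential.append(gene.id)
    return essential


def flux_ranges(model, reaction_ids, t=Thresholds()):
    """FVA restricted to solutions with >= 90% of maximum growth."""
    from cobra.flux_analysis import flux_variability_analysis
    return flux_variability_analysis(
        model, reaction_list=reaction_ids,
        fraction_of_optimum=t.fva_fraction)


def expression_cutoff(tpm_by_gene, t=Thresholds()):
    """TPM value at the configured percentile (assumed cutoff rule)."""
    cut_points = quantiles(tpm_by_gene.values(), n=100, method="inclusive")
    return cut_points[t.expression_percentile - 1]
```

COBRApy also ships `single_gene_deletion` for the knockout scan. The explicit loop is used here only because it makes the threshold logic visible.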
A well-curated model should have:
- Growth rate in realistic range (0.1-1.0 h^-1 for bacteria)
- Memote score >50% (ideally >70%)
- <100 blocked reactions
- <50 dead-end metabolites
- Essential genes overlap >70% with experimental data
- Key pathways (glycolysis, TCA, etc.) carry flux
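The numeric criteria above can be collected into a simple checklist function. The sketch below is illustrative only: the metric names are hypothetical, and "overlap" is assumed to mean the fraction of experimentally essential genes that the model also predicts as essential. The pathway-flux criterion is not covered because it depends on which reactions you consider key.

```python
def essentiality_overlap(predicted, experimental):
    """Fraction of experimentally essential genes also predicted essential."""
    experimental = set(experimental)
    if not experimental:
        raise ValueError("no experimental essential genes supplied")
    return len(set(predicted) & experimental) / len(experimental)


def failed_quality_checks(metrics):
    """Return the names of the criteria a model does not meet.

    metrics is a dict with keys growth_rate (h^-1), memote_score and
    essentiality_overlap (both as fractions), blocked_reactions and
    dead_end_metabolites (both as counts).
    """
    checks = {
        "growth rate in 0.1-1.0 h^-1": 0.1 <= metrics["growth_rate"] <= 1.0,
        "memote score > 50%": metrics["memote_score"] > 0.50,
        "< 100 blocked reactions": metrics["blocked_reactions"] < 100,
        "< 50 dead-end metabolites": metrics["dead_end_metabolites"] < 50,
        "essentiality overlap > 70%": metrics["essentiality_overlap"] > 0.70,
    }
    return [name for name, passed in checks.items() if not passed]


def count_blocked_reactions(model):
    """Count reactions of a cobra.Model that cannot carry any flux."""
    from cobra.flux_analysis import find_blocked_reactions
    return len(find_blocked_reactions(model))
```

An empty list from `failed_quality_checks` means all five numeric criteria pass.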
- Start with CarveMe: It is fast and largely automated, so it gives a draft model quickly
- Gap-fill iteratively: Fix one issue at a time, re-test growth
- Validate against data: Compare predictions to experimental phenotypes
- Use commercial solvers: Gurobi and CPLEX are typically much faster than GLPK on large models
- Document changes: Keep track of manual curation steps
- Constrain realistically: Set uptake bounds based on experimental data
- Test edge cases: Ensure model behaves correctly under different conditions
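The last two practices (realistic constraints and edge-case testing) can be combined into a small regression check, sketched below. The function names, the case format, and the exchange IDs in the usage note are assumptions for illustration. `model` is expected to be a `cobra.Model`.

```python
def growth_on(model, medium):
    """Predicted growth rate on a medium, leaving the model unchanged."""
    with model:  # bounds set via model.medium are restored on exit
        model.medium = medium
        # error_value is returned when the problem is infeasible
        return model.slim_optimize(error_value=0.0)


def run_edge_cases(model, cases, min_growth=0.01):
    """Check growth/no-growth expectations and return the failing cases.

    cases is a list of (name, medium_dict, expect_growth) tuples;
    min_growth is the viability threshold in h^-1.
    """
    failures = []
    for name, medium, expect_growth in cases:
        grows = growth_on(model, medium) >= min_growth
        if grows != expect_growth:
            failures.append(name)
    return failures
```

For example, a case list might expect growth on `{"EX_glc__D_e": 10, "EX_o2_e": 20}` and no growth on `{"EX_o2_e": 20}`, since a model that grows without a carbon source usually has a curation error. Re-running the same cases after each curation step helps catch regressions.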