This skill enables AI agents to help you analyze codon usage patterns in coding sequences using Biopython. It covers codon counting, CAI calculation, RSCU analysis, and codon optimization for heterologous expression.
pip install biopython

Tell your AI agent what you want to do:
- "Calculate codon frequencies for this coding sequence"
- "What is the CAI of this gene for E. coli expression?"
- "Find rare codons in my sequence"
- "Optimize this gene for yeast expression"
- "Count the codons in this coding sequence and show frequencies"
- "Calculate the Codon Adaptation Index for this gene using E. coli codon usage"
- "Calculate RSCU values for my coding sequence"
- "Find codons that are used less than 10% of the time"
- "Replace rare codons with preferred synonymous codons for E. coli"
- "Compare codon usage between these two genes"
- "What is the GC content at each codon position?"

Your agent will:
- Import Bio.SeqUtils.CodonUsage (or, in newer Biopython releases where that module is deprecated, Bio.SeqUtils.CodonAdaptationIndex) and related modules
- Parse the coding sequence
- Calculate requested metrics (CAI, RSCU, frequencies)
- Return formatted analysis results
- Suggest optimizations if requested
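
The parsing and counting steps above can be sketched in plain Python. This is a minimal illustration only, not Biopython's own implementation: the helper names `count_codons` and `codon_frequencies` are hypothetical, and the sketch assumes an in-frame DNA or RNA coding sequence.

```python
from collections import Counter


def count_codons(cds):
    """Count codons in an in-frame coding sequence (hypothetical helper)."""
    # Drop whitespace, normalise case, and treat RNA input as DNA.
    seq = "".join(cds.split()).upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not divisible by 3")
    return Counter(seq[i:i + 3] for i in range(0, len(seq), 3))


def codon_frequencies(cds):
    """Return each codon's share of all codons in the sequence."""
    counts = count_codons(cds)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Print one codon per line with its frequency.
for codon, freq in sorted(codon_frequencies("ATGGCTGCTAAATAA").items()):
    print(f"{codon}\t{freq:.2f}")
```

Rejecting out-of-frame input early keeps a frameshifted sequence from silently producing meaningless counts.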

Codon Adaptation Index (CAI):
- Measures how well adapted a gene's codon usage is to a host organism
- Range: 0-1 (higher = better adapted)
- Requires a reference set of highly expressed genes
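
To make the index concrete, here is a rough plain-Python sketch of how it is commonly computed: each codon gets a weight equal to its count in the reference set divided by the count of the most used synonymous codon, and the score is the geometric mean of those weights over the gene. The helper names, the standard genetic code, and the 0.5 pseudocount for unseen codons are assumptions made for illustration; this is not Biopython's implementation.

```python
import math
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def split_codons(seq):
    """Split a sequence into complete codons, ignoring a trailing partial codon."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_genes, pseudocount=0.5):
    """Weight each codon by its count relative to the most used synonymous codon."""
    counts = Counter(c for gene in reference_genes for c in split_codons(gene))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    weights = {}
    for aa, family in families.items():
        top = max(counts[c] for c in family)
        if aa == "*" or len(family) == 1 or top == 0:
            continue  # skip stops, Met/Trp, and amino acids absent from the reference
        for codon in family:
            # Assumption: unseen codons get a small pseudocount so log() is defined.
            weights[codon] = max(counts[codon], pseudocount) / top
    return weights


def cai(gene, weights):
    """Geometric mean of the weights of the gene's scorable codons."""
    logs = [math.log(weights[c]) for c in split_codons(gene) if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))
```

A gene built only from the reference set's preferred codons scores 1.0 under this sketch. Real analyses need a reference set of many highly expressed genes rather than a single toy sequence.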

Relative Synonymous Codon Usage (RSCU):
- Measures bias toward specific synonymous codons
- 1.0 = no bias (all synonymous codons used equally)
- >1 = overused, <1 = underused
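
As an illustration of these values, the sketch below divides each codon's observed count by the count expected if all synonymous codons for that amino acid were used equally. It is a plain-Python sketch under the standard genetic code; the function name `rscu` is hypothetical and stop codons are left out.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def rscu(cds):
    """Return RSCU per codon for every amino acid present in the sequence."""
    seq = cds.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # stop codons are excluded
            families[aa].append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid not used in this sequence
        for codon in family:
            # observed count / (total for the amino acid / number of synonyms)
            values[codon] = counts[codon] * len(family) / total
    return values
```

For example, a sequence that encodes alanine three times with GCT and once with GCC gives GCT a value of 3.0 and GCC a value of 1.0, while the unused GCA and GCG get 0.0.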

Effective Number of Codons (ENC):
- Measures overall codon bias
- Range: 20-61
- Lower values = more biased usage
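
The 20-61 range follows from how this measure is usually defined. The sketch below uses the commonly cited formulation Nc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, where each F is the average codon homozygosity of the amino acids with that many synonymous codons. It is a plain-Python sketch and not a Biopython call: the function name `enc`, the standard genetic code, and the handling of missing amino acid classes are assumptions.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def enc(cds):
    """Estimate the effective number of codons for a coding sequence."""
    seq = cds.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)
    # Homozygosity F per amino acid, grouped by number of synonymous codons.
    by_class = defaultdict(list)
    for family in families.values():
        n = sum(counts[c] for c in family)
        if len(family) == 1 or n < 2:
            continue  # Met/Trp carry no choice; n < 2 gives no estimate
        sum_p2 = sum((counts[c] / n) ** 2 for c in family)
        f = (n * sum_p2 - 1) / (n - 1)
        if f > 0:
            by_class[len(family)].append(f)
    mean_f = {k: sum(v) / len(v) for k, v in by_class.items()}
    # Assumption: if isoleucine (the only 3-fold class) is unusable,
    # approximate F3 as the average of F2 and F4.
    if 3 not in mean_f and 2 in mean_f and 4 in mean_f:
        mean_f[3] = (mean_f[2] + mean_f[4]) / 2
    if any(k not in mean_f for k in (2, 3, 4, 6)):
        raise ValueError("sequence too short or too sparse to estimate ENC")
    nc = 2 + 9 / mean_f[2] + 1 / mean_f[3] + 5 / mean_f[4] + 3 / mean_f[6]
    return min(nc, 61.0)  # cap at the number of sense codons
```

A gene that uses exactly one codon per amino acid reaches the lower bound of 20, and the estimate is unreliable for short sequences where few amino acids occur more than once.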

Tips:
- Ensure sequences are in frame (length divisible by 3)
- Stop codons are usually excluded from analysis
- CAI requires training on reference genes from the target organism
- Consider organism-specific codon tables for non-standard genetic codes
- GC3 (GC content at the third, wobble position of codons) correlates with overall genome GC content
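
The frame check and the GC3 tip can be combined in a short plain-Python sketch. The function name `gc_by_position` is hypothetical, and the sketch returns fractions between 0 and 1 rather than percentages.

```python
def gc_by_position(cds):
    """Return the GC fraction at codon positions 1, 2 and 3 (GC1, GC2, GC3)."""
    seq = "".join(cds.split()).upper()
    if not seq or len(seq) % 3 != 0:
        raise ValueError("sequence must be non-empty and in frame")
    result = {}
    for pos in range(3):
        bases = seq[pos::3]  # every third base, starting at this codon position
        result[f"GC{pos + 1}"] = sum(b in "GC" for b in bases) / len(bases)
    return result


print(gc_by_position("ATGGCCAAG"))
```

Whether to strip the stop codon before calling such a helper is left to the caller, in line with the tip that stop codons are usually excluded.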