Determine which proteins are present in a sample from identified peptides, handling shared peptides between homologous proteins through grouping and parsimony.
pip install pyopenms pandas
# CLI: ProteinProphet (TPP), EPIFANY (OpenMS)
# R alternative: BiocManager::install("MSnbase")

Tell your AI agent what you want to do:
- "Group proteins by shared peptide evidence from my search results"
- "Apply parsimony to find the minimum protein set"
- "Filter to proteins with at least 2 unique peptides"
- "Parse MaxQuant proteinGroups.txt and explain the protein group structure"
- "Identify protein groups that share all peptide evidence (indistinguishable)"
- "List proteins with unique peptides vs those identified only by shared peptides"
- "Apply parsimony principle to report minimum protein set explaining all peptides"
- "Run EPIFANY for probabilistic protein inference with FDR control"
- "Use Occam's razor to assign shared peptides to the most likely protein"
- "Filter to proteins with at least 2 unique peptides for confident identification"
- "Apply 1% protein-level FDR and report the number of protein groups"
- "Identify single-peptide hits and flag them for review"
- "Create a protein list with gene names, unique peptides, and coverage"
- "Export protein groups in mzIdentML format"
- "Summarize how many proteins are identified at each evidence level"
- Load peptide-protein mappings from search results
- Build protein groups based on shared peptides
- Apply inference method (parsimony, probabilistic)
- Calculate protein-level FDR
- Filter by unique peptides and FDR threshold
- Generate protein list with evidence summary
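The first two steps can be sketched in plain Python. This is a minimal illustration, not the method of any particular tool: the function name `build_protein_groups`, the peptide sequences, and the accessions are hypothetical, and the input is assumed to be a peptide-to-proteins mapping already parsed from search results.

```python
from collections import defaultdict


def build_protein_groups(peptide_to_proteins):
    """Group proteins whose peptide evidence is identical (indistinguishable)."""
    # Invert the mapping: protein -> set of peptides that map to it.
    protein_to_peptides = defaultdict(set)
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            protein_to_peptides[protein].add(peptide)

    # Proteins sharing exactly the same peptide set land in the same group.
    groups = defaultdict(list)
    for protein, peptides in protein_to_peptides.items():
        groups[frozenset(peptides)].append(protein)

    # Return (members, peptides) pairs in a stable, sorted order.
    return sorted((sorted(members), sorted(peptides))
                  for peptides, members in groups.items())


# Hypothetical example data.
peptide_to_proteins = {
    "PEPTIDEA": ["P1"],
    "PEPTIDEB": ["P1", "P2"],
    "PEPTIDEC": ["P3", "P4"],
    "PEPTIDED": ["P3", "P4"],
}

protein_groups = build_protein_groups(peptide_to_proteins)
for members, peptides in protein_groups:
    print(members, peptides)
```

Here P3 and P4 end up in one group because every peptide that supports one also supports the other.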
| Method | Description |
|---|---|
| Parsimony | Minimum protein set explaining all peptides |
| Occam's Razor | Assign shared peptides to protein with most evidence |
| Probabilistic | Probability-based scoring of each protein (e.g. ProteinProphet, EPIFANY) |
| All peptides | Report all possible proteins (most inclusive) |
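As an illustration of the parsimony row, the sketch below uses a greedy set-cover heuristic. This is an assumption for the example: greedy selection approximates the minimum protein set but does not guarantee it, and real tools may use different strategies. The function name `greedy_parsimony` and the data are hypothetical.

```python
def greedy_parsimony(protein_to_peptides):
    """Pick proteins greedily until every observed peptide is explained."""
    # All peptides that still need an explaining protein.
    remaining = set().union(*protein_to_peptides.values())
    selected = []
    while remaining:
        # Choose the protein covering the most unexplained peptides.
        # Iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(protein_to_peptides),
                   key=lambda p: len(protein_to_peptides[p] & remaining))
        selected.append(best)
        remaining -= protein_to_peptides[best]
    return selected


# Hypothetical example data.
protein_to_peptides = {
    "P1": {"a", "b", "c"},
    "P2": {"b", "c"},
    "P3": {"c", "d"},
    "P4": {"d"},
}

minimal_set = greedy_parsimony(protein_to_peptides)
print(minimal_set)
```

P2 is dropped because P1 already explains all of its peptides. P3 and P4 both explain only peptide `d` at that point, so the alphabetical tie-break picks P3.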
- Unique peptides: Map to only one protein (strongest evidence)
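- Razor peptides: Shared peptides assigned to the single protein (or group) with the most supporting evidence
- Protein groups: Proteins that cannot be distinguished because they are supported by the same set of peptides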
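The unique/razor distinction can be sketched as follows. The rule used here, giving a shared peptide to the candidate protein with the most mapped peptides overall, is one simple interpretation of "most evidence" and is an assumption of this example; the function name `assign_peptides` and the data are hypothetical.

```python
from collections import Counter


def assign_peptides(peptide_to_proteins):
    """Label each peptide 'unique' or 'razor' and pick its protein."""
    # Evidence per protein = number of distinct peptides mapping to it.
    evidence = Counter(protein
                       for proteins in peptide_to_proteins.values()
                       for protein in set(proteins))
    assignment = {}
    for peptide, proteins in peptide_to_proteins.items():
        candidates = sorted(set(proteins))
        if len(candidates) == 1:
            assignment[peptide] = (candidates[0], "unique")
        else:
            # Shared peptide: the protein with the most peptides wins;
            # ties resolve alphabetically because candidates are sorted.
            winner = max(candidates, key=lambda p: evidence[p])
            assignment[peptide] = (winner, "razor")
    return assignment


# Hypothetical example data.
peptide_map = {
    "pepA": ["P1"],
    "pepB": ["P1", "P2"],
    "pepC": ["P2", "P3"],
    "pepD": ["P2"],
}

assignment = assign_peptides(peptide_map)
for peptide, (protein, kind) in assignment.items():
    print(peptide, protein, kind)
```

P2 has three mapped peptides, so it wins both shared peptides, and P3 is left with no assigned evidence.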
- Require >= 2 unique peptides for confident identification
- Protein-level FDR is estimated separately from peptide-level FDR; a protein-level threshold of 1-5% is typical
- Report protein groups, not just lead proteins
- Be cautious with single-peptide identifications
- Document which inference method was used
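The filtering guidance above can be sketched with a target-decoy estimate of protein-level FDR. Target-decoy scoring is an assumption here (the text does not name an FDR method), and the record fields `accession`, `score`, `is_decoy`, and `unique_peptides` are hypothetical names chosen for the example.

```python
def protein_q_values(proteins):
    """Attach a target-decoy q-value to each protein record."""
    ranked = sorted(proteins, key=lambda p: p["score"], reverse=True)
    targets = decoys = 0
    fdrs = []
    for protein in ranked:
        if protein["is_decoy"]:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR at this score cutoff: decoys / targets.
        fdrs.append(decoys / max(targets, 1))

    # q-value = lowest FDR reachable at this rank or any lower-scoring rank.
    q_values = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q_values.append(running_min)
    q_values.reverse()
    return [dict(p, q_value=q) for p, q in zip(ranked, q_values)]


def filter_proteins(scored, max_q=0.01, min_unique=2):
    """Keep target proteins passing both the FDR and unique-peptide cutoffs."""
    return [p["accession"] for p in scored
            if not p["is_decoy"]
            and p["q_value"] <= max_q
            and p["unique_peptides"] >= min_unique]


# Hypothetical example data.
proteins = [
    {"accession": "P1", "score": 10.0, "is_decoy": False, "unique_peptides": 5},
    {"accession": "P2", "score": 9.0, "is_decoy": False, "unique_peptides": 1},
    {"accession": "P3", "score": 8.0, "is_decoy": False, "unique_peptides": 3},
    {"accession": "DECOY_P9", "score": 7.0, "is_decoy": True, "unique_peptides": 2},
    {"accession": "P4", "score": 6.0, "is_decoy": False, "unique_peptides": 2},
]

scored = protein_q_values(proteins)
confident = filter_proteins(scored)
single_peptide_hits = [p["accession"] for p in scored
                       if not p["is_decoy"] and p["unique_peptides"] == 1]
print(confident, single_peptide_hits)
```

P2 passes the FDR cutoff but has only one unique peptide, so it is flagged for review rather than reported as confident. P4 has enough unique peptides but scores below the first decoy and fails the 1% cutoff.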