Description
@andrewrobertjones @timosachsenberg @jgriss :
We have an ongoing project to map our identifications to genome coordinates in Ensembl, and we have been working on it for a while. Internally, PRIDE moved to mzTab a long time ago, so our PSMs are stored in mzTab for every project. We have a tool that reads the mzTab and tries to map the PSMs to reference genomes. However, we would also like to keep that information in the mzTab files, as we did in mzIdentML 1.2. This is really important to us because we want to annotate our datasets.
I checked the current implementation in mzIdentML 1.2, where this information is represented in the PeptideEvidence objects like this:
<PeptideEvidence dBSequence_ref="dbseq_generic|A_ENSP00000471242.1|" peptide_ref="LALWEGR_" start="606" end="612" pre="R" post="S" isDecoy="false" id="LALWEGR_generic|A_ENSP00000471242.1|_606_612">
<userParam name="psm_count" value="1"></userParam>
<cvParam cvRef="PSI-MS" accession="MS:1002640" name="peptide end on chromosome" value="98424581"></cvParam>
<cvParam cvRef="PSI-MS" accession="MS:1002641" name="peptide exon count" value="2"></cvParam>
<cvParam cvRef="PSI-MS" accession="MS:1002642" name="peptide exon nucleotide sizes" value="11,10"></cvParam>
<cvParam cvRef="PSI-MS" accession="MS:1002643" name="peptide start positions on chromosome" value="98412025,98424571"></cvParam>
</PeptideEvidence>
<PeptideEvidence dBSequence_ref="dbseq_generic|A_ENSP00000479861.1|" peptide_ref="GRLYPWGVVEVENPEHNDFLK_" start="290" end="310" pre="R" post="L" isDecoy="false" id="GRLYPWGVVEVENPEHNDFLK_generic|A_ENSP00000479861.1|_290_310">
<userParam name="psm_count" value="2"></userParam>
<cvParam cvRef="PSI-MS" accession="MS:1002640" name="peptide end on chromosome" value="241343880"></cvParam>
<cvParam cvRef="PSI-MS" accession="MS:1002641" name="peptide exon count" value="1"></cvParam>
<cvParam cvRef="PSI-MS" accession="MS:1002642" name="peptide exon nucleotide sizes" value="63"></cvParam>
<cvParam cvRef="PSI-MS" accession="MS:1002643" name="peptide start positions on chromosome" value="241343819"></cvParam>
</PeptideEvidence>
I would like to reuse the cvParam style used in mzIdentML, but mzTab has no PeptideEvidence concept. This annotation would therefore have to be added to the PSM section using optional CV parameter columns. With optional columns, we don't need to change the mzTab schema. The problem is that, because these are PSMs, each one can map to multiple genome coordinates. Suggestions?
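To make the optional-column idea concrete, here is a rough Python sketch that reads one of the PeptideEvidence elements above and turns its cvParams into optional PSM columns. It assumes the mzTab optional column naming pattern `opt_global_cv_{accession}_{name}` with spaces replaced by underscores; the function name `optional_columns` is only illustrative, and the sketch does not address PSMs that map to several genome locations.

```python
import xml.etree.ElementTree as ET

# First PeptideEvidence element from the example above.
SNIPPET = """<PeptideEvidence dBSequence_ref="dbseq_generic|A_ENSP00000471242.1|" peptide_ref="LALWEGR_" start="606" end="612" pre="R" post="S" isDecoy="false" id="LALWEGR_generic|A_ENSP00000471242.1|_606_612">
<userParam name="psm_count" value="1"></userParam>
<cvParam cvRef="PSI-MS" accession="MS:1002640" name="peptide end on chromosome" value="98424581"></cvParam>
<cvParam cvRef="PSI-MS" accession="MS:1002641" name="peptide exon count" value="2"></cvParam>
<cvParam cvRef="PSI-MS" accession="MS:1002642" name="peptide exon nucleotide sizes" value="11,10"></cvParam>
<cvParam cvRef="PSI-MS" accession="MS:1002643" name="peptide start positions on chromosome" value="98412025,98424571"></cvParam>
</PeptideEvidence>"""


def optional_columns(peptide_evidence_xml):
    """Map each cvParam to an (assumed) mzTab optional column header and value."""
    element = ET.fromstring(peptide_evidence_xml)
    columns = {}
    for cv in element.findall("cvParam"):
        # Assumed header pattern: opt_global_cv_{accession}_{name_with_underscores}
        name = cv.get("name").replace(" ", "_")
        header = "opt_global_cv_{}_{}".format(cv.get("accession"), name)
        columns[header] = cv.get("value")
    return columns


columns = optional_columns(SNIPPET)
# Tab-separated header and value cells that could be appended to a PSM row.
print("\t".join(columns.keys()))
print("\t".join(columns.values()))
```

The comma-separated values (exon sizes, start positions) are kept as single cells here, mirroring how they are stored in the cvParam `value` attributes.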