Amino acid masses #17

@lgatto

We have amino acid data.frames in two packages, PTMods and PSMatch:

> PTMods::aminoacids
  OneLetter ThreeLetter      FullName  AvgMass  MonoMass  H C N O S Se
-         -                             0.0000   0.00000  0 0 0 0 0  0
A         A         Ala       Alanine  71.0779  71.03711  5 3 1 1 0  0
R         R         Arg      Arginine 156.1857 156.10111 12 6 4 1 0  0
N         N         Asn    Asparagine 114.1026 114.04293  6 4 2 2 0  0
D         D         Asp Aspartic acid 115.0874 115.02694  5 4 1 3 0  0
C         C         Cys      Cysteine 103.1429 103.00919  5 3 1 1 1  0
E         E         Glu Glutamic acid 129.1140 129.04259  7 5 1 3 0  0
Q         Q         Gln     Glutamine 128.1292 128.05858  8 5 2 2 0  0
G         G         Gly       Glycine  57.0513  57.02146  3 2 1 1 0  0
 [ reached 'max' / getOption("max.print") -- omitted 15 rows ]

and

> getAminoAcids()
   AA ResidueMass Abbrev3 ImmoniumIonMass                Name Hydrophobicity Hydrophilicity SideChainMass  pK1   pK2    pI
1 peg    44.00000    <NA>              NA Polyethylene glycol             NA             NA            NA   NA    NA    NA
2   A    71.03711     Ala        44.05003             Alanine           0.62           -0.5            15 2.35  9.87  6.11
3   R   156.10111     Arg       129.11400            Arginine          -2.53            3.0           101 2.18  9.09 10.76
4   N   114.04293     Asn        87.05584          Asparagine          -0.78            0.2            58 2.18  9.09 10.76
5   D   115.02694     Asp        88.03986       Aspartic acid          -0.90            3.0            59 1.88  9.60  2.98
6   C   103.00919     Cys        76.02210            Cysteine           0.29           -1.0            47 1.71 10.78  5.02
7   E   129.04259     Glu       102.05550       Glutamic acid          -0.74            3.0            73 2.19  9.67  3.08
8   Q   128.05858     Gln       101.07150           Glutamine          -0.85            0.2            72 2.17  9.13  5.65
9   G    57.02146     Gly        30.03438             Glycine           0.48            0.0             1 2.34  9.60  6.06
 [ reached 'max' / getOption("max.print") -- omitted 14 rows ]

with obvious differences, such as the columns and specific rows/entries:

> aminoacids[grep("term", rownames(aminoacids)), ]
       OneLetter ThreeLetter FullName AvgMass  MonoMass H C N O S Se
N-term    N-term      N-term   N-term  1.0079  1.007825 1 0 0 0 0  0
C-term    C-term      C-term   C-term 17.0073 17.002740 1 1 0 0 0  0
> getAminoAcids()[1, ]
   AA ResidueMass Abbrev3 ImmoniumIonMass                Name Hydrophobicity Hydrophilicity SideChainMass pK1 pK2 pI
1 peg          44    <NA>              NA Polyethylene glycol             NA             NA            NA  NA  NA NA

But the masses also differ:

> (x <- inner_join(aminoacids[, c(1, 4, 5)], PSMatch::getAminoAcids()[-1, c("AA", "ResidueMass")],
                   by = c("OneLetter" = "AA")))
   OneLetter  AvgMass  MonoMass ResidueMass
1          A  71.0779  71.03711    71.03711
2          R 156.1857 156.10111   156.10111
3          N 114.1026 114.04293   114.04293
4          D 115.0874 115.02694   115.02694
5          C 103.1429 103.00919   103.00919
6          E 129.1140 129.04259   129.04259
7          Q 128.1292 128.05858   128.05858
8          G  57.0513  57.02146    57.02146
9          H 137.1393 137.05891   137.05891
10         I 113.1576 113.08406   113.08406
11         L 113.1576 113.08406   113.08406
12         K 128.1723 128.09496   128.09496
13         M 131.1961 131.04048   131.04049
14         F 147.1739 147.06841   147.06841
15         P  97.1152  97.05276    97.05276
16         S  87.0773  87.03203    87.03203
17         T 101.1039 101.04768   101.04768
18         W 186.2099 186.07931   186.07931
19         Y 163.1733 163.06333   163.06333
20         V  99.1311  99.06841    99.06841
21         U 150.0379 150.95363   149.03000
> x$MonoMass - x$ResidueMass
 [1]  0.000004  0.000001 -0.000003  0.000003 -0.000005  0.000003 -0.000002  0.000004  0.000002  0.000004  0.000004  0.000003 -0.000005
[14]  0.000004  0.000004 -0.000002 -0.000001  0.000003 -0.000001  0.000004  1.923633
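To separate rounding noise from real disagreements, the difference can be compared against a tolerance. This is a minimal sketch in base R on three rows copied from the joined table above; the tolerance `tol` is an assumption, chosen because the printed masses are rounded to five decimals.

```r
# Three rows copied from the joined table above (not the full table).
x <- data.frame(
  OneLetter   = c("A", "M", "U"),
  MonoMass    = c(71.03711, 131.04048, 150.95363),
  ResidueMass = c(71.03711, 131.04049, 149.03000)
)

# Assumed tolerance: anything below this is treated as rounding noise.
tol <- 1e-4

# Keep only the rows whose two monoisotopic masses disagree beyond tol.
x$diff <- x$MonoMass - x$ResidueMass
flagged <- x[abs(x$diff) > tol, ]
print(flagged)
```

On these rows only U is flagged; M differs in the last printed digit and stays below the tolerance.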

Questions

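  1. Selenocysteine (U) is quite far off: MonoMass is 150.95363 in PTMods but ResidueMass is 149.03000 in PSMatch, a difference of about 1.92. Ping @guideflandre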
  2. Should we harmonise/join these? Except for point 1 above, I don't think it's necessary at this point. Ping @sgibb
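For question 1, one way to check which value is plausible is to recompute the selenocysteine residue mass from element masses. The sketch below uses the monoisotopic masses printed by `PTMods::elements` in this issue. The residue composition C3H5NOSe is an assumption: the U row is not shown in the truncated output above.

```r
# Monoisotopic element masses as printed by PTMods::elements in this issue.
mono <- c(H = 1.007825, C = 12.000000, N = 14.003074,
          O = 15.994915, Se = 79.916520)

# Assumed selenocysteine residue composition (C3H5NOSe); not read from either package.
sec <- c(H = 5, C = 3, N = 1, O = 1, Se = 1)

# Residue mass = sum over elements of (atom count * monoisotopic mass).
sec_mono <- sum(mono[names(sec)] * sec)
round(sec_mono, 5)
```

Under these assumptions the result agrees with the PTMods MonoMass (150.95363) to five decimals. This calculation does not reproduce the 149.03000 in `getAminoAcids()`.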

There's also

> PTMods::elements
    Name    FullName    AvgMass   MonoMass
H      H    Hydrogen   1.007940   1.007825
2H    2H   Deuterium   2.014102   2.014102
Li    Li     Lithium   6.941000   7.016003
C      C      Carbon  12.010700  12.000000
13C  13C    Carbon13  13.003355  13.003355
N      N    Nitrogen  14.006700  14.003074
15N  15N  Nitrogen15  15.000109  15.000109
O      O      Oxygen  15.999400  15.994915
18O  18O    Oxygen18  17.999160  17.999160
F      F    Fluorine  18.998403  18.998403
Na    Na      Sodium  22.989770  22.989768
P      P Phosphorous  30.973761  30.973762
S      S      Sulfur  32.065000  31.972071
Cl    Cl    Chlorine  35.453000  34.968853
K      K   Potassium  39.098300  38.963707
Ca    Ca     Calcium  40.078000  39.962591
Fe    Fe        Iron  55.845000  55.934939
Ni    Ni      Nickel  58.693400  57.935346
Zn    Zn        Zinc  65.409000  63.929145
Se    Se    Selenium  78.960000  79.916520
Br    Br     Bromine  79.904000  78.918336
Ag    Ag      Silver 107.868200 106.905092
Hg    Hg     Mercury 200.590000 201.970617
Au    Au        Gold 196.966550 196.966543
I      I      Iodine 126.904470 126.904473
 [ reached 'max' / getOption("max.print") -- omitted 15 rows ]
> PSMatch::getAtomicMass()
        H         C         N         O         p 
 1.007825 12.000000 14.003074 15.994915  1.007276 

and

> PSMatch::getAtomicMass()[1:4] - elements[c("H", "C", "N", "O"), "MonoMass"]
       H        C        N        O 
-3.5e-08  0.0e+00  0.0e+00  3.7e-07 
