Identifies beta turns in proteins according to Shapovalov, Vucetic, and Dunbrack, PLOSCompBio 2019. https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006844
This is a fresh and complete rewrite in python3 that replaces the python2 code we published in 2019.
Requires mkdssp from DSSP4.5 : https://github.com/PDB-REDO/dssp. It uses the command line version of DSSP4.5 not the python module. The way to install it in /usr/local/bin/mkdssp is:
git clone https://github.com/PDB-REDO/dssp.git
cd dssp
cmake -S . -B build
cmake --build build
cmake --install build
If you install mkdssp somewhere else, change the code in Identify_18_BetaTurn_Types.py here:
def run_dssp(input_pdb_or_cif, dssp_executable="/usr/local/bin/mkdssp"):
To make mkdssp work on AlphaFold files, you might need to install these files manually (including making the directory /var/cache/libcifpp/):
curl -o /var/cache/libcifpp/components.cif https://files.wwpdb.org/pub/pdb/data/monomers/components.cif
curl -o /var/cache/libcifpp/mmcif_pdbx.dic https://mmcif.wwpdb.org/dictionaries/ascii/mmcif_pdbx_v50.dic
curl -o /var/cache/libcifpp/mmcif_ma.dic https://raw.githubusercontent.com/ihmwg/ModelCIF/master/dist/mmcif_ma.dic
Then install Identify_18_BetaTurn_Types by downloading the git zip file. Put it anywhere on your computer (let's call it "path_to_script/". The code is all in one file and does not require any paths or files other than the path the python script itself.
python3 path_to_script/Identify_18_BetaTurn_Types.py filename.cif > outfilename
python3 path_to_script/Identify_18_BetaTurn_Types.py filename.pdb > outfilename
python3 path_to_script/Identify_18_BetaTurn_Types.py 3e5a.cif
turn num chn res1 res4 seq dssp type prev_name Dist DistAng CA1-CA4 omega2 phi2 psi2 omega3 phi3 psi3 omega4 filename
turn 1 A 129 132 ALED CGGG AD I 0.1491 22.26 5.59 177.38 -49.71 -46.55 176.06 -46.67 -13.00 178.89 3e5a
turn 2 A 130 133 LEDF GGGE AD I 0.0383 11.23 5.71 176.06 -46.67 -13.00 178.89 -82.81 -9.11 -177.51 3e5a
turn 3 A 141 144 KGKF SGGG pD II' 0.0670 14.88 5.57 170.63 54.60 -132.45 179.20 -57.92 -15.22 176.87 3e5a
turn 4 A 142 145 GKFG GGGT AD I 0.0440 12.05 5.23 179.20 -57.92 -15.22 176.87 -103.69 10.00 -173.87 3e5a
turn 5 A 144 147 FGNV GTTE AD I 0.0549 13.45 5.92 -173.87 -62.87 0.73 -156.65 -102.92 -12.84 -164.28 3e5a
turn 6 A 152 155 EKQS ETTT AD I 0.0470 12.44 5.82 -166.98 -58.12 -32.18 -171.95 -91.98 -46.88 179.84 3e5a
turn 7 A 153 156 KQSK TTTC AD I 0.0733 15.56 5.38 -171.95 -91.98 -46.88 179.84 -79.56 -23.87 -173.57 3e5a
turn 8 A 190 193 HPNI CTTB AD I 0.1066 18.79 5.26 176.32 -51.59 -36.82 -178.19 -105.12 27.26 176.33 3e5a
turn 9 A 202 205 DATR CSSE AD I 0.0405 11.56 5.34 -175.29 -53.82 -31.56 -170.02 -123.16 -24.81 -171.69 3e5a
turn 10 A 213 216 APLG CTTC AD I 0.0317 10.21 6.16 -168.14 -69.94 -12.27 -174.51 -98.76 3.43 172.87 3e5a
turn 11 A 248 251 HSKR HTTT AD I 0.0302 9.97 4.79 172.32 -61.66 -10.94 -176.61 -109.75 -2.38 -179.14 3e5a
turn 12 A 254 257 HRDI CCCC dD new 0.1841 24.77 6.82 166.89 78.66 11.03 171.83 -151.58 44.81 176.55 3e5a
turn 13 A 258 261 KPEN SGGG AD I 0.0531 13.24 5.58 -168.83 -51.82 -34.16 179.70 -71.36 -29.86 178.93 3e5a
turn 14 A 259 262 PENL GGGE AD I 0.0570 13.71 5.39 179.70 -71.36 -29.86 178.93 -80.98 11.92 -179.12 3e5a
turn 15 A 265 268 GSAG CTTS AD I 0.1287 20.67 5.11 176.88 -32.72 -55.49 -172.09 -90.42 9.97 -177.37 3e5a
turn 16 A 275 278 FGWS CTTC AD I 0.0393 11.38 4.99 179.12 -57.36 -21.14 174.95 -116.79 2.26 -176.09 3e5a
turn 17 A 281 284 APSS CSSS AD I 0.1712 23.88 6.09 -170.22 -78.63 -32.65 -179.43 -132.21 -70.39 -178.60 3e5a
turn 18 A 292 295 TLDY CGGG AD I 0.1458 22.02 5.40 -175.45 -28.17 -59.24 -168.10 -74.97 -7.03 -173.65 3e5a
turn 19 A 293 296 LDYL GGGC AD I 0.0386 11.27 5.61 -168.10 -74.97 -7.03 -173.65 -113.31 -2.33 -177.08 3e5a
turn 20 A 307 310 DEKV CTTH AD I 0.0469 12.44 6.42 -179.02 -61.71 -14.04 -178.29 -69.54 -14.54 170.42 3e5a
turn 21 A 327 330 PPFE CTTC AZ new_prev_VIII 0.0788 16.14 6.03 -171.03 -72.30 -19.14 -174.07 -114.37 23.77 -178.54 3e5a
turn 22 A 331 334 ANTY CSSH AB1 new_prev_VIII 0.0414 11.68 6.77 177.27 -69.86 -46.55 -176.60 -112.91 147.79 173.15 3e5a
turn 23 A 349 352 PDFV CTTS AD I 0.0482 12.60 5.78 -175.41 -52.09 -33.03 -177.37 -69.67 -15.52 -175.53 3e5a
turn 24 A 365 368 KHNP CSSG AB2 VIII 0.0704 15.25 6.35 177.12 -55.50 -44.16 178.27 -84.08 115.85 -176.81 3e5a
turn 25 A 367 370 NPSQ SGGG AD I 0.0815 16.41 5.35 -176.81 -62.49 -34.82 175.12 -57.76 -26.21 -179.37 3e5a
turn 26 A 368 371 PSQR GGGS AD I 0.0231 8.71 5.33 175.12 -57.76 -26.21 -179.37 -83.35 -8.55 -172.47 3e5a
turn 27 B 18 21 NFSS CTTC AG new_prev_VIII 0.2072 26.31 6.56 -170.89 -73.24 -26.65 179.65 -83.56 32.22 126.36 3e5a
The output gives
- the residues of each beta turn (res1-res4)
- the sequence of the 4-residue turn
- the dssp assignment of the turn ("C" is coil when DSSP does not report a secondary structure letter)
- the new turn type ("type", e.g. "AD", "Pa", "Pd", etc.)
- the classical turn type ("prev_name", e.g. "I", "II"; "new_prev_VIII" indicates it is a new turn type but would formerly have been close to a type VIII turn).
- the distance in our metric ("Dist"), which the average of D=2(1-cos(d_theta)), where theta are the angles given on each line: omega2, phi2, psi2, omega3, phi3, psi3, omega4, which connect CA of the first residue to CA of the 4th residue of each turn. d_theta is the difference between the PDB dihedral angle and the medoid for that turn type, determined by the clustering described in Shapovalov et al. F
- "DistAng" is the distance in degrees, which is just the average angle distance converted back into an angle in degrees (theta = arccos(1 - D/2)).
- "CA1-CA4" distance is given next followed by all the dihedral angles
- Dihedral angles: omega2, phi2, psi2, omega3, phi3, psi3, omega4
- Filename (minus ".cif" or ".pdb")
The code also saves the mmCIF file produced by mkdssp and is named (in this example) 3e5a_dssp.cif.
mkdssp might fail on some PDB and mmCIF files.