Update tutorial.md #5900

Open
wants to merge 21 commits into base: main

Commits (21)
178d52b Update tutorial.md (Swathi266, Mar 25, 2025)
34c8e91 Update tutorial.md (Swathi266, Mar 25, 2025)
0ed9b88 Update tutorial.md (Swathi266, Mar 26, 2025)
90a77b9 Update tutorial.md (Swathi266, Mar 26, 2025)
0336d9b Update tutorial.md (Swathi266, Mar 26, 2025)
d63a1f7 Update tutorial.md (Swathi266, Mar 26, 2025)
847166d Update tutorial.md (Swathi266, Mar 26, 2025)
7204014 Update tutorial.md (Swathi266, Mar 28, 2025)
6584a37 Update topics/computational-chemistry/tutorials/cheminformatics/tutor… (Swathi266, Mar 29, 2025)
bcd74e5 Update topics/computational-chemistry/tutorials/cheminformatics/tutor… (Swathi266, Mar 29, 2025)
57b042b Update topics/computational-chemistry/tutorials/cheminformatics/tutor… (Swathi266, Apr 4, 2025)
966367c Update topics/computational-chemistry/tutorials/cheminformatics/tutor… (Swathi266, Apr 4, 2025)
f21e311 Update topics/computational-chemistry/tutorials/cheminformatics/tutor… (Swathi266, Apr 4, 2025)
51369de Update topics/computational-chemistry/tutorials/cheminformatics/tutor… (Swathi266, Apr 4, 2025)
15c60cc Update topics/computational-chemistry/tutorials/cheminformatics/tutor… (Swathi266, Apr 4, 2025)
c46d431 Update topics/computational-chemistry/tutorials/cheminformatics/tutor… (Swathi266, Apr 4, 2025)
8a94925 Update topics/computational-chemistry/tutorials/cheminformatics/tutor… (Swathi266, Apr 4, 2025)
4d32f08 Update topics/computational-chemistry/tutorials/cheminformatics/tutor… (Swathi266, Apr 4, 2025)
83acb85 Update topics/computational-chemistry/tutorials/cheminformatics/tutor… (Swathi266, Apr 4, 2025)
47103d6 Update topics/computational-chemistry/tutorials/cheminformatics/tutor… (Swathi266, Apr 4, 2025)
ed6d717 Merge branch 'main' into patch-10 (shiltemann, Apr 9, 2025)

@@ -82,23 +82,23 @@ You can view the contents of the downloaded PDB file by pressing the 'View data'

> <hands-on-title>Separate protein and ligand</hands-on-title>
>
> 1. {% tool [Search in textfiles (grep)](toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1) %} with the following parameters:
> 1. {% tool [Search in textfiles (grep)](toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/9.5+galaxy0) %} with the following parameters:
> - {% icon param-file %} *"Select lines from"*: Downloaded PDB file 'Hsp90 structure'
> - {% icon param-file %} *"that"*: `Don't match`
> - {% icon param-file %} *"Regular Expression"*: `HETATM`
> - All other parameters can be left as their defaults.
> - Rename the dataset **'Protein (PDB)'**.
>
> The result is a file with all non-protein (`HETATM`) atoms removed.
> 2. {% tool [Search in textfiles (grep)](toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1) %} with the following parameters. Here, we use grep again to produce a file with only non-protein atoms.
> 2. {% tool [Search in textfiles (grep)](toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/9.5+galaxy0) %} with the following parameters. Here, we use grep again to produce a file with only non-protein atoms.
> - {% icon param-file %} *"Select lines from"*: Downloaded PDB file 'Hsp90 structure'
> - {% icon param-file %} *"that"*: `Match`
> - {% icon param-file %} *"Regular Expression"*: `CT5` (the name of the ligand in the PDB file)
> - All other parameters can be left as their defaults.
> - Rename the dataset **'Ligand (PDB)'**.
>
> This produces a file which only contains ligand atoms.
> 3. {% tool [Compound conversion](toolshed.g2.bx.psu.edu/repos/bgruening/openbabel_compound_convert/openbabel_compound_convert/3.1.1+galaxy0) %} with the following parameters:
> 3. {% tool [Compound conversion](toolshed.g2.bx.psu.edu/repos/bgruening/openbabel_compound_convert/openbabel_compound_convert/3.1.1+galaxy1) %} - interconvert between various chemistry and molecular modeling data files with the following parameters:
> - {% icon param-file %} *"Molecular input file"*: Ligand PDB file created in step 2.
> - {% icon param-file %} *"Output format"*: `MDL MOL format (sdf, mol)`
> - {% icon param-file %} *"Add hydrogens appropriate for pH"*: `7.4`
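
The two grep steps in this hunk can be pictured outside Galaxy with a minimal Python sketch. It assumes the same logic the tutorial describes: lines that do not match `HETATM` form the protein file, and lines that match `CT5` form the ligand file. The function name `split_pdb` and the shortened example records are hypothetical and only serve as an illustration; this is not the code the Galaxy tool runs.

```python
import re


def split_pdb(lines, ligand_pattern="CT5"):
    """Mimic the two grep steps on a list of PDB lines.

    Protein: every line that does NOT match 'HETATM'.
    Ligand:  every line that matches the ligand pattern.
    """
    hetatm = re.compile("HETATM")
    ligand = re.compile(ligand_pattern)
    protein_lines = [line for line in lines if not hetatm.search(line)]
    ligand_lines = [line for line in lines if ligand.search(line)]
    return protein_lines, ligand_lines


# Hypothetical, heavily shortened PDB-like records (illustration only).
example = [
    "ATOM      1  N   GLU A  16       5.000   6.000   7.000",
    "HETATM 1700  C1  CT5 A1224       1.000   2.000   3.000",
    "HETATM 1800  O   HOH A2001       9.000   9.000   9.000",
]

protein_lines, ligand_lines = split_pdb(example)
print(len(protein_lines), "protein line(s),", len(ligand_lines), "ligand line(s)")
```

Note that the water record (`HOH`) ends up in neither file, which matches the behaviour of the two separate grep filters.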
@@ -128,7 +128,7 @@ We will generate our compound library by searching ChEMBL for compounds which ha

> <hands-on-title>Generate compound library</hands-on-title>
>
> 1. {% tool [Compound conversion](toolshed.g2.bx.psu.edu/repos/bgruening/openbabel_compound_convert/openbabel_compound_convert/3.1.1+galaxy0) %} with the following parameters:
> 1. {% tool [Compound conversion](toolshed.g2.bx.psu.edu/repos/bgruening/openbabel_compound_convert/openbabel_compound_convert/3.1.1+galaxy1) %} - interconvert between various chemistry and molecular modeling data files with the following parameters:
> - {% icon param-file %} *"Molecular input file"*: 'Ligand' PDB file
> - {% icon param-file %} *"Output format"*: `SMILES format (SMI)`
> - Leave all other options as default.
@@ -246,14 +246,15 @@ Further, docking requires the coordinates of a binding site to be defined. Effec
>
> 1. {% tool [Prepare receptor](toolshed.g2.bx.psu.edu/repos/bgruening/autodock_vina_prepare_receptor/prepare_receptor/1.5.7+galaxy0) %} with the following parameters:
> - {% icon param-file %} *"Select a PDB file"*: 'Protein' PDB file.
> 2. {% tool [Compound conversion](toolshed.g2.bx.psu.edu/repos/bgruening/openbabel_compound_convert/openbabel_compound_convert/3.1.1+galaxy0) %} with the following parameters:
> - Rename to 'Prepared receptor'
> 2. {% tool [Compound conversion](toolshed.g2.bx.psu.edu/repos/bgruening/openbabel_compound_convert/openbabel_compound_convert/3.1.1+galaxy1) %} - interconvert between various chemistry and molecular modeling data files with the following parameters:
> - {% icon param-file %} *"Molecular input file"*: 'Compound library' file.
> - {% icon param-file %} *"Output format"*: `SDF`
> - {% icon param-file %} *"Output format"*: `MDL MOL format (sdf,mol)`
> - {% icon param-file %} *"Generate 3D coordinates"*: `Yes`
> - {% icon param-file %} *"Add hydrogens appropriate for pH"*: `7.4`
> - Leave all other options unchanged.
> - Rename to 'Prepared ligands'
> 3. {% tool [Calculate the box parameters for an AutoDock Vina job](toolshed.g2.bx.psu.edu/repos/bgruening/autodock_vina_prepare_box/prepare_box/2021.03.4+galaxy0) %} with the following parameters:
> 3. {% tool [Calculate the box parameters using RDKit](toolshed.g2.bx.psu.edu/repos/bgruening/autodock_vina_prepare_box/prepare_box/2021.03.5+galaxy0) %} for an AutoDock Vina job from a ligand or pocket input file (confounding box) with the following parameters:
> - {% icon param-file %} *"Input ligand or pocket"*: `Ligand (MOL)` file.
> - {% icon param-file %} *"x-axis buffer"*: `5`
> - {% icon param-file %} *"y-axis buffer"*: `5`
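
As a rough illustration of what the box-parameter step computes, the sketch below derives a box center and size from ligand coordinates. It rests on several assumptions: the buffer is added once to each axis extent (the Galaxy tool may apply it differently), a z-axis buffer of `5` is used although only the x and y buffers appear in this hunk, and the coordinates are made up. The function names are hypothetical; only the key names `center_x` ... `size_z` come from the AutoDock Vina configuration file format.

```python
def box_parameters(coords, buffers=(5.0, 5.0, 5.0)):
    """Return (center, size) of an axis-aligned box around the coordinates.

    Assumption: each buffer is added once to the extent of its axis.
    """
    center, size = [], []
    for axis, buf in enumerate(buffers):
        values = [atom[axis] for atom in coords]
        lo, hi = min(values), max(values)
        center.append((lo + hi) / 2)
        size.append((hi - lo) + buf)
    return tuple(center), tuple(size)


def as_vina_config(center, size):
    """Format the box as 'key = value' lines using Vina's config key names."""
    lines = []
    for prefix, values in (("center", center), ("size", size)):
        for axis, value in zip("xyz", values):
            lines.append(f"{prefix}_{axis} = {value}")
    return "\n".join(lines)


# Made-up ligand atom coordinates (x, y, z), for illustration only.
coords = [(0.0, 0.0, 0.0), (4.0, 2.0, 6.0), (2.0, 1.0, 3.0)]
center, size = box_parameters(coords)
config = as_vina_config(center, size)
print(config)
```

A larger buffer widens the search space around the known ligand position, at the cost of a longer docking run.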
@@ -298,8 +299,8 @@ Now that the protein and the ligand library have been correctly prepared and for

> <hands-on-title>Perform docking</hands-on-title>
>
> 1. {% tool [Docking](toolshed.g2.bx.psu.edu/repos/bgruening/autodock_vina/docking/1.1.2+galaxy0) %} with the following parameters:
> - {% icon param-file %} *"Receptor"*: 'Protein PDBQT' file.
> 1. {% tool [VINA Docking](toolshed.g2.bx.psu.edu/repos/bgruening/autodock_vina/docking/1.2.3+galaxy0) %} tool to perform protein-ligand docking with Autodock Vina with the following parameters:
> - {% icon param-file %} *"Receptor"*: 'Protein receptor' file.

Member

Suggested change
> - {% icon param-file %} *"Receptor"*: 'Protein receptor' file.
> - {% icon param-file %} *"Receptor"*: 'Prepared receptor' file.

To be consistent with the change above?

Contributor Author

Oh yes, my mistake. I meant the same: 'Prepared receptor'.
Thank you.
Do we need to change the docking version from Galaxy version 1.1.2+galaxy0 to Galaxy version 1.2.3+galaxy0?

Member

Good idea, please go ahead.

> - {% icon param-file %} *"Ligands"*: 'Prepared ligands' file.
> - {% icon param-file %} *"Specify pH value for ligand protonation"*: `7.4`
> - {% icon param-file %} *"Specify parameters"*: 'Upload a config file to specify parameters'
@@ -315,13 +316,13 @@ The ChemicalToolbox contains a large number of cheminformatics tools. This secti

(This section can also be completed while waiting for the docking, which can take some time to complete.)

### Visualization
## Visualization

It can be useful to visualize the compounds generated. There is a tool available for this in Galaxy based on OpenBabel.

Member

If you change `### Visualization` above to `## Visualization` (one fewer `#`), it should fix the linting error :)

Contributor Author

OK, let me try. Thank you so much @shiltemann

Contributor Author

I could. Thank you again @shiltemann

> <hands-on-title>Visualization of chemical structures</hands-on-title>
>
> 1. {% tool [Visualisation](toolshed.g2.bx.psu.edu/repos/bgruening/openbabel_svg_depiction/openbabel_svg_depiction/3.1.1+galaxy0) %} with the following parameters:
> 1. {% tool [Visualisation](toolshed.g2.bx.psu.edu/repos/bgruening/openbabel_svg_depiction/openbabel_svg_depiction/3.1.1+galaxy1) %} with the following parameters:
> - {% icon param-file %} *"Molecular input file"*: Compound library
> - {% icon param-file %} *"Embed molecule as CML"*: `No`
> - {% icon param-file %} *"Draw all carbon atoms"*: `No`
@@ -343,15 +344,15 @@ In this step, we will group similar molecules together. A key tool in cheminform
Before clustering, let's label each compound. To do so add a second column to the SMILES compound library containing a label for each molecule. The ```Ligand SMILES``` file is also labelled something like ```/data/dnb02/galaxy_db/files/010/406/dataset_10406067.dat``` (the exact name will vary) and we would like to give it a more useful name. When labelling is complete, we can concatenate (join together) the library file with the original SMILES file for the ligand from the PDB file.

> <hands-on-title>Calculate molecular fingerprints</hands-on-title>
> 1. {% tool [Replace](toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/1.1.3) %} with the following parameters:
> 1. {% tool [Replace](toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/9.5+galaxy0) %} with the following parameters:
> - {% icon param-file %} *"File to process"*: `Ligand SMILES`.
> - {% icon param-file %} *"Find pattern"*: add the current label of the SMILES here. You can find it by clicking the 'view' button next to the `Ligand SMILES` dataset - it will look something like `/data/dnb02/galaxy_db/files/010/406/dataset_10406067.dat`.
> - {% icon param-file %} *"Replace with"*: `ligand`
> 2. {% tool [Concatenate datasets](toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cat/0.1.1) %} with the following parameters:
> 2. {% tool [Concatenate datasets](toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cat/1.0.0) %} tail-to-head with the following parameters:
> - {% icon param-file %} *"Datasets to concatenate"*: Output of the previous step.
> - Click on **Insert Dataset** and in the new selection box which appears, select 'Compound library'.
> - Run the step and rename the output dataset 'Labelled compound library'.
> 3. {% tool [Molecule to fingerprint](toolshed.g2.bx.psu.edu/repos/bgruening/chemfp/ctb_chemfp_mol2fps/1.5) %} with the following parameters:
> 3. {% tool [Molecule to fingerprint](toolshed.g2.bx.psu.edu/repos/bgruening/chemfp/ctb_chemfp_mol2fps/1.5) %} conversion to several different fingerprint formats with the following parameters:
> - {% icon param-file %} *"Molecule file"*: 'Labelled compound library' file.
> - {% icon param-file %} *"Type of fingerprint"*: `Open Babel FP2 fingerprints`
> - Rename to 'Fingerprints'.
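
The labelling and concatenation steps in this hunk amount to a text substitution followed by joining two files. A minimal Python sketch follows, under these assumptions: each SMILES line has the form `SMILES<TAB>label`, the replacement is a literal string substitution (the Galaxy tool may treat the pattern as a regular expression), and all paths, labels and molecules shown are hypothetical.

```python
def relabel(smiles_lines, old_label, new_label="ligand"):
    """Replace the auto-generated label with a readable one (literal match)."""
    return [line.replace(old_label, new_label) for line in smiles_lines]


def concatenate(*datasets):
    """Join datasets one after another, in the order given."""
    combined = []
    for dataset in datasets:
        combined.extend(dataset)
    return combined


# Hypothetical inputs: one ligand line with a path-like label, two library lines.
ligand_smiles = ["c1ccccc1\t/hypothetical/path/dataset_1.dat"]
compound_library = ["CCO\tEXAMPLE_A", "CCN\tEXAMPLE_B"]

labelled = relabel(ligand_smiles, "/hypothetical/path/dataset_1.dat")
labelled_library = concatenate(labelled, compound_library)
print("\n".join(labelled_library))
```

Putting the relabelled ligand first mirrors the order in the hands-on box, where the output of the Replace step is selected before the 'Compound library' dataset.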
@@ -362,10 +363,10 @@ Taylor-Butina clustering ({% cite Butina1999 %}) provides a classification of t
![Image showing a Fingerprinting System]({% link topics/computational-chemistry/images/fingerprints.png %} "A simple fingerprinting system. Each 1 or 0 in the bitstring corresponds to the presence or absence of a particular feature in the molecule. In this case, the presence of phenyl, amine and carboxylic acid groups are encoded.")

> <hands-on-title>Cluster molecules using molecular fingerprints</hands-on-title>
> 1. {% tool [Taylor-Butina clustering](toolshed.g2.bx.psu.edu/repos/bgruening/chemfp/ctb_chemfp_butina_clustering/1.5) %} with the following parameters:
> 1. {% tool [Taylor-Butina clustering](toolshed.g2.bx.psu.edu/repos/bgruening/chemfp/ctb_chemfp_butina_clustering/1.5) %} of molecular fingerprints with the following parameters:
> - {% icon param-file %} *"Fingerprint dataset"*: 'Fingerprints' file.
> - {% icon param-file %} *"threshold"*: `0.8`
> 2. {% tool [NxN clustering](toolshed.g2.bx.psu.edu/repos/bgruening/chemfp/ctb_chemfp_nxn_clustering/1.5.1) %} with the following parameters:
> 2. {% tool [NxN clustering](toolshed.g2.bx.psu.edu/repos/bgruening/chemfp/ctb_chemfp_nxn_clustering/1.5.1) %} of molecular fingerprints with the following parameters:
> - {% icon param-file %} *"Fingerprint dataset"*: 'Fingerprints' file.
> - {% icon param-file %} *"threshold"*: `0.0`
> - {% icon param-file %} *"Format of the resulting picture"*: `SVG`
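
To make the clustering step more concrete, here is a small pure-Python sketch of Tanimoto similarity on bit fingerprints and a greedy Taylor-Butina-style clustering with the `0.8` threshold used above. It is an illustration under stated assumptions, not the chemfp-based implementation behind the Galaxy tool: fingerprints are represented as sets of on-bit positions, ties are broken alphabetically, and the example fingerprints are invented.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit positions."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def butina(fingerprints, threshold=0.8):
    """Greedy Taylor-Butina-style clustering.

    Returns a list of clusters; the first label in each cluster is its centroid.
    """
    labels = list(fingerprints)
    # Neighbour list: every other molecule at or above the similarity threshold.
    neighbours = {
        a: {
            b for b in labels
            if b != a and tanimoto(fingerprints[a], fingerprints[b]) >= threshold
        }
        for a in labels
    }
    unassigned = set(labels)
    clusters = []
    # Visit molecules with the most neighbours first (ties: alphabetical order).
    for label in sorted(labels, key=lambda name: (-len(neighbours[name]), name)):
        if label not in unassigned:
            continue
        members = [label] + sorted(neighbours[label] & unassigned)
        unassigned -= set(members)
        clusters.append(members)
    return clusters


# Invented fingerprints, for illustration only.
fingerprints = {
    "ligand": set(range(10)),
    "mol_a": set(range(11)),
    "mol_b": set(range(9)),
    "mol_c": {20, 21, 22},
}
clusters = butina(fingerprints)
print(clusters)
```

Molecules with no neighbour above the threshold, such as `mol_c` here, end up as singleton clusters. Lowering the threshold to `0.0`, as in the NxN step, keeps every pairwise similarity.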
@@ -390,20 +391,20 @@ From our collection of SD-files, we first extract all stored values into tabular

> <hands-on-title>Process SD-files</hands-on-title>
>
> 1. {% tool [Extract values from an SD-file](toolshed.g2.bx.psu.edu/repos/bgruening/sdf_to_tab/sdf_to_tab/2020.03.4+galaxy0) %} with the following parameters:
> 1. {% tool [Extract values from an SD-file](toolshed.g2.bx.psu.edu/repos/bgruening/sdf_to_tab/sdf_to_tab/2020.03.4+galaxy0) %} into a tabular file using RDKit with the following parameters:
> - {% icon param-file %} *"Input SD-file"*: Collection of SD-files generated by the docking step. (Remember to select the 'collection' icon!)
> - {% icon param-file %} *"Include the property name as header"*: `Yes`
> - {% icon param-file %} *"Include SMILES as column in output"*: `Yes`
> - {% icon param-file %} *"Include molecule name as column in output"*: `Yes`
> - Leave all other parameters unchanged.
> 2. {% tool [Collapse Collection](toolshed.g2.bx.psu.edu/repos/nml/collapse_collections/collapse_dataset/4.2) %} with the following parameters:
> 2. {% tool [Collapse Collection](toolshed.g2.bx.psu.edu/repos/nml/collapse_collections/collapse_dataset/5.1.0) %} into single dataset in order of the collection with the following parameters:
> - {% icon param-file %} *"Collection of files to collapse into single dataset"*: Collection of tabular files generated by the previous step.
> - {% icon param-file %} *"Keep one header line"*: `Yes`
> - {% icon param-file %} *"Append File name"*: `No`
> - {% icon param-file %} *"Prepend File name"*: `No`
>
> {% snippet faqs/galaxy/tools_select_collection.md datatype="datatypes" %}
>
> 3. {% tool [Compound conversion](toolshed.g2.bx.psu.edu/repos/bgruening/openbabel_compound_convert/openbabel_compound_convert/3.1.1+galaxy0) %} with the following parameters:
> 3. {% tool [Compound conversion](toolshed.g2.bx.psu.edu/repos/bgruening/openbabel_compound_convert/openbabel_compound_convert/3.1.1+galaxy1) %} - interconvert between various chemistry and molecular modeling data files with the following parameters:
> - {% icon param-file %} *"Molecular input file"*: choose one of the SD-files from the collection generated by the docking step.
> - {% icon param-file %} *"Output format"*: `Protein Data Bank format (pdb)`
> - {% icon param-file %} *"Split multi-molecule files into a collection"*: `Yes`
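
The 'Collapse Collection' step above, with *"Keep one header line"* set to `Yes`, can be sketched in a few lines of Python. The sketch assumes every tabular file starts with the same header row. The column names and values are hypothetical, since the diff does not show the real output of the extraction step.

```python
def collapse(tables, keep_one_header=True):
    """Merge tabular files (lists of rows) in collection order.

    With keep_one_header=True, the header row is kept only from the first file.
    """
    merged = []
    for index, rows in enumerate(tables):
        if keep_one_header and index > 0:
            rows = rows[1:]  # drop the repeated header
        merged.extend(rows)
    return merged


# Hypothetical per-ligand tables with an identical header row.
table_1 = ["Name\tSMILES\tscore", "mol_1\tCCO\t-5.1"]
table_2 = ["Name\tSMILES\tscore", "mol_2\tCCN\t-4.7"]

merged = collapse([table_1, table_2])
print("\n".join(merged))
```

Keeping a single header makes the merged file directly usable as one table for sorting or filtering the docking results.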