This repository contains code for aptamer design in silico developed by the DTU Biobuilders 2023 iGEM team. The AptaLoop pipeline consists of 4 different modules:
1a. Secondary/tertiary structure prediction
1b. Making Aptamers Without Selex (MAWS)
2. Docking
3. Molecular Dynamics
We chose the Jupyter Notebook format so that our code is well documented and easy for external users to run, and we packaged the pipeline in a Docker container to ensure reproducibility. The global structure of the directory is as follows (some files have been omitted for clarity):
dtu-denmark
├── data
├── example
│   ├── 1a_sequence_3d
│   │   ├── 1_create_sequence_file.ipynb
│   │   └── 2_aptamer_folding_3dDNA.ipynb
│   ├── 1b_maws
│   │   └── maws.ipynb
│   ├── 2_docking
│   │   └── docking.ipynb
│   └── 3_molecular_dynamics
│       └── molecular_dynamics.ipynb
├── heidelberg_maws
│   └── MAWS2023.py
├── notebooks
│   ├── 1a_sequence_3d
│   │   ├── 1_create_sequence_file.ipynb
│   │   └── 2_aptamer_folding_3dDNA.ipynb
│   ├── 1b_maws
│   │   └── maws.ipynb
│   ├── 2_docking
│   │   └── docking.ipynb
│   └── 3_molecular_dynamics
│       └── molecular_dynamics.ipynb
├── Dockerfile
├── LICENSE
├── README.md
└── requirements.txt
The global aim of this pipeline is to provide the necessary tools to design DNA or RNA aptamers to target a specific molecule (protein, organic or lipid), predict the secondary and tertiary structure, predict the interaction position between them (docking) and simulate the molecular dynamics. The results of these analyses can be useful to guide the wet lab efforts in finding the best possible aptamer sequence for the desired target molecule.
The first module is meant for those users that already have an aptamer sequence for their target molecule and want to evaluate it using our software. It consists of 2 parts:
- The first notebook will generate a PDB file from a DNA or RNA aptamer sequence.
- The second notebook will predict the secondary and tertiary structures of the provided aptamer sequence.
The second module is meant for users who wish to create an aptamer from scratch to find the best possible sequence for their target molecule. It is possible to define the type of aptamer (DNA or RNA), the type of ligand molecule (protein, organic or lipid), the number of nucleotides that the aptamer should have, and other parameters (see the notebook).
The third module predicts where the aptamer and the target molecule interact and provides insight into the predicted binding affinity. The results are multiple poses ranked by their predicted binding energy, where the lowest-energy pose is considered the most favorable binding mode.
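The ranking of poses by predicted binding energy can be illustrated with a short Python sketch. It assumes that the docking output is a standard AutoDock Vina PDBQT file (Vina is listed in the references, but this README does not state which docking engine the notebook calls), in which each pose starts with a MODEL line followed by a "REMARK VINA RESULT:" line whose first number is the predicted energy in kcal/mol. The function name read_vina_energies is hypothetical and is not part of the repository.

```python
from typing import List, Tuple


def read_vina_energies(pdbqt_text: str) -> List[Tuple[int, float]]:
    """Return (model number, energy) pairs sorted from lowest to highest energy.

    Assumes AutoDock Vina style output, where every pose is introduced by
    'MODEL n' and carries a 'REMARK VINA RESULT:' line.
    """
    poses = []
    model = 0
    for line in pdbqt_text.splitlines():
        if line.startswith("MODEL"):
            model = int(line.split()[1])
        elif line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK, VINA, RESULT:, energy, RMSD l.b., RMSD u.b.
            energy = float(line.split()[3])
            poses.append((model, energy))
    # The lowest (most negative) energy is treated as the most favorable pose.
    return sorted(poses, key=lambda pose: pose[1])


# Illustrative, made-up values; not results produced by the pipeline.
example_output = """MODEL 1
REMARK VINA RESULT:      -7.2      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:      -6.8      1.532      2.104
ENDMDL
"""

best_model, best_energy = read_vina_energies(example_output)[0]
print(f"Best pose: model {best_model} ({best_energy} kcal/mol)")
```

Vina normally writes poses already sorted by energy, so the explicit sort mainly guards against files that were edited or merged afterwards.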
The fourth and last module simulates the molecular dynamics of the interaction between the aptamer and the ligand molecule, which concludes our pipeline by giving insight into the binding between the two.
The easiest way to install the required dependencies is through Docker. Docker is a platform for developing, shipping, and running applications inside containers. It enables consistent deployment across different systems, simplifying application management and ensuring compatibility.
First, install Docker Desktop from the official Docker website. Select the version for your operating system and follow the instructions to complete the installation. Then, open the application.
To verify that Docker is installed and running, open a terminal or command prompt and run the following command: docker --version
If the output you get is Docker version X.Y.Z (where X, Y and Z are numbers), then it means you successfully installed Docker and you are ready to install the dtu-biobuilders-aptaloop image that will enable you to run our pipeline.
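If you prefer to run this check from a script, the following Python sketch does the same thing. It assumes only that the command prints a line containing "Docker version X.Y.Z", as described above; the function names are hypothetical and the sample string is illustrative.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_docker_version(output: str) -> Optional[Tuple[int, int, int]]:
    """Extract (X, Y, Z) from a 'Docker version X.Y.Z' line, or return None."""
    match = re.search(r"Docker version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def installed_docker_version() -> Optional[Tuple[int, int, int]]:
    """Run 'docker --version' and parse it; None if Docker is not on the PATH."""
    if shutil.which("docker") is None:
        return None
    result = subprocess.run(
        ["docker", "--version"], capture_output=True, text=True, timeout=10
    )
    return parse_docker_version(result.stdout)


# Illustrative output string; the numbers on your machine will differ.
print(parse_docker_version("Docker version 24.0.5, build ced0996"))
```

Calling installed_docker_version() returns None when Docker is missing, which can be used to stop early with a clear message.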
Follow these steps in the terminal:
1. Pull the Docker image:
   docker pull teheavy/dtu-biobuilders-aptaloop:production
2. Then check that you pulled the image correctly:
   docker images
3. If you see teheavy/dtu-biobuilders-aptaloop:production, it means everything went well. Do the following to run the container:
   sudo docker run -it -p 8888:8888 teheavy/dtu-biobuilders-aptaloop:production
4. If everything went as expected, you should be located at /dtu-denmark. Run the following to start a JupyterLab server:
   jupyter lab --ip=0.0.0.0
5. Finally, copy and paste one of the 2 URLs printed by the JupyterLab server (usually the second one works and the first one does not) into your favourite browser. Once you are there, move to the notebooks directory, which contains all the notebooks needed to perform the analysis. When running the notebooks, remember to choose the kernel named Python (AptaLoop). You can do that by clicking at the top right of the notebook, where it says Python 3 (ipykernel), and selecting Python (AptaLoop). And that's it, happy coding!
This guide will walk you through the process of running a JupyterLab server in a Docker container using Docker Compose.
Ensure you have Docker and Docker Compose installed on your machine. If not, you can download and install them from the official Docker website. Then follow these steps:
1. Clone the Repository
   First, clone the Git repository containing the necessary Dockerfile and docker-compose.yml:
   git clone https://gitlab.igem.org/2023/software-tools/dtu-denmark.git
   cd dtu-denmark
2. Prepare the Data Directory
   Create a data directory that will be used to store Jupyter notebooks and any data you work with inside the container. This step ensures that your data persists between container restarts.
   mkdir data
3. Build and Run with Docker Compose
   Use Docker Compose to build the image (if it hasn't been built) and start the service:
   docker-compose up --build
   The --build flag ensures that the Docker image is built with the latest changes in your Dockerfile. After the build, Docker Compose will start the JupyterLab server.
4. Access JupyterLab
   Open your web browser and go to http://localhost:8888. You should now see the JupyterLab interface.
5. Stopping the Server
   To stop the JupyterLab server, press CTRL+C in the terminal where Docker Compose is running. To remove the containers completely, run:
   docker-compose down
Ensure that the port 8888 is not being used by another application on your host machine.
For changes in the Dockerfile or the docker-compose.yml, you may need to rebuild the image using docker-compose up --build.
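To check whether port 8888 is already in use before starting the server, a minimal Python sketch is shown here. It tries to open a TCP connection to the port on the local machine and treats a refused connection as "free". The function name port_is_free is hypothetical, and the check only covers the loopback address 127.0.0.1.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds, i.e. when
        # another application is already listening on the port.
        return sock.connect_ex((host, port)) != 0


if port_is_free(8888):
    print("Port 8888 is available.")
else:
    print("Port 8888 is in use; stop the other application first.")
```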
It is very important to select the kernel Python (AptaLoop) when running the notebooks, as it contains the packages needed to run them. We provide an example of how to run all the Jupyter notebooks under the example directory. However, be aware that the example is only meant to show how to use the notebooks (NB) and which inputs and outputs to expect. The example is NOT designed to produce a scientifically correct or meaningful result. See a detailed description of each NB below.
As mentioned before, the AptaLoop pipeline has 4 modules. The modules are meant to be run sequentially if the user wants to make a thorough analysis of the aptamer-molecule complex interaction and dynamics, but it is also possible to run them individually. Module 1a should be run if the user already has an aptamer sequence they would like to test, and Module 1b should be run if the user wants to create the aptamer sequence from scratch. The usage of each module is described below:
Module 1a. Secondary/tertiary structure prediction:
- Run the NB 1_create_sequence_file.ipynb to create a sequence PDB file.
  - Input: DNA or RNA string.
  - Output: PDB file.
- Run the NB 2_aptamer_folding_3dDNA.ipynb to get the secondary and tertiary structure predictions for the given aptamer sequence.
  - Input: FASTA file of aptamer sequence(s).
  - Output: FASTA file of secondary structure prediction(s) and tertiary structure prediction(s).
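Before running the notebooks of this module, it can help to check that the input string is a valid DNA or RNA sequence and to write it in FASTA format. The following Python sketch shows one way to do that. The function names, the record name and the example sequence are illustrative and not taken from the notebooks, which may validate their input differently.

```python
DNA_ALPHABET = set("ACGT")
RNA_ALPHABET = set("ACGU")


def classify_sequence(sequence: str) -> str:
    """Return 'DNA' or 'RNA', or raise ValueError for anything else.

    A sequence containing neither T nor U is reported as DNA.
    """
    cleaned = sequence.strip().upper()
    if not cleaned:
        raise ValueError("The sequence is empty.")
    letters = set(cleaned)
    if letters <= DNA_ALPHABET:
        return "DNA"
    if letters <= RNA_ALPHABET:
        return "RNA"
    raise ValueError(f"Unexpected characters: {sorted(letters - DNA_ALPHABET - RNA_ALPHABET)}")


def to_fasta(name: str, sequence: str, width: int = 60) -> str:
    """Format one validated sequence as a FASTA record with wrapped lines."""
    cleaned = sequence.strip().upper()
    classify_sequence(cleaned)  # raises ValueError on invalid input
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


# Illustrative 15-nucleotide DNA sequence.
print(classify_sequence("GGTTGGTGTGGTTGG"))
print(to_fasta("example_aptamer", "GGTTGGTGTGGTTGG"), end="")
```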
Module 1b. Making Aptamers Without Selex (MAWS):
- Run the NB maws.ipynb to generate the DNA or RNA aptamer that best binds your target molecule.
  - Input: PDB file of target molecule.
  - Output: PDB file of aptamer + target molecule.
Module 2. Docking:
- Run the NB docking.ipynb to perform a docking simulation between your aptamer and ligand molecule.
  - Input:
    - PDB file for the aptamer, located in the "data" directory.
    - SDF or PDB file for the ligand, located in the "data" directory.
    - The grid parameters (x_c, y_c, z_c, x_s, y_s, z_s), which must be specified inside the notebook.
  - Output: PDBQT file.
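One possible way to choose starting values for the grid parameters is to compute a box around the atoms of the aptamer PDB file. The Python sketch below assumes that x_c, y_c, z_c are the coordinates of the box center and x_s, y_s, z_s are the box side lengths in Ångström; this interpretation is an assumption, so check the notebook for the exact meaning. The function names and the 5 Å padding are illustrative.

```python
from typing import Dict, List, Tuple

Coordinate = Tuple[float, float, float]


def atom_coordinates(pdb_text: str) -> List[Coordinate]:
    """Read x, y, z from ATOM/HETATM records (fixed PDB columns 31-54)."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            coords.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
    return coords


def grid_box(coords: List[Coordinate], padding: float = 5.0) -> Dict[str, float]:
    """Return center (x_c, y_c, z_c) and size (x_s, y_s, z_s) of a padded box."""
    if not coords:
        raise ValueError("No atoms found.")
    box = {}
    for axis, values in zip("xyz", zip(*coords)):
        lowest, highest = min(values), max(values)
        box[f"{axis}_c"] = (lowest + highest) / 2
        box[f"{axis}_s"] = (highest - lowest) + 2 * padding
    return box
```

For example, grid_box(atom_coordinates(open("data/aptamer.pdb").read())) would return a dictionary with the six values, where data/aptamer.pdb is a placeholder path. A box that covers the whole aptamer is a reasonable default when the binding site is unknown, at the cost of a larger search space.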
Module 3. Molecular Dynamics:
- Run the NB molecular_dynamics.ipynb to perform a molecular dynamics simulation of the aptamer-molecule complex.
  - Input:
    - PDB file containing the aptamer-molecule complex.
    - GROMACS parameter and configuration files (ions.mdp, minim.mdp, nvt.mdp, npt.mdp, md.mdp).
  - Output:
    - Processed and solvated molecular structures in GROMACS formats.
    - Energy, temperature, pressure, and density profiles during the simulation.
    - Trajectory and analysis files including RMSD, gyration, and more.
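The analysis outputs can be inspected outside the notebook as well. The Python sketch below assumes that the RMSD and gyration profiles are written as GROMACS .xvg text files, the default format of GROMACS analysis tools, in which lines starting with # or @ hold comments and plot metadata and the remaining lines hold whitespace-separated numbers. The function names and the sample values are illustrative.

```python
from typing import List, Tuple


def read_xvg(text: str) -> List[Tuple[float, ...]]:
    """Return the numeric rows of an .xvg file, skipping metadata lines."""
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        # '#' marks comments, '@' plot settings, '&' dataset separators.
        if not stripped or stripped[0] in "#@&":
            continue
        rows.append(tuple(float(value) for value in stripped.split()))
    return rows


def column_mean(rows: List[Tuple[float, ...]], index: int) -> float:
    """Average one column, for example index 1 for the RMSD values."""
    values = [row[index] for row in rows]
    return sum(values) / len(values)


# Made-up sample; not data produced by the pipeline.
sample = """# This file was created by a GROMACS tool
@    title "RMSD"
@    xaxis  label "Time (ps)"
@    yaxis  label "RMSD (nm)"
0.0   0.10
10.0  0.20
20.0  0.30
"""

rows = read_xvg(sample)
print(f"{len(rows)} frames, mean RMSD {column_mean(rows, 1):.2f} nm")
```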
We are open to contributions, as long as our work is attributed properly.
- We want to acknowledge the authors of the original version of MAWS, the Heidelberg 2015 iGEM team: Wiki and GitHub.
- We also want to thank the Heidelberg 2017 iGEM team, for making the first improvements to the MAWS software: Wiki and GitHub.
- Finally, we would like to thank the NU Kazakhstan 2022 iGEM team for making further improvements to the code and developing a guide on how to use MAWS: Wiki, GitHub and Guide.
Abraham, M.J., Murtola, T., Schulz, R., Páll, S., Smith, J.C., Hess, B., and Lindahl, E. “GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers,” SoftwareX, 1–2 19–25 (2015).
Eberhardt, J., Santos-Martins, D., Tillack, A.F., Forli, S. (2021). AutoDock Vina 1.2.0: New Docking Methods, Expanded Force Field, and Python Bindings. Journal of Chemical Information and Modeling.
Merkel, D. (2014). Docker: lightweight linux containers for consistent development and deployment. Linux Journal, 2014(239), 2.
Salomon-Ferrer, R., Case, D.A., Walker, R.C. (2013) "An overview of the Amber biomolecular simulation package." WIREs Comput. Mol. Sci. 3, 198-210.
Trott, O., & Olson, A. J. (2010). AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of computational chemistry, 31(2), 455-461.