mshahneh/ModiFinder_analysis

ModiFinder Analysis

This repository provides the code and examples for the analyses in the paper: ModiFinder: Tandem Mass Spectral Alignment Enables Structural Modification Site Localization

Mohammad Reza Zare Shahneh, Michael Strobel, Giovanni Andrea Vitale, Christian Geibel, Yasin El Abiead, Berenike C Wagner, Karl Forchhammer, Neha Garg, Allegra T Aron, Vanessa V Phelan, Daniel Petras, Mingxun Wang

Install and setup

  1. After cloning the repository, initialize the ModiFinder submodule:

    git submodule update --init --recursive

  2. Install the conda environment. We recommend using mamba instead of conda for a faster install (e.g., mamba env create -f environment.yml):

    conda env create -f environment.yml

  3. Install Nextflow.

  4. Activate the environment:

    conda activate modi-finder-analysis

Data

First, set the data directory in the run_config file. Then either download the data or create it from scratch:

  • You can download the files used in this project from Zenodo and put them in the data directory defined earlier. The final layout should look like this:

    your_data_directory/
    ├── matches/
    ├── helpers/
    ├── SIRIUS/
    └── cfmid_exp/
    

    Please note that if you choose to download the data in this manner, information is requested for each individual compound in real time, so you must restrict the number of concurrent processes to avoid exceeding the server's request limits.

  • You can download and create the data used in this project from scratch by running data_prepare_main.py:

    conda activate modi-finder-analysis
    python ./data_preparation/data_prepare_main.py
    

    Please note that the SIRIUS data has to be downloaded from the link provided in the previous item, or generated by running the workflow on GNPS2.

  • You can download the random forest model and then load it using:

    import joblib
    trained_model = joblib.load(trained_model_path)
    inputs = trained_model['input']
    model = trained_model['model']
    

    This requires scikit-learn==1.3.2 to be installed.
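
Before running anything, it can help to confirm that the data directory has the layout shown above and that the installed scikit-learn matches the version the saved model expects. The following is a minimal sketch, not part of the repository: the function names and the `your_data_directory` placeholder are hypothetical, and only the subdirectory names and the version number come from this README.

```python
from importlib import metadata
from pathlib import Path

# Subdirectory names taken from the layout shown in this README.
EXPECTED_SUBDIRS = ("matches", "helpers", "SIRIUS", "cfmid_exp")
# Version this README says is needed to load the saved model.
REQUIRED_SKLEARN = "1.3.2"


def missing_subdirs(data_dir):
    """Return the expected subdirectories that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


def sklearn_version_ok(required=REQUIRED_SKLEARN):
    """Return True if the installed scikit-learn is exactly the required version."""
    try:
        return metadata.version("scikit-learn") == required
    except metadata.PackageNotFoundError:
        # scikit-learn is not installed in this environment.
        return False


if __name__ == "__main__":
    data_dir = "your_data_directory"  # placeholder: use the path set in run_config
    missing = missing_subdirs(data_dir)
    if missing:
        print(f"Missing subdirectories in {data_dir}: {missing}")
    else:
        print(f"{data_dir} contains all expected subdirectories.")
    if not sklearn_version_ok():
        print(f"scikit-learn=={REQUIRED_SKLEARN} not found; loading the model may fail.")
```

The check only looks at directory names and the package version; it does not validate the contents of the downloaded files.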

Experiments

To run our experiments, you can run the following commands:

    conda activate modi-finder-analysis
    python ./experiments_runners/experiments_runner.py './experiments_settings/all_experiments_settings.csv'
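
To see what a run will cover before launching it, the settings file can be inspected first. This is a small sketch that assumes the CSV has a header row; it makes no assumption about which columns exist, and `summarize_settings` is a hypothetical helper, not a function from this repository.

```python
import csv
from pathlib import Path


def summarize_settings(csv_path):
    """Return (column_names, row_count) for an experiment settings CSV."""
    with Path(csv_path).open(newline="") as handle:
        reader = csv.DictReader(handle)  # assumes the first line is a header
        rows = list(reader)
        return list(reader.fieldnames or []), len(rows)


if __name__ == "__main__":
    settings_path = Path("./experiments_settings/all_experiments_settings.csv")
    if settings_path.exists():
        columns, n_rows = summarize_settings(settings_path)
        print(f"{n_rows} settings rows; columns: {columns}")
    else:
        print(f"Settings file not found: {settings_path}")
```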

Results

Performance result

See the paper_figures/performance_results.ipynb notebook for illustrations of the performance results.

Helpers contribution

See the paper_figures/how_much_helpers_help.ipynb notebook for the contribution of helpers.

Evaluation score illustration

See the paper_figures/evaluation_score_illustration.ipynb notebook for an illustration of the evaluation score.

Dataset stats

See the paper_figures/datasets.ipynb notebook for dataset statistics.
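
The notebooks can also be executed without opening them interactively. The sketch below builds `jupyter nbconvert` commands for the four notebooks listed in this section. It assumes Jupyter and nbconvert are available in the active environment, which this README does not state, and the `executed_notebooks` output directory is a hypothetical choice.

```python
import subprocess
from pathlib import Path

# Notebook paths as listed in this README.
NOTEBOOKS = (
    "paper_figures/performance_results.ipynb",
    "paper_figures/how_much_helpers_help.ipynb",
    "paper_figures/evaluation_score_illustration.ipynb",
    "paper_figures/datasets.ipynb",
)


def nbconvert_command(notebook, output_dir="executed_notebooks"):
    """Build the command that executes one notebook and saves a copy in output_dir."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--output-dir", output_dir,
        notebook,
    ]


if __name__ == "__main__":
    for notebook in NOTEBOOKS:
        if not Path(notebook).exists():
            print(f"Skipping missing notebook: {notebook}")
            continue
        # check=True stops at the first notebook that fails to execute.
        subprocess.run(nbconvert_command(notebook), check=True)
```

Writing the executed copies to a separate directory leaves the original notebooks untouched.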
