pgmikhael/protgps

PROTGPS

This repository contains code for the paper "Protein codes promote selective subcellular compartmentalization" (Science, doi:10.1126/science.adq2634).

Setup

  1. Install mamba (recommended) or conda
bash Miniforge-pypy3-Linux-x86_64.sh
  2. Create environment
mamba env create -f environment.yml
  3. Activate
mamba activate protgps
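
Before running anything else, it can help to confirm that the environment is actually active. The following is a minimal sketch, assuming that activation sets the `CONDA_DEFAULT_ENV` variable (as conda and mamba normally do) and that the environment is named `protgps`, as the activate command suggests. The helper names are hypothetical and not part of this repository.

```python
import os

# Assumption: the environment created from environment.yml is named "protgps",
# matching the `mamba activate protgps` command.
EXPECTED_ENV = "protgps"


def active_env_name(environ=None):
    """Return the active conda/mamba environment name, or None if none is set."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


def is_protgps_active(environ=None):
    """True when the active environment matches EXPECTED_ENV."""
    return active_env_name(environ) == EXPECTED_ENV
```

Calling `is_protgps_active()` in a fresh Python session returns `False` if the activate step was skipped.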

PROTGPS

  1. Download the model checkpoints from Zenodo and extract them to checkpoints/protgps.

ESM2

import torch
torch.hub.set_dir("checkpoints/esm2")
model, alphabet = torch.hub.load("facebookresearch/esm:main", "esm2_t6_8M_UR50D")
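
Once the model is loaded, per-protein embeddings can be extracted through the ESM batch converter. This is a minimal sketch of the general ESM usage pattern, not necessarily how ProtGPS consumes ESM2 internally. The helper name `mean_embeddings` and the choice of mean pooling are assumptions made for illustration. The default `layer=6` corresponds to the final layer of the 6-layer `esm2_t6_8M_UR50D` model loaded above.

```python
def mean_embeddings(model, alphabet, sequences, layer=6):
    """Return one mean-pooled embedding per (name, sequence) pair.

    `sequences` is a list of (name, amino_acid_string) tuples.
    Hypothetical helper for illustration only.
    """
    import torch  # imported lazily so the helper can be defined without torch

    batch_converter = alphabet.get_batch_converter()
    _, _, tokens = batch_converter(sequences)

    model.eval()  # disable dropout for deterministic outputs
    with torch.no_grad():
        out = model(tokens, repr_layers=[layer], return_contacts=False)
    reps = out["representations"][layer]

    # Position 0 holds the beginning-of-sequence token, so the residues of
    # sequence i occupy positions 1..len(seq); padding and EOS are excluded.
    return [
        reps[i, 1 : len(seq) + 1].mean(0)
        for i, (_, seq) in enumerate(sequences)
    ]
```

For example, `mean_embeddings(model, alphabet, [("example", "MKTAYIAKQR")])` returns a list containing a single tensor.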

DR-BERT

from transformers import AutoModelForTokenClassification, AutoTokenizer

checkpoint = "checkpoints/drbert"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForTokenClassification.from_pretrained(checkpoint)
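
The three snippets above expect checkpoints under `checkpoints/protgps`, `checkpoints/esm2`, and `checkpoints/drbert`. The sketch below checks only that these directories exist, since the files inside each one are not documented here. The helper name is hypothetical. Note that `checkpoints/esm2` may be missing until the first `torch.hub.load` call downloads the weights.

```python
from pathlib import Path

# Directory names taken from this README; their contents are not inspected.
EXPECTED_DIRS = ("protgps", "esm2", "drbert")


def missing_checkpoint_dirs(root="checkpoints"):
    """Return the expected checkpoint subdirectories that are absent under root."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

An empty list from `missing_checkpoint_dirs()` means all three directories are in place.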

Training

python scripts/dispatcher.py --config configs/protein_localization/full_prot_comp_pred.json --log_dir /path/to/logdir
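
When launching training from Python, for example in a batch job, a quick check that the config parses as JSON catches typos before the dispatcher starts. The sketch below assembles the same command shown above. The helper name `build_train_command` is hypothetical, and only the `--config` and `--log_dir` flags from this README are used.

```python
import json
import sys
from pathlib import Path


def build_train_command(config, log_dir):
    """Check that the config is valid JSON and return the dispatcher command.

    Raises FileNotFoundError if the config is missing and
    json.JSONDecodeError if it is malformed.
    """
    config = Path(config)
    with config.open() as fh:
        json.load(fh)  # parsed only to validate; the dispatcher reads it again

    return [
        sys.executable,            # the Python interpreter of the active env
        "scripts/dispatcher.py",
        "--config", str(config),
        "--log_dir", str(log_dir),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.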

Inference

To make predictions, edit and run the Predict.ipynb notebook.

Generation

To generate proteins:

cd esm/examples/lm-design
./generate_nucleolus.sh
./generate_nuclear_speckle.sh

Analysis

The analysis script is located under notebook. The data used and generated by the script are available in the Zenodo repository.

Cite

@article{
doi:10.1126/science.adq2634,
author = {Henry R. Kilgore  and Itamar Chinn  and Peter G. Mikhael  and Ilan Mitnikov  and Catherine Van Dongen  and Guy Zylberberg  and Lena Afeyan  and Salman F. Banani  and Susana Wilson-Hawken  and Tong Ihn Lee  and Regina Barzilay  and Richard A. Young },
title = {Protein codes promote selective subcellular compartmentalization},
journal = {Science},
volume = {0},
number = {0},
pages = {eadq2634},
year = {},
doi = {10.1126/science.adq2634},
URL = {https://www.science.org/doi/abs/10.1126/science.adq2634},
eprint = {https://www.science.org/doi/pdf/10.1126/science.adq2634},
}
