Generate GIF and PNG visualizations of protein structures across steering interventions and evolutionary trajectories.
conda env create -f environment.yml
conda activate protein-gifs

Or manually:
conda create -n protein-gifs python=3.10 -c conda-forge -y
conda activate protein-gifs
conda install -c conda-forge pymol-open-source pillow numpy font-ttf-dejavu-sans-mono -y

pip install -e .

sudo apt-get install xvfb

protein-gifs info

from protein_gifs import GifTask, render_gif

task = GifTask(
    structure_files=["step0.pdb", "step1.pdb", "step2.pdb"],
    output_path="animation.gif",
    titles=["Start", "Middle", "End"],
    render_style="cartoon",
)
render_gif(task)

# Lambda-sweep experiments
protein-gifs sweep \
--base-dir ./sasa_big_sweep \
--output-dir ./gifs \
--styles cartoon,cartoon_sasa \
--designs 1-50
# Evolutionary trajectories
protein-gifs evolution \
--base-dir ./PeptideEvolution \
--output-dir ./gifs \
--comparisons

| Style | Representation | Coloring |
|---|---|---|
| cartoon | Ribbon | Secondary structure (helix/sheet/loop) |
| cartoon_sasa | Ribbon | SASA (blue=buried → white → red=exposed) |
| surface | Molecular surface | Secondary structure |
| surface_sasa | Molecular surface | SASA |
| surface_hydro | Molecular surface | Amino acid hydrophobicity |

- default: helix=red, sheet=yellow, loop=green
- pastel: helix=hotpink, sheet=cyan, loop=gray

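The SASA coloring in the table (blue for buried, through white, to red for exposed) can be pictured as a two-segment linear ramp. The sketch below is an illustration only: the function name `sasa_to_rgb`, the `vmin`/`vmax` normalization bounds, and the linear interpolation are assumptions, not the package's actual color mapping.

```python
def sasa_to_rgb(value, vmin=0.0, vmax=1.0):
    """Map a SASA value to an (r, g, b) tuple on a blue-white-red ramp.

    Hypothetical helper: the package may normalise or interpolate differently.
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Normalise to [0, 1] and clamp out-of-range values.
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    if t < 0.5:
        # Buried half: blue (0, 0, 1) fading to white (1, 1, 1).
        s = t / 0.5
        return (s, s, 1.0)
    # Exposed half: white (1, 1, 1) fading to red (1, 0, 0).
    s = (t - 0.5) / 0.5
    return (1.0, 1.0 - s, 1.0 - s)
```

With the default bounds, a value of 0.0 maps to pure blue, 0.5 to white, and 1.0 to pure red.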
Expected layout:
base_dir/
├── lambda_-1.0_thr_0.5_high/
│   ├── design_1.pdb
│   ├── design_2.pdb
│   └── frame_wise_sasa_scores/
│       ├── design_1.pdb.csv
│       └── design_2.pdb.csv
├── lambda_-0.95_thr_0.5_high/
│   └── ...
└── lambda_1.0_thr_0.5_high/
    └── ...

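To make the layout above concrete, here is a rough sketch of how the sweep directories could be discovered and ordered by lambda so that frames appear in sweep order. The regular expression is inferred from the directory names shown and is an assumption; `parse_sweep_dir` and `ordered_sweep_dirs` are hypothetical helpers, not part of `SweepCollector`.

```python
import re
from pathlib import Path

# Assumed pattern, inferred from names such as "lambda_-0.95_thr_0.5_high".
SWEEP_DIR = re.compile(
    r"^lambda_(?P<lam>-?\d+(?:\.\d+)?)_thr_(?P<thr>-?\d+(?:\.\d+)?)_(?P<tag>.+)$"
)

def parse_sweep_dir(name):
    """Return (lambda, threshold, tag) for a sweep directory name, else None."""
    m = SWEEP_DIR.match(name)
    if m is None:
        return None
    return float(m["lam"]), float(m["thr"]), m["tag"]

def ordered_sweep_dirs(base_dir):
    """Return (lambda, path) pairs for matching subdirectories, sorted by lambda."""
    found = []
    for child in Path(base_dir).iterdir():
        parsed = parse_sweep_dir(child.name) if child.is_dir() else None
        if parsed is not None:
            found.append((parsed[0], child))
    return sorted(found, key=lambda pair: pair[0])
```

Sorting on the parsed float rather than on the directory name matters here: a plain string sort would not place negative and positive lambda values in numeric order.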
from protein_gifs.collectors import SweepCollector

collector = SweepCollector(
    base_dir="./sasa_big_sweep",
    sasa_subdir="frame_wise_sasa_scores",
)
tasks = collector.collect_tasks(
    design_nums=range(1, 51),
    output_dir="./gifs",
    render_styles=["cartoon", "cartoon_sasa"],
)

Expected layout:
base_dir/
├── structures_apex/
│   ├── KFWKLLKKALRLWAKVL/
│   │   ├── KFWKLLKKALRLWAKVL_gen_init.cif
│   │   └── KKTRLVIKGLRIWIAKL_gen_end.cif
│   └── .../
├── structures_deep_amp/
│   └── .../
└── boltz_structures_custom_md_cmaes/
    ├── KFWKLLKKALRLWAKVL/
    │   ├── KFWKLLKKALRLWAKVL_gen_0.cif
    │   ├── KFWKLLKKALRLWAKVL_gen_1.cif
    │   └── ...gen_51.cif
    └── .../

from protein_gifs.collectors import EvolutionCollector

collector = EvolutionCollector(
    base_dir="./PeptideEvolution",
    datasets=["structures_apex", "structures_deep_amp"],
)
tasks = collector.collect_tasks(output_dir="./gifs")

Structure prediction sometimes produces files with NaN coordinates. Validate and filter them:
protein-gifs validate --base-dir ./structures --output validation.json

Then pass the log to any collector:
from protein_gifs import load_nan_set
nan_set = load_nan_set("validation.json")
tasks = collector.collect_tasks(..., nan_set=nan_set)

protein_gifs/
├── core/
│   ├── task.py          # GifTask and ComparisonTask dataclasses
│   ├── runner.py        # render_gif() and render_comparison()
│   ├── subprocess.py    # PyMOL subprocess management
│   ├── validation.py    # NaN detection and filtering
│   └── workers/         # Scripts that run inside PyMOL subprocesses
│       ├── gif_worker.py
│       └── comparison_worker.py
├── collectors/
│   ├── sweep.py         # Lambda-sweep file discovery
│   └── evolution.py     # Generation-trajectory file discovery
└── cli.py               # Command-line interface

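For orientation, the kind of NaN detection attributed to `validation.py` above might look roughly like the following for PDB files. This is a sketch and not the package's code: `pdb_has_nan_coords` is a hypothetical name, it relies only on the standard fixed-column positions of the PDB `ATOM`/`HETATM` records, and it does not handle `.cif` files.

```python
import math

def pdb_has_nan_coords(path):
    """Return True if any ATOM/HETATM record has a NaN or unparsable coordinate."""
    with open(path) as handle:
        for line in handle:
            if not line.startswith(("ATOM", "HETATM")):
                continue
            # PDB fixed columns (1-based): x = 31-38, y = 39-46, z = 47-54.
            for start in (30, 38, 46):
                field = line[start:start + 8].strip()
                try:
                    if math.isnan(float(field)):
                        return True
                except ValueError:
                    # Empty or malformed field: treat it as invalid too.
                    return True
    return False
```

Treating unparsable fields as invalid is a conservative choice for this sketch; a file flagged this way would simply be skipped rather than handed to the renderer.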
Why subprocesses? PyMOL can segfault on malformed structures and leaks memory across many rendering calls. Each GIF is rendered in an isolated subprocess so failures don't crash the batch and memory is reclaimed between tasks.
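The isolation idea can be illustrated with the standard library alone. The sketch below is not the package's `subprocess.py`; `run_isolated` and `run_batch` are hypothetical names, and the `worker_code` string stands in for a real PyMOL worker script. It shows only the pattern: each job runs in a fresh interpreter, and a crash or timeout marks that job as failed without stopping the batch.

```python
import subprocess
import sys

def run_isolated(worker_code, timeout=60):
    """Run Python source in a fresh interpreter; return True on a clean exit."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", worker_code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    # A crash (including death by a signal such as SIGSEGV) gives a nonzero code.
    return result.returncode == 0

def run_batch(jobs):
    """Run every (name, code) job even if some fail; return names of failed jobs."""
    return [name for name, code in jobs if not run_isolated(code)]
```

Because every job gets its own process, any memory the worker leaked is returned to the operating system when that process exits.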
To support a new directory layout, you have two options:
- Use GifTask directly: list your files and call render_gif().
- Write a custom collector: implement file-discovery logic that produces GifTask objects. See collectors/sweep.py for a reference.

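As an illustration of the second option, the sketch below walks a made-up layout in which each subdirectory of `base_dir` holds files named `step0.pdb`, `step1.pdb`, and so on. The layout, the `STEP_FILE` pattern, and the `collect_task_kwargs` helper are all assumptions for this example. It returns plain keyword-argument dicts using the GifTask fields from the quick-start example, so it runs without the package installed.

```python
import re
from pathlib import Path

STEP_FILE = re.compile(r"^step(\d+)\.pdb$")  # hypothetical naming scheme

def collect_task_kwargs(base_dir, output_dir, render_style="cartoon"):
    """Build one keyword-argument dict per trajectory subdirectory of base_dir."""
    tasks = []
    for traj_dir in sorted(p for p in Path(base_dir).iterdir() if p.is_dir()):
        steps = []
        for f in traj_dir.iterdir():
            m = STEP_FILE.match(f.name)
            if m:
                steps.append((int(m.group(1)), f))
        if not steps:
            continue  # no structure files here, so no task
        # Sort numerically so that step10 follows step9, not step1.
        steps.sort(key=lambda pair: pair[0])
        tasks.append({
            "structure_files": [str(f) for _, f in steps],
            "output_path": str(Path(output_dir) / f"{traj_dir.name}.gif"),
            "titles": [f"Step {n}" for n, _ in steps],
            "render_style": render_style,
        })
    return tasks
```

Assuming GifTask accepts these keyword arguments as in the quick-start example, each dict could then be rendered with `render_gif(GifTask(**kwargs))`.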
#!/bin/bash
#SBATCH --job-name=protein-gifs
#SBATCH --output=protein-gifs_%j.out
#SBATCH --time=4:00:00
#SBATCH --mem=16G
#SBATCH --cpus-per-task=4
source ~/miniconda3/etc/profile.d/conda.sh
conda activate protein-gifs
protein-gifs sweep --base-dir ./structures --output-dir ./gifs --designs 1-50