
maxall41/RustSASA


RustSASA

Requires rustc 1.85+.

⚡ Ludicrously fast Rust crate for protein solvent accessible surface area (SASA) calculations: 46x faster than Biopython and 7x faster than FreeSASA. Pure Rust with Python bindings and a CLI. Implements the Shrake-Rupley algorithm [1].

Features:

  • 🦀 Written in pure Rust.
  • ⚡️ Ludicrously fast. 46X faster than Biopython, 14X faster than mdakit_sasa, and 7X faster than FreeSASA.
  • 🧪 Full test coverage.
  • 🐍 Python support.
  • 🤖 Command line interface.
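
For readers unfamiliar with the Shrake-Rupley algorithm named above, the following is a minimal pure-Python sketch of the idea, not RustSASA's implementation. Each atom's van der Waals sphere is inflated by the probe radius, test points are scattered over it, and the fraction of points not buried inside any neighbouring inflated sphere gives the exposed area. The function names, the golden-spiral point layout, the 1.4 Å probe radius, and the 100 points per atom are assumptions chosen for illustration; they are not taken from the RustSASA source, and the brute-force neighbour loop is far slower than an optimized implementation.

```python
import math


def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-spiral layout)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n      # evenly spaced heights in (-1, 1)
        r = math.sqrt(1.0 - z * z)         # ring radius at that height
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points


def shrake_rupley(coords, radii, probe_radius=1.4, n_points=100):
    """Per-atom SASA in square angstroms (hypothetical helper, O(N^2) search)."""
    unit = sphere_points(n_points)
    areas = []
    for i, (ci, ri) in enumerate(zip(coords, radii)):
        big_r = ri + probe_radius          # radius of the inflated sphere
        exposed = 0
        for ux, uy, uz in unit:
            px = ci[0] + big_r * ux
            py = ci[1] + big_r * uy
            pz = ci[2] + big_r * uz
            buried = False
            for j, (cj, rj) in enumerate(zip(coords, radii)):
                if j == i:
                    continue
                cutoff = rj + probe_radius
                d2 = (px - cj[0]) ** 2 + (py - cj[1]) ** 2 + (pz - cj[2]) ** 2
                if d2 < cutoff * cutoff:   # point lies inside a neighbour
                    buried = True
                    break
            if not buried:
                exposed += 1
        # Exposed fraction of the inflated sphere's surface area.
        areas.append(4.0 * math.pi * big_r * big_r * exposed / n_points)
    return areas


# Two carbon-like atoms 1.5 angstroms apart partially bury each other.
print(shrake_rupley([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)], [1.7, 1.7]))
```

An isolated atom keeps all of its test points, so its area is exactly the surface of the inflated sphere; overlapping neighbours can only reduce it.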

Installation

Rust 🦀

cargo add rust-sasa

Python 🐍

pip install rust-sasa-python

MDAnalysis package

pip install mdsasa-bolt

Command-line interface 🤖

1. Install cargo-binstall

curl -L --proto '=https' --tlsv1.2 -sSf https://raw.githubusercontent.com/cargo-bins/cargo-binstall/main/install-from-binstall-release.sh | bash

2. Install rust-sasa

cargo binstall rust-sasa

Quick start

Using in Rust 🦀

use rust_sasa::options::{SASAOptions, ResidueLevel};

// Parse the structure; pdbtbx returns the model plus any non-fatal parse warnings.
let (pdb, _errors) = pdbtbx::open("./example.cif").unwrap();
// Compute residue-level SASA.
let result = SASAOptions::<ResidueLevel>::new().process(&pdb);

Full documentation can be found here.

Using in Python 🐍

RustSASA can be used from Python to speed up your scripts through the rust-sasa-python package.

import rust_sasa_python as sasa

# Simple calculation - use convenience function
result = sasa.calculate_protein_sasa("protein.pdb")
print(f"Total SASA: {result.total:.2f}")

See full docs here.

Using the CLI 🤖

Processing a single file

rust-sasa path_to_pdb_file.pdb output.json # Also supports .xml, .pdb, and .cif!

Processing an entire directory

rust-sasa input_directory/ output_directory/ --format json # Also supports .xml, .pdb, and .cif!

Using with MDAnalysis

RustSASA can be used with MDAnalysis to calculate SASA for a protein in a trajectory. RustSASA is 17x faster than mdakit_sasa.

import MDAnalysis as mda
from mdsasa_bolt import SASAAnalysis

# Load your trajectory
u = mda.Universe("topology.pdb", "trajectory.dcd")

# Create SASA analysis
sasa_analysis = SASAAnalysis(u, select="protein")

# Run the analysis
sasa_analysis.run()

# Access results
print(f"Mean total SASA: {sasa_analysis.results.mean_total_area:.2f} Å²")
print(f"SASA per frame: {sasa_analysis.results.total_area}")
print(f"SASA per residue: {sasa_analysis.results.residue_area}")

See the mdsasa-bolt package for more information.

Benchmarking

Results:

  • RustSASA: 8.071 s ± 0.361 s

  • FreeSASA: 54.914 s ± 0.455 s

  • Biopython: 368.025 s ± 51.156 s
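
The headline speedup figures follow directly from these mean timings. A small sketch of the arithmetic, using only the numbers reported above (the dictionary and variable names are illustrative):

```python
# Mean wall-clock times in seconds, copied from the results above.
timings = {"RustSASA": 8.071, "FreeSASA": 54.914, "Biopython": 368.025}

baseline = timings["RustSASA"]
speedups = {
    name: seconds / baseline
    for name, seconds in timings.items()
    if name != "RustSASA"
}

for name, factor in speedups.items():
    print(f"RustSASA is {factor:.1f}x faster than {name}")
```

The ratios come out at roughly 6.8 and 45.6, which round to the 7x and 46x quoted at the top of this README.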

Methodology:

We computed residue-level SASA values for the entire AlphaFold E. coli proteome structure database using RustSASA, FreeSASA, and Biopython. Benchmarks were run with Hyperfine using the options --warmup 3 --runs 3. All three methods ran across 8 cores on an Apple M3 MacBook with 24 GB of unified memory. The RustSASA CLI was used to take advantage of profile-guided optimization. GNU Parallel was used to run FreeSASA and Biopython in parallel.

Validation against FreeSASA

Comparing FreeSASA and RustSASA on the E. coli proteome

Comparing FreeSASA and RustSASA on the FreeSASA comparison dataset

Other

License

MIT

Latest update (0.3.1)

  • ⚡️ Slightly faster due to memory allocation optimization
  • Profile-guided optimization (PGO) builds

Also see changelog.

Contributing

Contributions are welcome! Please feel free to submit pull requests and open issues. As this is an actively developed library, I encourage sharing your thoughts, ideas, suggestions, and feedback.

How to cite

If you use the RustSASA library in your publication, please cite it. To cite this repository, scroll to the top of this page and click the "Cite this repository" button in the right-hand GitHub sidebar. This will give you a citation in your desired format (e.g., BibTeX or APA).

Citations:

[1] Shrake A, Rupley JA. Environment and exposure to solvent of protein atoms. Lysozyme and insulin. J Mol Biol. 1973 Sep 15;79(2):351-71. doi: 10.1016/0022-2836(73)90011-9. PMID: 4760134.
