Skip to content

ofirshorer/scXpand

 
 

Repository files navigation

scXpand Logo

scXpand: Pan-cancer Detection of T-cell Clonal Expansion

Detect T-cell clonal expansion from single-cell RNA sequencing data without paired TCR sequencing

PreprintDocumentationInstallationQuick StartUsage GuideCitation

scXpand Datasets Overview

A framework for predicting T-cell clonal expansion from single-cell RNA sequencing data.

Manuscript in preparation - detailed methodology and benchmarks coming soon.

View full documentation for comprehensive guides and API reference.


Features

  • Multiple Model Architectures:
    • Autoencoder-based: Encoder-decoder with reconstruction and classification heads
    • MLP: Multi-layer perceptron
    • LightGBM: Gradient boosted decision trees
    • Linear Models: Logistic regression and support vector machines
  • Scalable Processing: Handles millions of cells with memory-efficient data streaming from disk during training
  • Automated Hyperparameter Optimization: Built-in Optuna integration for model tuning

Installation

For detailed installation instructions, please refer to our Installation Guide.

Published Version Install

CUDA version (NVIDIA GPU):

With pip:

pip install --upgrade scxpand-cuda --extra-index-url https://download.pytorch.org/whl/cu128

With uv:

uv pip install --upgrade scxpand-cuda --extra-index-url https://download.pytorch.org/whl/cu128 --index-strategy unsafe-best-match

CPU/Apple Silicon/Other GPUs:

With pip:

pip install --upgrade scxpand

With uv:

uv pip install --upgrade scxpand

Development Setup (Install from Source)

See the Installation Guide


Quick Start

import scxpand
# Make sure that "your_data.h5ad" includes only T cells for the results to be meaningful
# Ensure that "your_data.var_names" are provided as Ensembl IDs (as the pre-trained models were trained using this gene representation)
# Please refer to our documentation for more information

# List available pre-trained models
scxpand.list_pretrained_models()

# Run inference with automatic model download
results = scxpand.run_inference(
    model_name="pan_cancer_autoencoder",  # default model
    data_path="your_data.h5ad"
)

# Access predictions
predictions = results.predictions
if results.has_metrics:
    print(f"AUROC: {results.get_auroc():.3f}")

See our Tutorial Notebook for a complete example with data preprocessing, T-cell filtering, gene ID conversion, and model application using a real breast cancer dataset.


Documentation

Setup & Getting Started:

Using Pre-trained Models:

Training Your Own Models:

Understanding Results:

📖 Full Documentation - Complete guides, API reference, and interactive tutorials


License

This project is licensed under the MIT License – see the LICENSE file for details.


Citation

If you use scXpand in your research, please cite:

Shorer, O., Amit, R., and Yizhak, K. (2025). scXpand: Pan-cancer detection of T-cell clonal expansion from single-cell RNA sequencing without paired single-cell TCR sequencing. Preprint at bioRxiv, https://doi.org/10.1101/2025.09.14.676069.

BibTeX
@article{shorer2025scxpand,
  title={scXpand: Pan-cancer detection of T-cell clonal expansion from single-cell RNA sequencing without paired single-cell TCR sequencing},
  author={Shorer, Ofir and Amit, Ron and Yizhak, Keren},
  year={2025},
  journal={bioRxiv},
  doi={https://doi.org/10.1101/2025.09.14.676069}
}

This project was created in favor of the scientific community worldwide, with a special dedication to the cancer research community.

We hope you'll find this repository helpful, and we warmly welcome any requests or suggestions - please don't hesitate to reach out!

Visitor Map

About

scXpand: detection of T-cell clonal expansion from single-cell RNA sequencing

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 99.4%
  • Other 0.6%