gaolabtools/EvoFATE

EvoFATE

EvoFATE is a computational toolkit for analyzing high-throughput long-read single-cell RNA sequencing (LR-scRNAseq) data to jointly reconstruct Evolutionary (point mutation) and Fate (gene expression) trajectories in individual cells.

Installation

# Install from source
pip install -e .

# Or install with development dependencies
pip install -e ".[dev]"

Overview

EvoFATE includes three major steps:

  1. Genetic Graph Constructor (evofate.genetic)

    • Builds genotype graphs from single-cell mutation data and embeds them with Node2Vec
    • Captures relationships among cells based on shared mutations

  2. Evolutionary Lineage Tracer (evofate.utils)

    • Calculates genetic timing
    • Infers clone lineage layouts
    • Projects cells into an evolution-informed embedding space

  3. EvoFATE Integrator (evofate.integration)

    • Integrates transcriptomic profiles with genetic structure using BGRL (Bootstrapped Graph Representation Learning) with a GAT (graph attention network) backbone
    • Performs co-projection of the two modalities
    • Computes EvoFATE time for ordered trajectory reconstruction
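
To make "relationships among cells based on shared mutations" concrete, here is a minimal NumPy sketch that links cells by the number of mutant calls they have in common. It is an illustration only and not EvoFATE's graph construction; the function name shared_mutation_graph and the min_shared parameter are hypothetical.

```python
import numpy as np

def shared_mutation_graph(X, min_shared=1):
    """Weighted adjacency matrix linking cells that share mutant calls.

    X uses the 1 / -1 / 0 encoding (mutant / wildtype / missing).
    Hypothetical helper for illustration; not part of the evofate package.
    """
    mutant = (np.asarray(X) == 1).astype(int)   # 1 where a cell carries the mutation
    shared = mutant @ mutant.T                  # shared[i, j] = mutations carried by both cells
    np.fill_diagonal(shared, 0)                 # drop self-loops
    return np.where(shared >= min_shared, shared, 0)

# Three cells, four mutation sites
X = np.array([
    [ 1,  1, -1,  0],
    [ 1,  1,  1, -1],
    [-1,  0,  1,  1],
])
adjacency = shared_mutation_graph(X)
print(adjacency)
```

Cells 0 and 1 share two mutations, cells 1 and 2 share one, and cells 0 and 2 share none, so only the first two pairs are connected. Wildtype and missing calls contribute nothing to the edge weights in this sketch.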

Input Data Format

EvoFATE requires two main input datasets stored as AnnData objects:

1. Mutation Data (adata_mut)

The mutation profile matrix should be stored in .X of the AnnData object.

Format:

  • Rows: Individual cells
  • Columns: Mutation sites/positions
  • Data type: numpy.ndarray or scipy.sparse.spmatrix
  • Shape: (n_cells, n_mutations)

Encoding:

  • 1: Mutant (MT) - mutation present in the cell
  • -1: Wildtype (WT) - reference allele, no mutation
  • 0: Missing data - site not covered or uncertain
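
This README does not say how calls are derived from reads, so the following is only a sketch of one way to produce the encoding above from per-cell alternate and reference read counts. The function name encode_calls, the input arrays, and the min_depth and min_alt thresholds are all assumptions chosen for illustration.

```python
import numpy as np

def encode_calls(alt_counts, ref_counts, min_depth=1, min_alt=1):
    """Encode read counts as 1 (mutant), -1 (wildtype) or 0 (missing).

    Hypothetical helper: the thresholds are placeholders, not EvoFATE defaults.
    """
    alt = np.asarray(alt_counts)
    ref = np.asarray(ref_counts)
    depth = alt + ref

    encoded = np.zeros(alt.shape, dtype=np.int8)   # default 0 = site not covered
    covered = depth >= min_depth
    encoded[covered & (alt >= min_alt)] = 1        # enough alternate reads: mutant
    encoded[covered & (alt < min_alt)] = -1        # covered, no alternate support: wildtype
    return encoded

# Two cells, three sites
alt = np.array([[3, 0, 0], [0, 2, 0]])
ref = np.array([[1, 5, 0], [4, 0, 0]])
calls = encode_calls(alt, ref)
print(calls)
```

Sites with no coverage stay at 0 rather than being assigned to wildtype, which keeps "not observed" distinct from "observed as reference".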

Requirements:

  • Cell names should be stored in .obs_names
  • Mutation/site names should be stored in .var_names
  • Missing data should remain as 0 (do not impute)

Example:

import numpy as np
import anndata as ad
import pandas as pd

# Create mutation matrix
n_cells = 1000
n_mutations = 500
mutation_matrix = np.random.choice([1, -1, 0], size=(n_cells, n_mutations), 
                                   p=[0.1, 0.8, 0.1])

# Create AnnData object
cell_names = [f"Cell_{i}" for i in range(n_cells)]
mutation_names = [f"Mut_{j}" for j in range(n_mutations)]

adata_mut = ad.AnnData(
    X=mutation_matrix,
    obs=pd.DataFrame(index=cell_names),
    var=pd.DataFrame(index=mutation_names)
)

2. Expression Data (adata_exp)

Standard single-cell RNA sequencing count matrix.

Format:

  • Rows: Individual cells (must match mutation data cell order)
  • Columns: Genes
  • Data type: Count matrix (typically sparse)
  • Shape: (n_cells, n_genes)

Important Notes:

  • Both datasets must have matching cell identifiers/barcodes for proper integration
  • Mutation calls should come from high-quality variant calling or long-read sequencing
  • Expression data should be from the same cells as mutation data
  • Missing mutation data (0) should not be imputed

API Reference

Genetic Module

  • cal_genetic_embedding(adata_mut, ...): Calculate genetic embeddings using Node2Vec

Integration Module

  • cal_evofate_embedding(adata_mut, ...): Calculate EvoFATE embeddings using BGRL
  • BGRL: Bootstrapped Graph Representation Learning model class

Utility Functions

Clone Analysis:

  • define_clones(adata_mut, ...): Define clonal populations from mutation data
  • cal_timing(adata_mut, key): Calculate evolutionary timing
  • cal_clone_connectivity(adata_mut): Calculate clone connectivity graph
  • cal_tree_layout(adata_mut, ...): Calculate lineage tree layout

Projections:

  • cal_linear_projection(adata_mut, key): Linear projection onto tree coordinates
  • cal_guided_umap_projection(adata_mut, ...): UMAP projection guided by timing
  • CCA_projection(feature_matrix, x): CCA-based projection
  • guided_residual_projection(X_2d, y): Guided residual projection

Visualization:

  • plot_consensus_profile(adata_mut, ...): Plot consensus mutation profiles
  • plot_lineage_tree(adata_mut, ...): Plot lineage tree
  • plot_lineage_tree_w_piechart(adata_mut, label, ...): Plot tree with pie charts
  • plot_embedding(adata_mut, basis, labels, ...): Plot embeddings
