EvoFATE

EvoFATE is a computational toolkit for analyzing high-throughput long-read single-cell RNA sequencing (LR-scRNAseq) data to jointly reconstruct Evolutionary (point mutation) and Fate (gene expression) trajectories in individual cells.

Installation

# Install from source
pip install -e .

# Or install with development dependencies
pip install -e ".[dev]"

Overview

EvoFATE includes three major steps:

1. Genetic Graph Constructor (evofate.genetic)
   - Builds genotype graphs from single-cell mutation data using Node2Vec
   - Captures relationships among cells based on shared mutations
2. Evolutionary Lineage Tracer (evofate.utils)
   - Calculates genetic timing
   - Infers clone lineage layouts
   - Projects cells in an evolution-informed embedding space
3. EvoFATE Integrator (evofate.integration)
   - Integrates transcriptomic profiles with genetic structure using BGRL with GAT backbone
   - Performs co-projection of modalities
   - Computes EvoFATE time for ordered trajectory reconstruction
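
To give a feel for what "relationships among cells based on shared mutations" can mean, the sketch below counts, for every pair of cells, the sites at which both cells carry a mutant call. It uses the 1/-1/0 encoding described under Input Data Format. This is only an illustration on a toy matrix and is not EvoFATE's graph construction; the variable names are made up for the example.

```python
import numpy as np

# Toy mutation matrix: 4 cells x 5 sites, encoded 1 (mutant), -1 (wildtype), 0 (missing).
X = np.array([
    [ 1,  1, -1,  0, -1],
    [ 1,  1, -1, -1,  0],
    [-1,  0,  1,  1, -1],
    [-1, -1,  1,  0,  1],
])

# Indicator of mutant calls; wildtype and missing entries both become 0.
is_mutant = (X == 1).astype(int)

# Entry (i, j) counts the sites at which cells i and j are both mutant.
shared = is_mutant @ is_mutant.T

print(shared)
```

In this toy example cells 0 and 1 share two mutant sites and cells 2 and 3 share one, while the two pairs share none with each other, which is the kind of structure a genotype graph is meant to capture.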

Input Data Format

EvoFATE requires two main input datasets stored as AnnData objects:

1. Mutation Data (`adata_mut`)

The mutation profile matrix should be stored in .X of the AnnData object.

Format:

Rows: Individual cells
Columns: Mutation sites/positions
Data type: numpy.ndarray or scipy.sparse.spmatrix
Shape: (n_cells, n_mutations)

Encoding:

1: Mutant (MT) - mutation present in the cell
-1: Wildtype (WT) - reference allele, no mutation
0: Missing data - site not covered or uncertain
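
As a rough illustration of how this encoding might be produced, the sketch below converts per-cell, per-site read counts into 1/-1/0 values. The function name `encode_calls` and the thresholds `min_depth` and `min_alt` are hypothetical and not part of EvoFATE; real variant calling will usually apply stricter, caller-specific criteria.

```python
import numpy as np

def encode_calls(alt_counts, total_counts, min_depth=1, min_alt=1):
    """Encode read counts as 1 (mutant), -1 (wildtype), or 0 (missing).

    Hypothetical helper: thresholds are illustrative assumptions only.
    """
    alt = np.asarray(alt_counts)
    total = np.asarray(total_counts)

    # Start with everything marked as missing.
    encoded = np.zeros(total.shape, dtype=np.int8)

    # A site counts as covered when it has at least min_depth reads.
    covered = total >= min_depth

    # Covered sites with enough alternate-allele reads are mutant, the rest wildtype.
    encoded[covered & (alt >= min_alt)] = 1
    encoded[covered & (alt < min_alt)] = -1
    return encoded

# Two cells x three sites: alternate-allele reads and total reads.
alt = [[3, 0, 0], [0, 2, 0]]
total = [[5, 4, 0], [6, 2, 0]]
encoded = encode_calls(alt, total)
print(encoded)
```

Sites with no coverage stay at 0, so missing data remains distinguishable from a confident wildtype call.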

Requirements:

Cell names should be stored in .obs_names
Mutation/site names should be stored in .var_names
Missing data should remain as 0 (do not impute)
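
Before running the pipeline it can help to confirm that a matrix follows this encoding. The sketch below is one possible check, assuming the matrix is a NumPy array or a SciPy sparse matrix; `check_mutation_encoding` is a hypothetical helper, not an EvoFATE function.

```python
import numpy as np

ALLOWED_VALUES = (-1, 0, 1)

def check_mutation_encoding(X):
    """Validate the 1/-1/0 encoding and return the fraction of missing (0) entries."""
    if hasattr(X, "tocsr"):
        # SciPy sparse matrix: implicit zeros are valid, so only stored values are checked.
        stored = X.tocsr().data
        n_total = X.shape[0] * X.shape[1]
    else:
        stored = np.asarray(X)
        n_total = stored.size

    invalid = ~np.isin(stored, ALLOWED_VALUES)
    if invalid.any():
        raise ValueError(
            f"found {int(invalid.sum())} entries outside {ALLOWED_VALUES}"
        )

    # Everything that is not a nonzero stored value counts as missing.
    n_missing = n_total - int(np.count_nonzero(stored))
    return n_missing / n_total

example = np.array([[1, -1, 0], [0, 0, 1]])
missing_fraction = check_mutation_encoding(example)
print(f"missing fraction: {missing_fraction:.2f}")
```

With an AnnData object, the same check could be applied to `adata_mut.X`.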

Example:

import numpy as np
import anndata as ad
import pandas as pd

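# Create a random mutation matrix (1 = mutant, -1 = wildtype, 0 = missing)
n_cells = 1000
n_mutations = 500
rng = np.random.default_rng(0)  # seeded so the example is reproducible
mutation_matrix = rng.choice([1, -1, 0], size=(n_cells, n_mutations),
                             p=[0.1, 0.8, 0.1])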

# Create AnnData object
cell_names = [f"Cell_{i}" for i in range(n_cells)]
mutation_names = [f"Mut_{j}" for j in range(n_mutations)]

adata_mut = ad.AnnData(
    X=mutation_matrix,
    obs=pd.DataFrame(index=cell_names),
    var=pd.DataFrame(index=mutation_names)
)

2. Expression Data (`adata_exp`)

Standard single-cell RNA sequencing count matrix.

Format:

Rows: Individual cells (must match mutation data cell order)
Columns: Genes
Data type: Count matrix (typically sparse)
Shape: (n_cells, n_genes)

Important Notes:

Both datasets must have matching cell identifiers/barcodes for proper integration
Mutation calls should come from high-quality variant calling or long-read sequencing
Expression data should be from the same cells as mutation data
Missing mutation data (0) should not be imputed
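
One way to satisfy the matching-cell requirement is to keep only the barcodes present in both datasets, in the order used by the mutation data. The sketch below shows the idea on plain lists; `shared_cell_order` is a hypothetical helper, and the commented lines assume standard AnnData indexing by observation names.

```python
def shared_cell_order(mut_cells, exp_cells):
    """Return barcodes found in both datasets, in mutation-data order."""
    exp_set = set(exp_cells)
    return [cell for cell in mut_cells if cell in exp_set]

mut_cells = ["Cell_0", "Cell_1", "Cell_2", "Cell_3"]
exp_cells = ["Cell_3", "Cell_1", "Cell_0", "Cell_9"]

shared = shared_cell_order(mut_cells, exp_cells)
print(shared)  # ['Cell_0', 'Cell_1', 'Cell_3']

# With real AnnData objects, the same list could subset and reorder both datasets:
# shared = shared_cell_order(adata_mut.obs_names, adata_exp.obs_names)
# adata_mut = adata_mut[shared].copy()
# adata_exp = adata_exp[shared].copy()
```

After this step both objects contain the same cells in the same row order, which is what the expression data format above asks for.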

API Reference

Genetic Module

cal_genetic_embedding(adata_mut, ...): Calculate genetic embeddings using Node2Vec

Integration Module

cal_evofate_embedding(adata_mut, ...): Calculate EvoFATE embeddings using BGRL
BGRL: Bootstrapped Graph Representation Learning model class

Utility Functions

Clone Analysis:

define_clones(adata_mut, ...): Define clonal populations from mutation data
cal_timing(adata_mut, key): Calculate evolutionary timing
cal_clone_connectivity(adata_mut): Calculate clone connectivity graph
cal_tree_layout(adata_mut, ...): Calculate lineage tree layout

Projections:

cal_linear_projection(adata_mut, key): Linear projection onto tree coordinates
cal_guided_umap_projection(adata_mut, ...): UMAP projection guided by timing
CCA_projection(feature_matrix, x): CCA-based projection
guided_residual_projection(X_2d, y): Guided residual projection

Visualization:

plot_consensus_profile(adata_mut, ...): Plot consensus mutation profiles
plot_lineage_tree(adata_mut, ...): Plot lineage tree
plot_lineage_tree_w_piechart(adata_mut, label, ...): Plot tree with pie charts
plot_embedding(adata_mut, basis, labels, ...): Plot embeddings
