CLING

Cross-view Latent Integration via Nonparametric Gamma Shrinkage Factor Analysis - an unsupervised multi-view Bayesian factor model for integrating heterogeneous datasets with automatic factor selection.


Paper

Available soon.

Basic Usage

See the testing notebooks in the methods folder for detailed examples.

from CLING.cling import ClingFA

# Load your multi-view data as a list of NumPy arrays that share the same N samples (rows)
# views = [view1_data, view2_data, view3_data]  # view m has shape N x D_m

# Initialize CLING with automatic K selection
model = ClingFA.from_numpy_views(
    views=views,
    K=None,  # automatically determined
    center=True,
    init_mode="pca"
)

# Fit the model with automatic factor discovery
elbos, K_history, egamma_history = model.fit(
    max_iter=1000,
    prune_every=50,
    add_every=50,
    verbose=True
)

# Extract results
factors = model.get_factors()  # N x K latent factors
weights = model.get_weights()  # list of D_m x K loading matrices
variance_explained = model.variance_explained_per_view()
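
For reference, the shapes noted in the comments above (N x D_m views, N x K factors, D_m x K loadings) fit together as in the following NumPy-only sketch. It does not call CLING. The toy data, the helper `variance_explained`, and its R²-style definition are assumptions made for illustration and may differ from what `variance_explained_per_view()` actually computes.

```python
import numpy as np


def variance_explained(view, factors, weights):
    """Hypothetical helper: R^2 of the reconstruction factors @ weights.T.

    Assumes `view` is already column-centered (N x D_m),
    `factors` is N x K and `weights` is D_m x K.
    """
    residual = view - factors @ weights.T
    return 1.0 - np.sum(residual ** 2) / np.sum(view ** 2)


rng = np.random.default_rng(0)
N, K = 100, 3
dims = [20, 30, 15]  # D_m for each view (illustrative values)

# Toy factors, centered so that the generated views are centered as well
Z = rng.normal(size=(N, K))
Z -= Z.mean(axis=0)

toy_weights = [rng.normal(size=(d, K)) for d in dims]
views = []
for W in toy_weights:
    noise = 0.1 * rng.normal(size=(N, W.shape[0]))
    views.append(Z @ W.T + noise - noise.mean(axis=0))

# Every view must have the same number of rows (samples)
assert all(v.shape[0] == N for v in views)

r2_per_view = [variance_explained(v, Z, W) for v, W in zip(views, toy_weights)]
print(r2_per_view)
```

A list built this way has the same layout as the `views` argument in the usage example: one array per view, rows aligned across views.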

CLING Code

Folder CLING contains the implementation of CLING and ablation variants:

cling.py: Main CLING model with variational inference and automatic factor selection
cling_ablation1.py: Ablation with single-Gamma prior (CLING_MGP)
cling_ablation2.py: Ablation with ARD-style shrinkage (CLING_ARD)

Folder methods contains baseline implementations and testing notebooks:

internal_models_cling.py: Wrapper functions for running CLING
external_models_*.py: Baseline implementations (MOFA, MuVi, PCA, Tucker)
test_cling_functions.ipynb: Example notebook for CLING usage
test_*_functions.ipynb: Testing notebooks for baseline methods

Simulations

See the simulations folder.

We provide code to generate synthetic multi-view data with known ground truth factors (simulations/cling_sparsity_sim.py), varying:

Number of factors (K ∈ {1, 5, 10, 15, 20, 25})
Noise levels (σ ∈ {0.5, 0.75, ..., 2.0})
Sparsity levels (1-θ ∈ {0.65, 0.70, ..., 0.85})

The simulations/scripts folder contains code to run experiments across all scenarios and methods. Results and figures are saved in simulations/results and simulations/figures. The notebook simulations/plots.ipynb generates performance comparison plots.
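
The simulation design described in this section can be illustrated with a small generator. This is a minimal sketch, not the code in simulations/cling_sparsity_sim.py: the function name `simulate_views`, the Gaussian factors and loadings, and the reading of θ as the probability that a loading is nonzero (so that 1-θ is the expected fraction of zeros) are all assumptions.

```python
import numpy as np


def simulate_views(n_samples, dims, n_factors, sigma, theta, seed=0):
    """Hypothetical generator for multi-view data with shared latent factors.

    dims      : list of feature counts D_m, one per view
    sigma     : standard deviation of the additive Gaussian noise
    theta     : assumed probability that a loading is nonzero
    """
    rng = np.random.default_rng(seed)
    factors = rng.normal(size=(n_samples, n_factors))  # N x K, shared by all views
    views, loadings = [], []
    for d in dims:
        active = rng.random(size=(d, n_factors)) < theta  # True = nonzero loading
        w = rng.normal(size=(d, n_factors)) * active       # D_m x K, sparse
        noise = sigma * rng.normal(size=(n_samples, d))
        views.append(factors @ w.T + noise)                # N x D_m
        loadings.append(w)
    return views, factors, loadings


# One scenario from the grids above: K = 5, sigma = 0.5, 1 - theta = 0.70
views, Z_true, W_true = simulate_views(
    n_samples=200, dims=[50, 40, 30], n_factors=5, sigma=0.5, theta=0.30
)
print([v.shape for v in views], Z_true.shape)
```

Keeping `Z_true` and `W_true` alongside the views is what makes factor recovery and feature selection measurable afterwards.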

Benchmarks and Real-World Datasets

The paper evaluates CLING on three biological datasets, all of which are freely available for download:

Evo-Devo: Developmental bulk RNA-seq across species and organs
- Reference: Cardoso-Moreira et al., Nature, 2019
- 5 views defined by organs (brain, cerebellum, heart, liver, testis) across 5 species
- Download: MEFISTO study repository

scNMT-seq: Single-cell multi-omics during mouse gastrulation
- Reference: Argelaguet et al., Nature, 2019
- 3 views: RNA expression, DNA methylation, chromatin accessibility
- 1,518 single cells across developmental stages E6.5 and E7.5
- Download: GEO accession GSE121708

GBM: The Cancer Genome Atlas Glioblastoma Multiforme dataset
- Reference: Brennan et al., Cell, 2013
- 2 views: gene expression and DNA methylation
- 360 patients
- Download: TCGA portal or cBioPortal

Performance

CLING matches or outperforms established baselines (MOFA, MuVi, PCA, Tucker) on each of the following evaluation criteria:

Factor Recovery: Accurately infers the true number of latent factors
Latent Structure: Highest Spearman correlation between true and inferred factors
Feature Selection: Best Jaccard index for identifying important features
Variance Explained: Comparable or superior to competing methods
Downstream Tasks: Superior predictive performance for biological covariates
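
Two of the criteria above, Spearman correlation between true and inferred factors and the Jaccard index for selected features, can be computed along the following lines. This is a sketch under stated assumptions rather than the evaluation code used for the paper: the helper names are hypothetical, inferred factors are matched to true factors by best absolute correlation (factors are only identifiable up to order and sign), and the rank computation assumes continuous values without ties.

```python
import numpy as np


def _ranks(x):
    # Ordinal ranks within each column; adequate when there are no ties
    return np.argsort(np.argsort(x, axis=0), axis=0).astype(float)


def spearman_matrix(true_f, inferred_f):
    """Absolute Spearman correlation, true factors (rows) vs inferred (cols)."""
    a, b = _ranks(true_f), _ranks(inferred_f)
    a = (a - a.mean(axis=0)) / a.std(axis=0)
    b = (b - b.mean(axis=0)) / b.std(axis=0)
    return np.abs(a.T @ b) / a.shape[0]


def mean_best_match(true_f, inferred_f):
    """Average, over true factors, of the best-matching inferred factor."""
    return spearman_matrix(true_f, inferred_f).max(axis=1).mean()


def jaccard(selected, truth):
    """Jaccard index between two collections of feature indices."""
    selected, truth = set(selected), set(truth)
    union = selected | truth
    return len(selected & truth) / len(union) if union else 1.0


rng = np.random.default_rng(0)
Z_true = rng.normal(size=(300, 3))
# Stand-in for inferred factors: permuted, sign-flipped, slightly noisy
Z_hat = -Z_true[:, [2, 0, 1]] + 0.05 * rng.normal(size=(300, 3))

recovery = mean_best_match(Z_true, Z_hat)
overlap = jaccard([0, 1, 2], [1, 2, 3])  # 2 shared features out of 4 -> 0.5
print(recovery, overlap)
```

Best-match scoring can assign the same inferred factor to several true factors; a one-to-one assignment would be a stricter alternative.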

Biological Interpretation

On GBM data, CLING identifies 54 latent factors associated with:

Gene Expression Subtypes (Proneural, Neural, Classical, Mesenchymal, G-CIMP)
Methylation Status (C1-C6 categories)
Clinical Variables (MGMT status, disease-free status, patient age)

Gene set enrichment analysis reveals factors linked to:

DNA repair pathways
G2-M checkpoint regulation
E2F and MYC target genes
Proliferative programs in tumor subtypes
