Benchmarking embedding methods (UMAP, VAE, PCA, FA, ICA, etc.) for survival prediction on omics data with TabNet, CatBoost and ridge models.

KonNik88/omics-survival-embeddings

Omics Survival Embeddings

Python Jupyter PyTorch TabNet License: MIT

Benchmarking different embedding methods for survival prediction on omics data.

Overview

This project explores how various dimensionality reduction and embedding techniques affect survival analysis on omics datasets.
We compared classical linear methods with nonlinear and deep-learning approaches, pairing each embedding with a modern predictive model.

Methods

  • Embeddings: PCA, ICA, Factor Analysis, UMAP, VAE, AutoEncoder, MOFA, NMF, CP/Tucker tensor decomposition, scikit-fusion
  • Models: TabNet, CatBoost, Ridge Regression
  • Evaluation metrics:
    • Concordance index (C-index)
    • Mean Absolute Percentage Error (MAPE)
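The two metrics above measure different things: the C-index checks whether predictions rank patients in the right order, while MAPE checks how far predicted survival times are from observed ones. The sketch below is a minimal NumPy illustration, not the code used in the notebooks. The function names `concordance_index` and `mape`, the percent scaling of MAPE, and the convention of passing negative predicted time as the risk score are all assumptions made for this example; how the notebooks treat censored samples when computing MAPE is not stated here.

```python
import numpy as np

def concordance_index(time, event, risk):
    """Harrell's C-index: share of comparable pairs that are ranked correctly.

    time  : observed follow-up times
    event : 1 if the event was observed, 0 if the sample is censored
    risk  : predicted risk score (higher means shorter expected survival)
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)

    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if not event[i]:
            continue  # a censored sample cannot be the earlier member of a pair
        for j in range(len(time)):
            if time[i] < time[j]:  # sample i had the event before sample j
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable if comparable else float("nan")

def mape(y_true, y_pred):
    """Mean absolute percentage error, expressed in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Toy data (hypothetical): four patients, the third one is censored.
time = np.array([5.0, 10.0, 15.0, 20.0])
event = np.array([1, 1, 0, 1])
pred_time = np.array([6.0, 9.0, 18.0, 25.0])

# A model that predicts survival time can be scored by using -time as risk.
c_index = concordance_index(time, event, -pred_time)
error = mape(time, pred_time)
print(f"C-index = {c_index:.3f}, MAPE = {error:.2f}")
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the values reported under Key Results should be read.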

Key Results

  • Best combination by C-index: UMAP + TabNet without clinical features (C-index = 0.568, MAPE = 79.26).
  • VAE + TabNet (with clinical features) also performed well (C-index = 0.541, MAPE = 75.71).
  • TabNet consistently outperformed other models across embeddings.
  • Clinical features sometimes improved performance, but not universally.
  • PLS + Ridge achieved the lowest MAPE but a poor C-index, so it is not suitable for survival tasks.
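To see why a low MAPE does not imply a useful survival model, consider the toy comparison below. It is an illustrative sketch on made-up, uncensored data, not a reproduction of any result above. The helper `c_index_uncensored` is a hypothetical name introduced here and ignores censoring for simplicity.

```python
import numpy as np

def c_index_uncensored(time, pred_time):
    """Pairwise concordance between true and predicted times (no censoring)."""
    t = np.asarray(time, dtype=float)
    p = np.asarray(pred_time, dtype=float)
    i, j = np.triu_indices(len(t), k=1)   # every unordered pair once
    keep = t[i] != t[j]                   # pairs with tied true times are skipped
    # +1 if the pair is ordered the same way, -1 if reversed, 0 if predictions tie
    agree = np.sign(t[i] - t[j]) * np.sign(p[i] - p[j])
    return float(np.mean((agree[keep] + 1.0) / 2.0))

def mape(y_true, y_pred):
    """Mean absolute percentage error, expressed in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * float(np.mean(np.abs((y_true - y_pred) / y_true)))

true_time = np.arange(10.0, 60.0, 5.0)          # 10, 15, ..., 55 (made-up values)

pred_constant = np.full_like(true_time, 25.0)   # same guess for every patient
pred_scaled = 3.0 * true_time                   # perfect ordering, wrong scale

mape_constant = mape(true_time, pred_constant)
mape_scaled = mape(true_time, pred_scaled)
c_constant = c_index_uncensored(true_time, pred_constant)
c_scaled = c_index_uncensored(true_time, pred_scaled)

print(f"constant: MAPE = {mape_constant:.1f}, C-index = {c_constant:.2f}")
print(f"scaled:   MAPE = {mape_scaled:.1f}, C-index = {c_scaled:.2f}")
```

The constant predictor has the lower MAPE of the two yet carries no ranking information (C-index of 0.5), while the rescaled predictor ranks every pair correctly despite a much larger MAPE. This is the same kind of mismatch as the one noted for PLS + Ridge.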

Installation

git clone https://github.com/KonNik88/omics-survival-embeddings.git
cd omics-survival-embeddings
pip install -r requirements.txt

Usage

Open the Jupyter notebooks in notebooks/ to reproduce experiments and visualizations.
All embeddings and models are implemented with scikit-learn, PyTorch, and PyTabNet.
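For orientation, the general embedding-then-model pattern can be sketched with scikit-learn alone. This is a minimal example under stated assumptions, not the notebooks' code: the data are synthetic, PCA + Ridge stands in for any embedding/model pair, and fitting on log survival time, the number of components, and the `alpha` value are arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for an omics matrix: 200 samples x 500 features.
rng = np.random.default_rng(42)
n_samples, n_features = 200, 500
X = rng.normal(size=(n_samples, n_features))

# Synthetic survival times driven by the first five features plus noise.
log_time = 3.0 + 0.5 * X[:, :5].sum(axis=1) + rng.normal(scale=0.3, size=n_samples)
y = np.exp(log_time)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Embedding step (PCA) followed by the predictive model (Ridge).
# Fitting inside a pipeline keeps the embedding from seeing test samples.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=20, random_state=0),
    Ridge(alpha=1.0),
)
model.fit(X_train, np.log(y_train))

# Back-transform predictions from log time to the original time scale.
pred_time = np.exp(model.predict(X_test))
print(pred_time[:5])
```

Swapping `PCA` for another transformer with a `fit`/`transform` interface (for example `FastICA`, `FactorAnalysis` or `NMF` from scikit-learn, with NMF requiring non-negative inputs) changes the embedding without touching the rest of the pipeline.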

Visualizations

We provide comparisons of embedding spaces (UMAP, AE, VAE) and model performance metrics in results/.
