Machine learning guided identification of senescence markers

This repository provides a reproducible workflow for identifying robust senescence markers from SenCat transcriptomic and proteomic data and for validating marker-based scoring in external IMR90 fibroblast datasets.

Workflow

The workflow standardizes transcriptomic and proteomic measurements, applies consistent preprocessing, and uses a cross-cell-type machine learning strategy to identify markers that remain informative across biological contexts. A refined marker set is then used to derive stable marker weights and generate sample-level senescence scores in both reference and external validation datasets, with all primary outputs written to the analysis and plotting directories.

Inputs

SenCat transcriptomic data: primary RNA-level input used for marker discovery.
SenCat proteomic data: primary protein-level input used for marker discovery.
External validation data: IMR90 fibroblast datasets used to evaluate score transferability.
Workflow configuration: centralized settings for inputs, analysis profiles, and output locations.

ML markers

Transcriptomics markers: analysis/transcriptomics.transcriptomics_loose5000f_tuned_common_features.results.csv.
Proteomics markers: analysis/proteomics.proteomics_loose5000f_tuned_common_features.results.csv.

Using ML markers for senescence scoring

The marker sets listed above can be used to compute senescence scores for your own data.

Prepare data

Scoring is performed on an h5ad file containing normalized transcriptomic or proteomic counts. Expected h5ad structure:

adata.X: sample-by-feature expression matrix
adata.var_names: feature identifiers matching the marker IDs in the marker CSV index

If your data are not normalized, you can use the normalize_counts.py script:

python workflow/scripts/data/normalize_counts.py \
    --input-h5ad INPUT_H5AD \
    --design DESIGN_FACTORS \
    --output-h5ad NORMALIZED_H5AD \
    --log logs/my_data.normalize.log \
    --log-level INFO

INPUT_H5AD specifies the path to your input h5ad file.
DESIGN_FACTORS specifies the design factors for DESeq2, written with or without the leading tilde, for example x + z or ~x+z.
NORMALIZED_H5AD specifies the path where the normalized data will be saved.

Get senescence scores

python workflow/scripts/cls/marker_classifier.py \
    --markers PATH_TO_ML_MARKERS \
    --input-h5ad NORMALIZED_H5AD \
    --output-results-csv OUTPUT_CSV

PATH_TO_ML_MARKERS specifies the path to the ML markers file. Use analysis/transcriptomics.transcriptomics_loose5000f_tuned_common_features.results.csv for transcriptomics and analysis/proteomics.proteomics_loose5000f_tuned_common_features.results.csv for proteomics.
NORMALIZED_H5AD specifies the path to your normalized h5ad data.
OUTPUT_CSV specifies the path to the output CSV file with per-sample scores; higher values indicate stronger similarity to the senescence-associated signature.

Notes:

marker_classifier.py applies log1p internally.
Marker matching is based on adata.var_names; non-overlapping markers are skipped automatically.
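
Because non-overlapping markers are skipped without stopping the run, it can be worth checking how many markers are actually present in your data before scoring. The sketch below is a hypothetical helper, not part of this repository. It assumes the marker IDs sit in the first column of the marker CSV (the CSV index); the example table, including its weight column, and the feature names are made up for illustration.

```python
import csv
import io


def read_marker_ids(csv_file):
    """Read marker IDs from the first column of a marker CSV (assumed to be the index)."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row
    return [row[0] for row in reader if row and row[0]]


def marker_overlap(marker_ids, var_names):
    """Split marker IDs into those present in var_names and those missing."""
    available = set(var_names)
    found = [m for m in marker_ids if m in available]
    missing = [m for m in marker_ids if m not in available]
    return found, missing


# Hypothetical marker table and feature names, for illustration only.
example_csv = io.StringIO(",weight\nGENE_A,0.8\nGENE_B,-0.3\nGENE_X,0.5\n")
marker_ids = read_marker_ids(example_csv)
found, missing = marker_overlap(marker_ids, ["GENE_A", "GENE_B", "GENE_C"])
print(f"{len(found)}/{len(marker_ids)} markers found; missing: {missing}")
# prints: 2/3 markers found; missing: ['GENE_X']
```

With real data, pass a file opened with open(PATH_TO_ML_MARKERS, newline="") and the adata.var_names of your normalized h5ad file. A low overlap usually points to mismatched identifier types between your data and the marker CSV.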

Manuscript

Anerillas, Carlos, et al. "SenCat: Cataloging human cell senescence through multiomic profiling of multiple senescent primary cell types." bioRxiv (2026): 2026-02. https://doi.org/10.64898/2026.02.05.703986
