Predictive Biomarker Modeling Framework (PBMF)

The PBMF (published in Cancer Cell) is an automated neural network framework based on contrastive learning. This general-purpose framework explores potential predictive biomarkers in a systematic and unbiased manner.

Under the hood, the PBMF searches for a biomarker that maximizes the benefit under the treatment of interest while minimizing the effect of the control treatment.
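To make the idea concrete, here is a toy sketch of what "benefit in one group but not the other" can look like for a fixed candidate split. It is not the PBMF loss: the function name `toy_contrast_score` is hypothetical, benefit is measured as a plain difference in mean follow-up time, and censoring is ignored.

```python
import numpy as np

def toy_contrast_score(time, treatment, biomarker_positive):
    """Treatment benefit in B+ minus treatment benefit in B-.

    Benefit is the difference in mean follow-up time between the treated
    and control arms. Censoring is ignored, so this is only an illustration
    of the contrast idea, not the objective optimized by the PBMF.
    """
    time = np.asarray(time, dtype=float)
    treated = np.asarray(treatment, dtype=bool)
    positive = np.asarray(biomarker_positive, dtype=bool)

    def benefit(mask):
        # Mean time of treated patients minus mean time of controls in this group
        return time[mask & treated].mean() - time[mask & ~treated].mean()

    return benefit(positive) - benefit(~positive)

# Hypothetical data: treatment helps only the biomarker-positive patients
time = [10, 12, 4, 5, 6, 6, 5, 7]
treatment = [1, 1, 0, 0, 1, 1, 0, 0]
positive = [1, 1, 1, 1, 0, 0, 0, 0]
score = toy_contrast_score(time, treatment, positive)
print(score)
```

A large positive score means the candidate split separates patients who benefit from the treatment of interest from those who do not.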

Quick tour

The PBMF runs as follows:

from PBMF.attention.model_zoo.SimpleModel import Net
from PBMF.attention.model_zoo.Ensemble import EnsemblePBMF

# Setup ensemble
pbmf = EnsemblePBMF(
    time=time, 
    event=event,
    treatment=treatment,
    stratify=treatment,
    features = features,
    discard_n_features=1, # discard n features on each PBMF model
    architecture=Net, # Architecture to use; here, a simple neural network
    **params
)

# Train ensemble model
pbmf.fit(
    data_train, # Dataframe with the processed data
    num_models=10, # number of PBMF models used in the ensemble
    n_jobs=4,
    test_size=0.2, # Randomly discard this fraction of patients when fitting each PBMF model
    outdir='./runs/experiment_0/',
    save_freq=100,
)

Once the model is trained, getting the predictive biomarker scores and labels is as simple as:

# Load the ensemble PBMF
pbmf = EnsemblePBMF()
pbmf.load(
    architecture=Net,
    outdir='./runs/experiment_0/',
    num_models=10,
)

# Retrieve scores for predictive biomarker positive / negative
data_test['predictive_biomarker_risk'] = pbmf.predict(data_test, epoch=500)
# Generate biomarker positive and negative labels
data_test['predicted_label'] = (data_test['predictive_biomarker_risk'] > 0.5).replace([False, True], ['B-', 'B+'])
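Once the labels exist, a quick sanity check is to tabulate patients by predicted label and treatment arm. The sketch below uses only pandas on a small hypothetical DataFrame; the `treatment` and `time` column names and all values are assumptions for illustration, and the median ignores censoring, so it does not replace a proper survival analysis.

```python
import pandas as pd

# Hypothetical scored data standing in for data_test
scored = pd.DataFrame({
    "treatment": [1, 1, 0, 0, 1, 0],
    "time": [24.0, 30.0, 8.0, 10.0, 9.0, 11.0],
    "predictive_biomarker_risk": [0.9, 0.8, 0.7, 0.6, 0.2, 0.1],
})

# Same 0.5 threshold as in the snippet above
scored["predicted_label"] = (
    scored["predictive_biomarker_risk"].gt(0.5).map({True: "B+", False: "B-"})
)

# Patient count and median follow-up time per label and arm
summary = (
    scored.groupby(["predicted_label", "treatment"])["time"]
    .agg(["count", "median"])
)
print(summary)
```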

PBMF demos

Under ./demos/ you will find a complete guide on how to use the framework.
Under ./demos/app you will find the app for visualizing the distillation trees and interpretability.
Under ./demos/simulation there is an example of how to build synthetic survival datasets.
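For orientation, a synthetic survival dataset with a predictive biomarker can be sketched in a few lines of NumPy. This is not the code in ./demos/simulation: the function `make_synthetic_trial`, the column names, and every numeric setting (hazard reduction, time scale, censoring window) are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

def make_synthetic_trial(n_patients=200, seed=0):
    """Simulate a two-arm trial in which only biomarker-positive patients benefit."""
    rng = np.random.default_rng(seed)
    treatment = rng.integers(0, 2, size=n_patients)   # 0 = control, 1 = treated
    biomarker = rng.normal(size=n_patients)           # predictive feature
    noise_feature = rng.normal(size=n_patients)       # unrelated to outcome

    # Hazard is reduced only for treated patients with biomarker > 0
    log_hazard = -1.0 * treatment * (biomarker > 0)
    event_time = rng.exponential(scale=12.0 / np.exp(log_hazard))

    # Independent uniform censoring over a 36-unit follow-up window
    censor_time = rng.uniform(0.0, 36.0, size=n_patients)

    return pd.DataFrame({
        "time": np.minimum(event_time, censor_time),
        "event": (event_time <= censor_time).astype(int),
        "treatment": treatment,
        "biomarker": biomarker,
        "noise_feature": noise_feature,
    })

data = make_synthetic_trial()
print(data.head())
```

A frame like this has the time, event, treatment, and feature columns that the quick tour refers to, although the exact names expected by the demos may differ.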

System Requirements

Hardware requirements

The PBMF can be run on standard computers with enough RAM. It benefits from multiple cores, which allow models to be trained in parallel when the ensemble uses a large number of models (num_models).

The PBMF runs on Python 3 and has been tested on macOS and Ubuntu Linux distributions.

Software requirements

This Python package is supported on macOS and Linux. The PBMF has been tested on the following systems using Docker and Singularity containers:

OS requirements

macOS: Sonoma
Linux: Ubuntu 18.04 LTS
Windows: WSL2 / ubuntu / x86_64

Python dependencies

PBMF was extensively tested using the following libraries:

tensorflow==2.6.0
scipy==1.5.4
numpy==1.19.5
scikit-learn==0.24.1
pandas==1.1.5
seaborn==0.11.1

The PBMF has also been tested with the latest versions of the listed libraries.

Installation guide

Basic installation

pip install tensorflow==2.6.0
pip install scipy==1.5.4
pip install numpy==1.19.5
pip install scikit-learn==0.24.1
pip install pandas==1.1.5
pip install seaborn==0.11.1 
pip install --no-cache-dir git+https://github.com/gaarangoa/samecode.git
pip install --no-cache-dir git+https://github.com/gaarangoa/pbmf.git
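After installing, it can help to confirm that the environment matches the pinned versions. The sketch below uses only the standard library (`importlib.metadata`, which requires Python 3.8 or newer); the helper name `check_versions` is hypothetical and not part of the PBMF package.

```python
from importlib import metadata

# Versions pinned in the installation commands above
PINNED = {
    "tensorflow": "2.6.0",
    "scipy": "1.5.4",
    "numpy": "1.19.5",
    "scikit-learn": "0.24.1",
    "pandas": "1.1.5",
    "seaborn": "0.11.1",
}

def check_versions(pinned):
    """Return {package: (wanted, found, matches)}; found is None if not installed."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        report[name] = (wanted, found, found == wanted)
    return report

report = check_versions(PINNED)
for name, (wanted, found, matches) in report.items():
    status = "ok" if matches else "differs"
    print(f"{name}: wanted {wanted}, found {found} ({status})")
```

A mismatch is not necessarily a problem, since newer library versions have also been tested, but it is a useful first thing to check when results differ.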

Docker container

The easiest way to get started with the PBMF is to run it in a Docker container. We have created an image with all the necessary libraries, so these containers should work out of the box.

For macOS ARM processors:

    # Download the PBMF repository
    git clone https://github.com/gaarangoa/pbmf.git
    cd ./pbmf/

    # Build the docker image
    docker pull gaarangoa/ml:v2.1.0.1_ARM
    docker build -f Dockerfile.arm . --tag pbmf

    # Launch a jupyter notebook
    docker run -it --rm -p 8888:8888 pbmf jupyter notebook --NotebookApp.default_url=/lab/ --ip=0.0.0.0 --port=8888 --allow-root

For x86-64 processors:

    # Download the PBMF repository
    git clone https://github.com/gaarangoa/pbmf.git
    cd ./pbmf/

    # Build the docker image
    docker pull gaarangoa/dsai:version-2.0.3_tf2.6.0_pt1.9.0
    docker build -f Dockerfile.x86-64 . --tag pbmf

    # Launch a jupyter notebook
    docker run -it --rm -p 8888:8888 pbmf jupyter notebook --NotebookApp.default_url=/lab/ --ip=0.0.0.0 --port=8888 --allow-root

Dependencies for manuscript experiments

All experiments in the manuscript were performed on our internal HPC cluster. We used multiple nodes with 100 cores to run the PBMF in parallel. No GPU acceleration was enabled. The HPC cluster ran Ubuntu 18.04. For each run we deployed Docker containers using Singularity version 3.7.1; the image used is available on Docker Hub (gaarangoa/dsai:version-2.0.3_tf2.6.0_pt1.9.0).

License

The code is freely available under the MIT License.

Citation

If you use this work in any form, please cite as follows:

@article{arango2025ai,
  title={AI-driven predictive biomarker discovery with contrastive learning to improve clinical trial outcomes},
  author={Arango-Argoty, Gustavo and Bikiel, Damian E and Sun, Gerald J and Kipkogei, Elly and Smith, Kaitlin M and Pro, Sebastian Carrasco and Choe, Elizabeth Y and Jacob, Etai},
  journal={Cancer Cell},
  year={2025},
  publisher={Elsevier}
}
