
NVIDIA-AI-Blueprints/genomics-analysis


Genomics Analysis Developer Example

Run essential genomics workflows quickly with NVIDIA Parabricks and CodonFM.

Overview

This developer example enables bioinformaticians to run GPU-accelerated genomics workflows in minutes on any cloud through Brev.dev. NVIDIA® Parabricks® powers both linear and graph-based read alignment along with variant calling via DeepVariant. CodonFM, NVIDIA's RNA foundation model, can then be used to predict the functional impact of each detected variant on specific genes.

Experience Workflow

This developer example shows how to use GPU-accelerated tools for alignment (linear and graph), variant calling, and variant effect prediction.

Architecture Diagram

The exact steps to run this workflow are outlined below:

Notebook Outline

All the code can be found in Jupyter notebooks in the notebooks directory of the GitHub repo.

germline_wes.ipynb

Runs a standard germline variant calling workflow on whole exome sequencing (WES) data. Downloads the NA12878 sample from the Genome in a Bottle consortium, aligns reads to the GRCh38 reference using GPU-accelerated BWA-MEM via Parabricks fq2bam, and calls variants with GPU-accelerated DeepVariant, producing a final .vcf file.
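The two Parabricks steps in this notebook can be sketched as shell commands. File names below are placeholders, and each command is printed rather than executed so the sketch runs without a GPU; replace the `echo` wrapper with direct execution on a GPU host with Parabricks installed.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the germline WES steps; paths are placeholders.
# 'run' only prints each command -- swap 'echo' for direct execution
# on a GPU host where the pbrun CLI is available.
run() { echo "+ $*"; }

REF=GRCh38.fa                # reference genome FASTA
FQ1=NA12878_R1.fastq.gz      # paired-end reads
FQ2=NA12878_R2.fastq.gz

# 1. GPU-accelerated BWA-MEM alignment via fq2bam
run pbrun fq2bam --ref "$REF" --in-fq "$FQ1" "$FQ2" --out-bam NA12878.bam

# 2. GPU-accelerated DeepVariant variant calling, producing the final VCF
run pbrun deepvariant --ref "$REF" --in-bam NA12878.bam --out-variants NA12878.vcf
```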

pangenome.ipynb

Demonstrates a pangenome analysis workflow as an alternative to single-reference alignment. Downloads the HPRC v1.1 pangenome graph, aligns short-read FASTQ samples using GPU-accelerated Giraffe, and calls variants with Pangenome-Aware DeepVariant — a variant of DeepVariant that uses the pangenome graph to improve alignment accuracy and variant detection across diverse populations.
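The graph-alignment step can be sketched with the open-source vg giraffe CLI, used here as a stand-in for the GPU-accelerated Giraffe invoked in the notebook; the graph file name is a placeholder and the flag spellings may differ from the notebook's exact invocation. The command is printed as a dry run so the sketch works without the graph or a GPU.

```shell
#!/usr/bin/env bash
# Dry-run sketch of short-read alignment to the HPRC pangenome graph.
# Uses the open-source 'vg giraffe' CLI as an illustrative stand-in;
# the graph file name is a placeholder.
run() { echo "+ $*"; }

GRAPH=hprc-v1.1-grch38.gbz   # HPRC v1.1 pangenome graph (placeholder name)

# Map paired-end short reads to the graph (-f given once per mate),
# requesting BAM output for downstream variant calling
run vg giraffe -Z "$GRAPH" -f sample_R1.fastq.gz -f sample_R2.fastq.gz -o BAM
```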

variant_effect_prediction.ipynb

Runs a full variant effect prediction pipeline starting from raw FASTQ files. Uses Parabricks to align reads and call variants, processes GENCODE gene annotations to extract protein-coding sequences, maps detected variants onto transcripts, and uses CodonFM (NVIDIA's RNA foundation model) to predict the functional impact of each variant via log likelihood ratios.
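The final scoring step compares the model's likelihoods for the reference and alternate sequences: a variant's effect score is the log likelihood ratio log P(alt) − log P(ref), with strongly negative values suggesting the model disfavors the variant. A minimal awk sketch over a hypothetical per-variant TSV of log likelihoods (the column layout is an assumption, not the notebook's actual output format):

```shell
#!/usr/bin/env bash
# Minimal sketch of the log-likelihood-ratio scoring step.
# Hypothetical input columns: variant_id, logP_ref, logP_alt.
# LLR = logP_alt - logP_ref; negative values mean the model assigns
# lower likelihood to the variant than to the reference sequence.
printf 'chr1:12345A>G\t-10.2\t-14.7\nchr2:67890C>T\t-9.1\t-8.8\n' |
awk -F'\t' '{ printf "%s\tLLR=%.1f\n", $1, $3 - $2 }'
```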

How to Run

Hardware Requirements

An L40S GPU, with at least 48 GB of GPU memory, is recommended for the best balance of cost and performance. Users can also try the L4 or T4 (lower cost) or the RTX PRO 6000 (higher performance).

NVIDIA Parabricks can be run on any NVIDIA GPU with CUDA® compute capability 7.5, 8.0, 8.6, 8.9, 9.0, 10.0, or 12.0 and at least 16 GB of GPU RAM.

Parabricks has been tested specifically on the following NVIDIA GPUs:

  • T4
  • A10, A30, A40, A100, A6000
  • L4, L40
  • H100, H200
  • GH200
  • B200, B300
  • GB200, GB300
  • RTX PRO 6000 Blackwell Server Edition
  • RTX PRO 4500
  • DGX Spark
  • DGX Station

The minimum amount of CPU RAM and CPU threads depends on the number of GPUs. Please refer to the table below:

GPUs    Minimum CPU RAM (GB)    Minimum CPU Threads
2       100                     24
4       196                     32
8       392                     48

Software Requirements

  • Any NVIDIA driver that is compatible with CUDA 12.9 (535, 550, 570, 575, or similar). Please check here for more details on forward compatibility.
  • Any Linux operating system that supports Docker version 20.10 (or higher) with the NVIDIA GPU runtime.
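With a compatible driver and Docker's NVIDIA GPU runtime in place, the Parabricks container can be pulled from NGC and launched with GPU access. The image tag below is an example only (check NGC for the current release), and the commands are printed as a dry run so the sketch works on any machine.

```shell
#!/usr/bin/env bash
# Dry-run sketch of launching the Parabricks container with GPU access.
# The image tag is an example -- check NGC for the current release.
run() { echo "+ $*"; }

IMG=nvcr.io/nvidia/clara/clara-parabricks:4.5.0-1   # example tag

run docker pull "$IMG"

# --gpus all exposes every GPU; the host working directory is mounted
# at /workdir so input files and outputs are visible to the container
run docker run --rm --gpus all -v "$PWD:/workdir" -w /workdir "$IMG" \
    pbrun fq2bam --ref GRCh38.fa --in-fq R1.fq.gz R2.fq.gz --out-bam out.bam
```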

Pre-configured Instances

These notebooks are available as a launchable on Brev. This is a one-click method that automatically installs dependencies, provisions hardware, and loads this repository.

 Click here to deploy.

Manual installation

For users who prefer to run on their own hardware, installation instructions are provided below:

Prerequisites: Python3

# Create Python virtual environment and activate it
python3 -m venv .venv 
source .venv/bin/activate

# Run the setup script 
./scripts/local_setup.sh

# Start Jupyter lab 
jupyter lab


Terms of Use

Governing Terms: The Parabricks container is governed by the NVIDIA Software License Agreement and the Product-Specific Terms for NVIDIA AI Products. This Genomics Analysis Blueprint GitHub repository is provided under the Apache License 2.0.

Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloading or using the models in accordance with our terms of service, developers should work with their supporting model team to ensure the models meet requirements for the relevant industry and use case and address unforeseen product misuse. For more detailed information on ethical considerations for the models, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI Concerns here.
