MortonLabBCH/brain-ncdnv-chd


Noncoding variants and sulcal patterns in congenital heart disease: Machine learning to predict functional impact

License: MIT

Overview

This repository contains the code used in the study: Mondragon-Estrada et al., Noncoding variants and sulcal patterns in congenital heart disease: Machine learning to predict functional impact, iScience (2025), https://doi.org/10.1016/j.isci.2024.111707.

The main analyses were:


Requirements

Hardware requirements

Deep learning predictions were carried out on GNU/Linux workstations with 128 GB of RAM and NVIDIA GPUs.

OS requirements

Scripts were executed on GNU/Linux Ubuntu 20.04.6 LTS (Focal Fossa). The deep learning models are implemented in Python 3; their usage and requirements are documented in their respective repositories. WGCNA and GO enrichment analyses were performed in R 4.1.2.
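
Before running the pipeline, it can help to compare a machine against the hardware and OS notes above. The following is a minimal sketch and not part of the original scripts: the function names `total_ram_gib` and `describe_environment` are hypothetical, and the presence of `nvidia-smi` on the `PATH` is used only as a rough proxy for an NVIDIA driver installation.

```python
import os
import platform
import shutil


def total_ram_gib():
    """Return total physical memory in GiB, or None if it cannot be determined."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        n_pages = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf or these keys are not available on every platform.
        return None
    return page_size * n_pages / 2**30


def describe_environment():
    """Collect basic facts about the machine (hypothetical helper)."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "python": platform.python_version(),
        "ram_gib": total_ram_gib(),
        # Assumption: nvidia-smi on PATH suggests an NVIDIA driver is installed.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

The report is informational only; smaller machines may still be able to run the processing and statistical steps that do not involve the deep learning models.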


Setup and usage

Basenji2's predictions were obtained using the script predict_scores_b.sh. Instructions for installing Basenji2 and creating a conda environment are available in the original Basenji2 GitHub repository.

Enformer's predictions were obtained using the script predict_scores_e.py. Instructions for installing Enformer and creating a virtual environment are available in the original Enformer GitHub repository.

For the processing and statistical steps performed in Python 3, we recommend creating a third virtual environment from this repository's requirements.txt:

    $ python3 -m venv venv
    $ source venv/bin/activate
    $ pip install -r requirements.txt
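
After the installation, one way to confirm that the environment is complete is to compare the names listed in requirements.txt against the installed distributions. This is a minimal sketch, not part of the repository: the helpers `parse_requirement_names` and `missing_distributions` are hypothetical, the parser handles only simple lines such as `name==version`, and installed versions are not compared against the pinned ones.

```python
import re
from importlib import metadata
from pathlib import Path


def parse_requirement_names(text):
    """Extract distribution names from requirements.txt-style text.

    Only simple lines such as `numpy==1.24.0` or `pandas>=1.5` are handled;
    option lines (`-r`, `-e`, ...), URLs, comments and blank lines are skipped.
    """
    names = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-") or "://" in line:
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_distributions(names):
    """Return the names for which no installed distribution is found."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumes the script is run from the repository root, inside the venv.
    names = parse_requirement_names(Path("requirements.txt").read_text())
    missing = missing_distributions(names)
    if missing:
        print(f"Missing packages: {', '.join(missing)}")
    else:
        print("All listed packages are installed.")
```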

For analyses done in R, the following packages are required:

  • Bioconductor packages:

    • WGCNA
    • clusterProfiler
    • biomaRt
    • org.Hs.eg.db
    • GenomicRanges
    • ChIPpeakAnno
    • TxDb.Hsapiens.UCSC.hg38.knownGene
  • CRAN packages:

    • ggplot2
    • dplyr
    • RColorBrewer
    • ppcor

All of them can be installed with the following commands in R:

    # Install BiocManager first if it is not already available
    if (!require("BiocManager", quietly = TRUE))
        install.packages("BiocManager")

    BiocManager::install("WGCNA")
    BiocManager::install("clusterProfiler")
    BiocManager::install("biomaRt")
    BiocManager::install("org.Hs.eg.db")
    BiocManager::install("GenomicRanges")
    BiocManager::install("ChIPpeakAnno")
    BiocManager::install("TxDb.Hsapiens.UCSC.hg38.knownGene")

    # ggplot2 and dplyr are both part of the tidyverse, so the simplest option is to install it whole:
    install.packages("tidyverse")
    # Alternatively, skip the line above and install each package individually:
    install.packages("ggplot2")
    install.packages("dplyr")

    install.packages("RColorBrewer")
    install.packages("ppcor")

Citation

@article{MONDRAGONESTRADA2025111707,
    author = {Enrique Mondragon-Estrada and Jane W. Newburger and Steven R. DePalma and Martina Brueckner and John Cleveland and Wendy K. Chung and Bruce D. Gelb and Elizabeth Goldmuntz and Donald J. Hagler and Hao Huang and Patrick McQuillen and Thomas A. Miller and Ashok Panigrahy and George A. Porter and Amy E. Roberts and Caitlin K. Rollins and Mark W. Russell and Martin Tristani-Firouzi and P. Ellen Grant and Kiho Im and Sarah U. Morton},
    title = {Noncoding variants and sulcal patterns in congenital heart disease: Machine learning to predict functional impact},
    journal = {iScience},
    volume = {28},
    number = {2},
    pages = {111707},
    year = {2025},
    issn = {2589-0042},
    doi = {10.1016/j.isci.2024.111707},
    url = {https://www.sciencedirect.com/science/article/pii/S2589004224029341}
}
