HiArch

Measure high-order genome architectures (global folding and checkerboard) from Hi-C experiments.

Citation:

Modules

We provide a one-click Bash script that computes the strength of two large-scale genome architectures: global folding and checkerboard.

Our algorithm comprises three key modules: normalization, global folding, and checkerboard.

NormDis & CorrectMap: Raw Hi-C maps are scaled to comparable sizes and normalized to remove distance-dependent biases. The resulting maps are then used to calculate the global folding and checkerboard scores. You can manually inspect the normalized maps and remove poorly assembled chromosomes, as detailed in CorrectMap.
Global folding: Based on the normalized maps, the global folding score is computed by two sub-modules: detecting center anchors (GF_S1_get_center) and calculating the global folding scores (GF_S2_get_score). You can choose alternative center anchors, as detailed in GF_S1_get_center.
Checkerboard: Checkerboard scores are calculated based on normalized maps.
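
To illustrate what removing distance-dependent bias means, here is a minimal sketch of a generic observed/expected normalization: each contact is divided by the mean contact value at the same genomic distance. This is not the NormDis implementation, which may differ. The function name `observed_over_expected` is hypothetical, and the sketch assumes a small, dense, symmetric matrix given as a list of lists.

```python
def observed_over_expected(matrix):
    """Divide each entry by the mean of its diagonal (same |i - j|).

    Assumption: `matrix` is square and symmetric, so only the upper
    diagonals are used to estimate the expected value per distance.
    """
    n = len(matrix)
    # Mean contact value at each genomic distance d = |i - j|.
    expected = []
    for d in range(n):
        diagonal = [matrix[i][i + d] for i in range(n - d)]
        expected.append(sum(diagonal) / len(diagonal))
    # Entries whose distance has a zero mean are set to 0.0.
    return [
        [
            matrix[i][j] / expected[abs(i - j)] if expected[abs(i - j)] > 0 else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]
```

After this step, a map whose contacts depend only on distance becomes uniform, so what remains is structure beyond simple distance decay.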

Requirements

- Python 3.7+
- Seaborn
- SciPy
- PyTorch (`torch`)
- scikit-learn

One-Click Pipeline Usage Guide

We provide a one-click pipeline script (one_click_pipeline.sh) for automated Hi-C data analysis. Its main requirement is that the input files are organized correctly under the designated base_path directory.

1. Directory Structure

The following directory tree must be created under your base_path:

base_path/
├── [species_name_1]/
│   └── sps_mtx/
├── [species_name_2]/
│   └── sps_mtx/
└── parameters.txt

Steps:

Create your main base_path directory.
Inside base_path, create a sub-directory for each species you wish to analyze (e.g., human/, mouse/).
Inside each species directory, create a sub-sub-directory named sps_mtx/. This directory will contain all the input files for the samples belonging to that species.
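
The steps above can be scripted. The following is a minimal sketch, not part of HiArch; the helper name `create_layout` is hypothetical. It does not create parameters.txt, because that file's contents are not described here.

```python
from pathlib import Path


def create_layout(base_path, species_names):
    """Create <base_path>/<species>/sps_mtx/ for every species name."""
    base = Path(base_path)
    created = []
    for species in species_names:
        sps_dir = base / species / "sps_mtx"
        # parents=True also creates base_path and the species directory.
        sps_dir.mkdir(parents=True, exist_ok=True)
        created.append(sps_dir)
    return created
```

For example, `create_layout("base_path", ["human", "mouse"])` creates `base_path/human/sps_mtx/` and `base_path/mouse/sps_mtx/`. Because of `exist_ok=True`, running it again on an existing tree leaves the tree unchanged.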

2. File Preparation

Place the following two types of files for each sample inside the corresponding sps_mtx/ directory.

2.1 Sparse Matrix File (`.mtx`)

Purpose: Contains the Hi-C contact data.
File Naming: <sample>_normalized.mtx
- <sample> is a unique identifier for the biological sample (e.g., sample1_normalized.mtx, rep2_normalized.mtx).
File Format: A three-column, whitespace-separated text file.
- Column 1: Row index (integer). Must be consistent with the index file (0-based or 1-based).
- Column 2: Column index (integer).
- Column 3: Contact value (float).
Note: We recommend using pre-normalized contact matrices as input (e.g., normalized with ICE or Knight-Ruiz (KR) normalization).
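
To check that a file follows the three-column layout before running the pipeline, a parser along these lines may help. It is a sketch under the assumption that the file has no header or comment lines. The function name `read_triplets` is hypothetical and not part of HiArch.

```python
def read_triplets(lines):
    """Parse three-column, whitespace-separated lines into (row, col, value) tuples."""
    triplets = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()  # splits on any run of spaces or tabs
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 columns, got {len(fields)}")
        # int() and float() raise ValueError on malformed fields.
        triplets.append((int(fields[0]), int(fields[1]), float(fields[2])))
    return triplets
```

It accepts any iterable of lines, so an open file works directly: `with open("sample1_normalized.mtx") as fh: triplets = read_triplets(fh)`.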

2.2 Index File (`.window.bed`)

Purpose: Provides the genomic coordinates for each bin (row/column) in the .mtx file.
File Naming: <sample>.window.bed
- The <sample> prefix must match the corresponding .mtx file but without the _normalized suffix.
- Example: For sample1_normalized.mtx, the index file must be named sample1.window.bed.
File Format: A four-column, whitespace-separated file in standard BED format.
- Column 1: Chromosome name (string).
- Column 2: Region start position (integer, 0-based).
- Column 3: Region end position (integer).
- Column 4: Index (integer). This number corresponds to the row/column index in the associated .mtx file.
Generation: This file can be created using tools like bedtools makewindows.
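
As an illustration of the four-column layout, the sketch below builds fixed-size windows with a running index in pure Python. It assumes that indices run continuously across chromosomes in file order and that `first_index` is set to match the indexing used in the .mtx file; both are assumptions to check against your own data. The names `make_windows` and `format_bed` are hypothetical.

```python
def make_windows(chrom_sizes, window_size, first_index=0):
    """Yield (chrom, start, end, index) rows of fixed-size windows.

    chrom_sizes: list of (chromosome name, length in bp) pairs, in file order.
    """
    index = first_index
    for chrom, length in chrom_sizes:
        for start in range(0, length, window_size):
            end = min(start + window_size, length)  # last window may be shorter
            yield chrom, start, end, index
            index += 1


def format_bed(rows):
    """Join rows into tab-separated lines, one window per line."""
    return "\n".join("\t".join(str(field) for field in row) for row in rows)
```

Calling `format_bed(make_windows([("chr1", 250)], 100))` gives three lines for chr1, the last of which covers positions 200 to 250.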

3. Complete Example

A correctly organized base_path directory will look like this:

base_path/
├── human/
│   └── sps_mtx/
│       ├── sample1_normalized.mtx
│       ├── sample1.window.bed
│       ├── sample2_normalized.mtx
│       └── sample2.window.bed
├── mouse/
│   └── sps_mtx/
│       ├── mouse_sample1_normalized.mtx
│       └── mouse_sample1.window.bed
└── parameters.txt   (parameter file, placed directly in base_path)
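
Before launching the pipeline, it can be useful to confirm that every matrix file has its index file and vice versa. The following is a minimal sketch of such a check, based only on the naming rules above. The function name `find_unpaired` is hypothetical and not part of HiArch.

```python
from pathlib import Path

MTX_SUFFIX = "_normalized.mtx"
BED_SUFFIX = ".window.bed"


def find_unpaired(base_path):
    """Return (missing_bed, missing_mtx): expected paths that do not exist."""
    missing_bed, missing_mtx = [], []
    for sps_dir in sorted(Path(base_path).glob("*/sps_mtx")):
        # Strip the suffix to recover the <sample> prefix of each file.
        mtx_samples = {p.name[: -len(MTX_SUFFIX)] for p in sps_dir.glob("*" + MTX_SUFFIX)}
        bed_samples = {p.name[: -len(BED_SUFFIX)] for p in sps_dir.glob("*" + BED_SUFFIX)}
        for sample in sorted(mtx_samples - bed_samples):
            missing_bed.append(sps_dir / (sample + BED_SUFFIX))
        for sample in sorted(bed_samples - mtx_samples):
            missing_mtx.append(sps_dir / (sample + MTX_SUFFIX))
    return missing_bed, missing_mtx
```

Both returned lists are empty when the layout is complete; otherwise they name the files that still need to be added.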
