Processing data that were not produced by 10x Genomics #1926

dahun73 · 2025-04-11T05:11:09Z

Hello!

Thank you for providing such a nice tool, and thanks in advance for your help.
I'm trying to process scATAC-seq data that were not produced by 10x Genomics (the BICCN data).

I have 2 questions about processing these data.

1. Peak calling with MACS2

I saw your peak calling script in this GitHub repository (https://github.com/timoast/BICCN/blob/master/code/callpeaks.R).
The script sets only the genome size when calling peaks. Shouldn't the file format also be set to BEDPE?
I'm wondering if there's a specific reason the other options were not used.

2. Filtering criteria
As you know, no cell metadata is available when processing the data manually, so I cannot decide how to filter out low-quality cells.

I looked for filtering parameters in this GitHub repository (https://github.com/timoast/BICCN) but found none, and the Methods section of the original paper does not give filtering criteria either:

BICCN dataset processing
We downloaded FASTQ files for the BICCN dataset from NeMO (https://nemoarchive.org/) and mapped the reads to the mm10 genome using BWA-MEM (ref. 69). We created a fragment file from the aligned BAM file using sinto (https://github.com/timoast/sinto) and tabix (ref. 44). We then identified peaks for each brain region using MACS2 (ref. 46) using the CallPeaks function in Signac, with the parameters effective.genome.size = 1.87 × 10^9. We filtered out peaks with a score <150 to remove low-confidence peaks, resulting in a total of 263,815 peaks. Code to produce the BICCN fragment file and unified peak set is available at https://github.com/timoast/BICCN.

We then quantified the number of fragments overlapping each peak for each cell using the Signac FeatureMatrix function. We retained all cells that were retained in the analysis performed by the original authors of the BICCN dataset (ref. 53). We reduced dimensionality using LSI and UMAP as described above using LSI components 2 to 100.
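
As I read the quoted methods, the peak filtering step ("score <150") amounts to something like the sketch below. This is plain Python outside Signac, not the authors' code. It assumes the peaks are in the standard narrowPeak layout, where the fifth tab-separated column is the score; the function name `filter_peaks_by_score` is hypothetical.

```python
def filter_peaks_by_score(narrowpeak_lines, min_score=150):
    """Keep peaks whose score (5th narrowPeak column) is at least min_score.

    Returns a list of (chrom, start, end, score) tuples.
    """
    kept = []
    for line in narrowpeak_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        score = int(float(fields[4]))
        if score >= min_score:
            kept.append((fields[0], int(fields[1]), int(fields[2]), score))
    return kept
```

Peaks with a score below 150 are dropped and everything else is kept, which matches the wording of the methods text as far as I can tell.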

Are there any tips for filtering the BICCN dataset? Or can I calculate and add the metadata myself?
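
To make the second question concrete, this is a rough sketch of the kind of per-cell metadata I could compute myself from the fragment file produced by sinto. It is plain Python outside Signac and is not the authors' method. The function names are hypothetical, the input is assumed to be tab-separated fragment lines (chrom, start, end, barcode, read count), and peaks on the same chromosome are assumed not to overlap each other. Any cutoffs applied to these values would be my own choice.

```python
from bisect import bisect_right
from collections import defaultdict


def build_peak_index(peaks):
    """Group (chrom, start, end) peaks by chromosome, sorted by start."""
    index = defaultdict(list)
    for chrom, start, end in peaks:
        index[chrom].append((start, end))
    for intervals in index.values():
        intervals.sort()
    return index


def overlaps_peak(index, chrom, start, end):
    """True if the half-open interval [start, end) overlaps any peak.

    Relies on the assumption that peaks on one chromosome do not overlap,
    so only the last peak starting before `end` needs to be checked.
    """
    intervals = index.get(chrom)
    if not intervals:
        return False
    i = bisect_right(intervals, (end - 1, float("inf"))) - 1
    return i >= 0 and intervals[i][1] > start


def per_cell_qc(fragment_lines, peaks):
    """Return {barcode: {"n_fragments": int, "frip": float}}.

    "frip" is the fraction of a cell's fragments that overlap a peak.
    """
    index = build_peak_index(peaks)
    total = defaultdict(int)
    in_peaks = defaultdict(int)
    for line in fragment_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        chrom, start, end, barcode = line.rstrip("\n").split("\t")[:4]
        total[barcode] += 1
        if overlaps_peak(index, chrom, int(start), int(end)):
            in_peaks[barcode] += 1
    return {
        bc: {"n_fragments": n, "frip": in_peaks[bc] / n}
        for bc, n in total.items()
    }
```

The resulting dictionary could be written to a table keyed by cell barcode and then attached to the object as extra metadata columns.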

Thank you

Dahun
