Hi,
Thank you for a really great tool and an exciting dataset! I'm trying to reproduce the analysis done in the Nature Biotech paper (https://www.nature.com/articles/s41587-022-01250-0). I've downloaded the processed data from Zenodo (https://zenodo.org/record/5504061), and also compiled schrom and successfully ran the small example provided with the code.
Now I'm trying to redo the analysis from the paper, and I have a few questions:
- From what I understand, to generate the model file used by scChromHMM, I need to run bulk ChromHMM. I've read the paper's methods and see that ChromHMM was run on the "level 2 annotation", so I assume it was run with the same 12 states on each cell type? It's also not clear how many cells were used for pseudobulk aggregation to generate the input for bulk ChromHMM. Also, which model file is used, since it seems that each cell type's pseudobulk would generate its own model?
- How exactly are anchor files generated from the integrated Seurat objects? The integrated RDS files don't seem to contain barcodes that look like `E2L4_AGCGTATCACAGTCCG`, which I assume are scCut&Tag-pro barcodes. What am I missing? Also, what exactly is the numerical value in the 3rd column of the anchor file that's fed to scChromHMM?
Thank you in advance - any help would be much appreciated!
All the best,
-- Alex