New pipeline: nf-core/eista #106

@huihai828

Pipeline title/name

eista

Keywords

spatial transcriptomics, Vizgen MERFISH, 10x Xenium

What is it about?

EISTA is a bioinformatics pipeline for analyzing single-cell spatial transcriptomics data. It was developed as a generalized, flexible, and scalable workflow. It is designed primarily for Vizgen MERFISH data and also supports 10x Xenium data. The pipeline consists of three analysis phases. The primary phase covers Vizgen post-processing and conversion of the count matrix into AnnData format. The secondary phase focuses on single-cell QC, cell filtering, clustering analysis, and spatial statistical analysis. The tertiary phase involves downstream analyses such as cell type annotation, differential expression analysis, and cellular interaction analysis. Users can run the pipeline end-to-end or execute each analysis phase independently.
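
The phase structure described above can be pictured with a small Python sketch. This is only an illustration of the "run everything or pick a phase" idea, not code from the eista repository: the function names, the `state` dictionary, and the placeholder strings are all hypothetical, and the real pipeline is implemented in Nextflow.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical phase names mirroring the three phases in the description.
PHASES: List[str] = ["primary", "secondary", "tertiary"]


def primary(state: dict) -> dict:
    # Stand-in for Vizgen post-processing and count matrix to AnnData conversion.
    return {**state, "anndata": f"{state['sample']}.h5ad"}


def secondary(state: dict) -> dict:
    # Stand-in for QC, cell filtering, clustering, and spatial statistics.
    return {**state, "clusters": f"clusters({state['anndata']})"}


def tertiary(state: dict) -> dict:
    # Stand-in for annotation, differential expression, and interaction analysis.
    return {**state, "annotation": f"annotated({state['clusters']})"}


RUNNERS: Dict[str, Callable[[dict], dict]] = {
    "primary": primary,
    "secondary": secondary,
    "tertiary": tertiary,
}


def run_pipeline(state: dict, phases: Optional[List[str]] = None) -> dict:
    """Run the selected phases in canonical order (all phases if none given)."""
    selected = PHASES if phases is None else phases
    unknown = [p for p in selected if p not in RUNNERS]
    if unknown:
        raise ValueError(f"unknown phase(s): {unknown}")
    for name in PHASES:
        if name in selected:
            state = RUNNERS[name](state)
    return state
```

In this sketch, running a later phase on its own requires the caller to supply the output of the earlier phase, for example `run_pipeline({"sample": "s1", "anndata": "s1.h5ad"}, ["secondary"])`.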

Please provide a schematic diagram of the proposed pipeline

nf-core/eista metro map

I confirm my proposed pipeline will follow nf-core guidelines. Most importantly, my pipeline will:

  • be built with Nextflow.
  • pass nf-core lint tests and use standardized parameters.
  • be community-owned and developed within the nf-core organization.
  • be open source under the MIT license with proper credits and acknowledgments.
  • have a descriptive, all-lowercase name without punctuation.
  • use the nf-core pipeline template and predominantly use official nf-core modules.
  • focus on a specific data/analysis type with appropriate scope.
  • have properly maintained documentation.
  • be bundled using versioned Docker/Singularity containers.

Why do we need a new pipeline?

The EISTA pipeline significantly enhances the reproducibility, scalability, and accessibility of spatial transcriptomics data analysis for Vizgen MERFISH. By automating primary to tertiary analyses, including preprocessing, spatial statistics, and cell type annotation, it streamlines workflows that are often fragmented or require substantial manual intervention. The modular design lets users run or exclude specific analysis stages, making it adaptable to diverse research needs.

Who would be interested?

Researchers and bioinformaticians working with spatial transcriptomics data.

What has been done so far

The pipeline has been under active development for a few months and has reached version 2.0. It has been tested within our institute, where we have validated its performance across multiple datasets and refined its features based on real analysis needs.

URL to existing work (if applicable)

https://github.com/EarlhamInst/eista

Are there any similar existing nf-core pipelines?

No response

Labels

  • merge-with-existing: Overlap with existing, will add functionality to existing pipeline
  • new-pipeline
  • proposed: Under active discussion
