Generate reports for CosMx data #10

@dorien-er

Description

Goal: update the existing workflows/generate_qc_report workflow so that it accepts CosMx H5MU input and generates a report for it.

Implementation

  • Update resources_test_script/spatial_qc_sample_data.sh to generate CosMx data with QC metrics

  • Update the src/ingestion_qc/h5mu_to_qc_json component:

    - `--ingestion_method` should include `cosmx`
    - Final JSON should include standard QC metrics (total counts, nonzero vars, fraction mitochondrial, fraction ribosomal)
    - Final JSON should include spatial coordinates (found in .obsm["spatial"])
    - Final JSON should include spatial QC metrics (found in .obs): `Area`, `AspectRatio`, `Mean.MembraneStain`, `Max.MembraneStain`, `Mean.PanCK`, `Max.PanCK`, `Mean.CD45`, `Max.CD45`, `Mean.CD3`, `Max.CD3`, `Mean.DAPI`, `Max.DAPI`
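A minimal sketch of how the spatial part of the JSON could be assembled, assuming the `.obs` columns and `.obsm["spatial"]` coordinates listed above have already been read from the H5MU. The function name `build_cosmx_qc_payload` and the JSON keys `x_coord` and `y_coord` are hypothetical, not the component's actual schema. The standard QC metrics are left out because their column names are not given here.

```python
import json

# Spatial QC columns expected in .obs, as listed in this issue.
SPATIAL_QC_COLUMNS = [
    "Area", "AspectRatio",
    "Mean.MembraneStain", "Max.MembraneStain",
    "Mean.PanCK", "Max.PanCK",
    "Mean.CD45", "Max.CD45",
    "Mean.CD3", "Max.CD3",
    "Mean.DAPI", "Max.DAPI",
]


def build_cosmx_qc_payload(obs, spatial):
    """Collect spatial QC columns and coordinates into a JSON-serializable dict.

    `obs` is any column-name -> values mapping (a stand-in for `.obs`);
    `spatial` is a sequence of (x, y) pairs (a stand-in for `.obsm["spatial"]`).
    """
    # Fail early if the input lacks any of the expected CosMx columns.
    missing = [col for col in SPATIAL_QC_COLUMNS if col not in obs]
    if missing:
        raise KeyError(f"missing spatial QC columns: {missing}")

    payload = {col: [float(v) for v in obs[col]] for col in SPATIAL_QC_COLUMNS}
    # Hypothetical key names for the coordinates.
    payload["x_coord"] = [float(row[0]) for row in spatial]
    payload["y_coord"] = [float(row[1]) for row in spatial]
    return payload


# Toy input with two cells; real values would come from the H5MU file.
obs = {col: [1.0, 2.0] for col in SPATIAL_QC_COLUMNS}
spatial = [[10.0, 20.0], [30.0, 40.0]]
payload = build_cosmx_qc_payload(obs, spatial)
json_text = json.dumps(payload)
```

Checking for missing columns up front is one possible design choice: it would make a non-CosMx input fail with a clear message instead of producing an incomplete report.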
    
  • Update the siqc reporting engine (https://github.com/openpipelines-bio/siqc) to accept a CosMx-based JSON and generate a report including CosMx QC metrics.

    Default visualizations in the report:
    
    - Total counts histogram (per sample per cell)
    - Non-zero vars histogram (per sample per cell)
    - Mitochondrial fraction histogram (per sample per cell)
    - Ribosomal fraction histogram (per sample per cell)
    - Cell area histogram (per sample per cell)
    - Aspect ratio histogram (per sample per cell)
    - DAPI intensity histogram (per sample per cell)
    - MembraneStain intensity histogram (per sample per cell)
    
    Optional visualizations in the report:
    
    - CD45 intensity histogram (per sample per cell)
    - CD3 intensity histogram (per sample per cell)
    - PanCK intensity histogram (per sample per cell)
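The default/optional split above could be expressed as a small lookup that the report consults before rendering. This is only a sketch, written in Python for illustration; it is not how siqc is implemented. The field names for the standard metrics (`total_counts` and so on) are hypothetical, and mapping each stain to its `Mean.*` column rather than `Max.*` is an assumption.

```python
# Plot title -> JSON field. Standard-metric field names are hypothetical,
# and the choice of Mean.* over Max.* for stain intensity is an assumption.
DEFAULT_HISTOGRAMS = {
    "Total counts": "total_counts",
    "Non-zero vars": "num_nonzero_vars",
    "Mitochondrial fraction": "fraction_mitochondrial",
    "Ribosomal fraction": "fraction_ribosomal",
    "Cell area": "Area",
    "Aspect ratio": "AspectRatio",
    "DAPI intensity": "Mean.DAPI",
    "MembraneStain intensity": "Mean.MembraneStain",
}

OPTIONAL_HISTOGRAMS = {
    "CD45 intensity": "Mean.CD45",
    "CD3 intensity": "Mean.CD3",
    "PanCK intensity": "Mean.PanCK",
}


def select_histograms(available_fields, enabled_optional=()):
    """Return (title, field) pairs to render.

    Default plots are required; an optional plot is added only when it was
    requested and its field is present in the input.
    """
    missing = [f for f in DEFAULT_HISTOGRAMS.values() if f not in available_fields]
    if missing:
        raise ValueError(f"fields required by default plots are missing: {missing}")

    plots = list(DEFAULT_HISTOGRAMS.items())
    for title in enabled_optional:
        field = OPTIONAL_HISTOGRAMS[title]
        if field in available_fields:
            plots.append((title, field))
    return plots


# Example: the input carries CD45 but not CD3, and both were requested.
fields = set(DEFAULT_HISTOGRAMS.values()) | {"Mean.CD45"}
plots = select_histograms(fields, enabled_optional=["CD45 intensity", "CD3 intensity"])
```

In this example the CD3 histogram is silently skipped because its field is absent; raising an error instead would be an equally reasonable choice.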
    

Data

For development: s3://openpipelines-bio/openpipeline_spatial/resources_test/cosmx/Lung5_Rep2_tiny.h5mu
For running the example report: https://nanostring-public-share.s3.us-west-2.amazonaws.com/SMI-Compressed/Lung5_Rep2/Lung5_Rep2+SMI+Flat+data.tar.gz

Test run/example report

Once finalized, run the reporting workflow on Seqera Cloud on the full-sized dataset (see above).
