[BUG] CZ CELLxGENE Discover platform fails to access gene data. #1231

Description

@gennadyFauna

Describe the bug
The main platform, hosted at https://cellxgene.cziscience.com, recently broke in several ways. Please do tell me if this should be reported elsewhere. So far, I have noticed two key issues:

  • Differential expression fails to return results.
  • The platform fails to display genes present in data, and appears to be biased toward miscellaneous RNA species.

To Reproduce
This example is for a single dataset I happen to have on hand, but I see the same issues in spot checks in other datasets.

Steps to reproduce the DE bug:

  1. Go to the HYPOMAP dataset. Open the dataset using "Explore".
  2. Select any two distinct populations of cells.
  3. Press "see top differentially expressed genes."
  4. The outputs under "Pop 1 high" and "Pop 2 high" both read "No genes to display."

Steps to reproduce the present genes bug:

  1. Download the HYPOMAP dataset. Consider the top genes by total expression, or a standard marker (SNAP25). These genes will not show up.
  2. So far as I can tell, this is not a split by feature_type. Based on spot checks, some protein-coding genes (OR4F16, HES4, ISG15, etc.) appear on the platform.
  3. Based on spot checks, the following feature types do not appear on the platform: processed_pseudogene, snRNA, miRNA, unprocessed_pseudogene, misc_RNA (although a vast number of undifferentiated Y_RNA entries do), transcribed_unprocessed_pseudogene (although SLC35E2A shows up), rRNA_pseudogene, rRNA, Mt_tRNA.
  4. The following feature types do appear: lncRNA (but not all: a vast number of LINCs show up, but not MALAT1), snoRNA (but not all: SNORA72 shows up but SNORD118 does not). I stopped looking here, as these categories cover most of the genes.
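
For anyone who wants to repeat the spot checks more systematically, here is a minimal sketch of how the tally by feature_type could be done. It assumes the gene table from the downloaded file has been exported as a list of records with `feature_name` and `feature_type` keys, and that the list of genes found through the platform's gene search was collected by hand. The function name `tally_visibility` and the sample rows are made up for illustration; they are not part of any CELLxGENE API.

```python
from collections import defaultdict


def tally_visibility(features, visible_names):
    """Count, per feature_type, how many genes are visible vs. missing.

    `features` is a list of dicts with "feature_name" and "feature_type"
    keys (an assumed export of the dataset's gene table).
    `visible_names` is the hand-collected list of genes that the
    platform's gene search actually returns.
    """
    visible = set(visible_names)
    tally = defaultdict(lambda: {"visible": 0, "missing": 0})
    for feat in features:
        key = "visible" if feat["feature_name"] in visible else "missing"
        tally[feat["feature_type"]][key] += 1
    # Convert to plain dicts so the result prints and compares cleanly.
    return {ftype: dict(counts) for ftype, counts in tally.items()}


# Illustrative rows based on the spot checks above; a real run would
# build this list from the downloaded file's full gene table.
features = [
    {"feature_name": "SNAP25", "feature_type": "protein_coding"},
    {"feature_name": "HES4", "feature_type": "protein_coding"},
    {"feature_name": "MALAT1", "feature_type": "lncRNA"},
    {"feature_name": "SNORA72", "feature_type": "snoRNA"},
    {"feature_name": "SNORD118", "feature_type": "snoRNA"},
]
visible_on_platform = ["HES4", "SNORA72"]

result = tally_visibility(features, visible_on_platform)
for ftype, counts in sorted(result.items()):
    print(ftype, counts)
```

A feature type with both visible and missing genes (as with protein_coding and snoRNA in this sample) supports the observation that the split is not by feature_type.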

Expected behavior

  • DE should always work.
  • Genes present in the data matrix should always appear.

Screenshots
Today's version of the dataset (f91ae570a42434e8066d0b512584d711), and for that matter the October one I had on hand (not shown, dd5ffb49acfe2cb437d2fdee541e62df), have the correct genes (here, sorted by total expression).

Image

These genes do not show up under the gene search, whether using gene name or ID.

Image

Various other genes show up.

Image Image

DE between astrocytes and oligodendrocytes in donor sample f5sVM produces no results. This also shows up with all other cell populations in all datasets I have tested.
Image

Version (please complete the following information):

  • Desktop or hosted?: Hosted
  • Browser (if hosted): Safari, Chrome
  • Version [e.g. 0.13.0]: ?

Additional context
