Metadata standards #22

@beatrizserrano

I’d like to point out some issues that I’ve found related to the lack of metadata standards:

  • The set of columns is not consistent across datasets, e.g. "Gene.Symbol", "Gene.Identifier" and "Control.Comments" are not present in all of them.
  • The negative control labels appear in at least two different metadata columns (Control.Type, Control.Comments).
  • The negative controls are labelled differently across the datasets. Labels such as "DMSO drug control", "untreated cells (EMPTY)", "non-targetting siRNA", "scrambled siRNA" and "negative control" all denote negative controls.
  • For the compound screen idr0016, the label “POS” in the column “Compound.Group” refers to the positive controls.
  • The positive controls are sometimes located in wells that are identified only in the original publication, not in the metadata. For instance, MitoCheck (idr0013) and the secretion screen (idr0009) share the same layout, in which reagent ids 14851 and 28431 are positive controls, together with wells 1, 4, 49, 52, 290, 301, 338, 349, 363, 333, 336, 381 and 384. In CellMorph, positions 4I and 4J are also positive controls.
  • The plate content is sometimes redundant (e.g. idr0033, plateName vs plateName_illum_corrected). The two plates have different image ids, even though one appears to be the illumination-corrected version of the other.
  • The quality controls are encoded inconsistently: pass/fail, TRUE/FALSE, ""/pass...

Maybe some of the keys have been standardized in the API but I couldn't find them (see issue #21).
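
To illustrate the kind of harmonisation a user currently has to do by hand, here is a minimal Python sketch. It assumes each metadata row is available as a dict keyed by column name. The lookup tables and the function names (`is_negative_control`, `normalize_qc`) are hypothetical, built only from the labels quoted in this issue, and are not part of any IDR API. The meaning of an empty quality-control value is not known, so it is left unresolved.

```python
# Hypothetical lookup built from the negative-control labels listed in this issue.
# Labels are stored lower-cased so that matching is case-insensitive.
NEGATIVE_CONTROL_LABELS = {
    "dmso drug control",
    "untreated cells (empty)",
    "non-targetting sirna",
    "scrambled sirna",
    "negative control",
}

# Hypothetical mapping of the quality-control codes seen so far.
# The empty string is deliberately absent: its meaning is unclear.
QC_CODES = {"pass": True, "true": True, "fail": False, "false": False}


def is_negative_control(row):
    """Return True if either control column carries a known negative-control label."""
    for column in ("Control.Type", "Control.Comments"):
        # A missing column or a None value is treated as an empty label.
        value = (row.get(column) or "").strip().lower()
        if value in NEGATIVE_CONTROL_LABELS:
            return True
    return False


def normalize_qc(value):
    """Map pass/fail and TRUE/FALSE to a boolean; return None for anything else."""
    return QC_CODES.get(str(value).strip().lower())
```

For example, `is_negative_control({"Control.Comments": "scrambled siRNA"})` returns `True` even though the row has no "Control.Type" column, and `normalize_qc("")` returns `None` rather than guessing. A shared vocabulary in the metadata itself would make this kind of per-user table unnecessary.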
