Minimum Viable Product Review

Laurence Livermore edited this page Apr 6, 2022 · 3 revisions
| Number | Description | SYNTHESYS+ | DiSSCo | Notes |
| --- | --- | --- | --- | --- |
| 2.1 | A user can submit an image (or images) for de novo digitisation, or for refinement of data post-initial digitisation. | Yes | | |
| 2.2 | Minimum requirements for specimen submission will depend on selected workflow components. Minimal data input for de novo digitisation will be: image URI; material type [pinned insect, microscope slide, herbarium sheet]; image license. Strong recommendation: higher taxon [rank TBC]. | Yes | | |
| 2.3 | Additional data can be submitted with the initial object, with fields pre-filled with data that could otherwise be populated by workflow tasks. E.g. 4.4 Geocoding could use geospatial data submitted with the SDO. | Maybe | | Separate workflow for populated entries |
| 2.4 | Each component will input and output the specimen data object. | Yes | | |
| 2.5 | Specimen data object will be mutable. Modifications will be made directly, with changes tracked in the provenance data. | Yes\* | | \*Dependent on core Galaxy team development work. |
| 2.6 | Each component’s modification to the specimen data object will be logged as part of the specimen data object. | Yes\* | | \*Dependent on core Galaxy team development work. |
| 2.7 | Each component will be self-descriptive: its API returns the component description, required input fields and output fields. | Maybe | | Considered a "could do" priority |
| 2.8 | Users may construct arbitrary pipelines from a set of predefined components, which can be chosen from a component registry (Galaxy Toolshed). The minimum set of components is defined by 4) in this document. | Partial | | Can contribute to Toolshed; don't worry too much about arbitrary pipelines |
| | Multiple specimen objects can be submitted in batches. | Yes | | |
| 2.9 | A user can input other, non-image data from already digitised specimens for refining (e.g. text). In this use case, the workflow would not use components that require an image URI. | Maybe | | Related to 2.3 |
| 2.10 | SDR output (pre-adapter) must conform to the MIDS specification. If a MIDS required field isn’t populated by a workflow, it becomes a required field at submission (currently: Modified, midslevel, physicalSpecimenId, Institution). | Maybe | | Requires further discussion. Benefits of doing this are unclear. Not contractual but desirable for DiSSCo. |
| 2.11 | A user will be able to export a specimen data object from a workflow that can be consumed by their collection management system (NHM and other able task partners will be used as an example of this). | Yes | | A flattened CSV of outputs should be adequate. Requires testing with users and systems. |
| 2.12 | When constructing workflows, the user interface will display data on which fields will be populated and required by each component. | Yes | | Galaxy default? |
| 2.13 | A per-specimen workflow iteration will fail if a required data item isn’t available for a downstream component. If a component is unable to identify data required by a downstream component, individual specimen validation failures will be attached to the specimen object error log. | Yes | | Whole batch should not fail if individual SDOs are not fully valid; check status |
| 2.14 | A specimen object can be submitted with predefined regions of interest (ROI) for the specimen image. | Maybe | | |
| 2.15 | Components may have additional (optional) inputs not specified in the SDO. | Yes | | Must for MVP (allow arguments to blocks, e.g. a specific desired analysis) |
| 2.16 | Validation will occur at point of submission, using I/O data requirements for each component. | Partial | | |
| 2.17 | SDOs can be submitted via web form and API. | Partial | | Need to investigate Galaxy's API submission. |
| 2.18 | Specimen data will be modelled on the openDS standard, with extensions for: ALTO (image ROIs); provenance (to include any mutations performed on the openDS object and labels of accuracy). | Yes\* | | \*Without ALTO; capture minimal run provenance |
| 2.19 | Components with predictive data outputs will return a corresponding confidence score. | Yes | | |
| 3.1 | Each tool should be Dockerised and capable of running as a standalone service. | Yes | | |
| 3.2 | SDR will use the Open Digital Specimen recommendation for image path data (TBC: IIIF). | Maybe | | openDS standard not yet finalised |
| 3.3 | Images will be downloaded once and stored on a staging server. | Maybe | | |
| 3.4 | Image regions of interest (ROIs) will be described using the ALTO standard. | No | | DiSSCo and Teklia did not consider ALTO suitable. |
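
Requirements 2.4 to 2.6, 2.13 and 2.19 can be illustrated together with a small Python sketch. It is not the SDR or Galaxy implementation: the class names (`SpecimenDataObject`, `Component`), the field names (`image_uri`, `label_text`, `label_text_confidence`) and the two placeholder components are all hypothetical, chosen only to show a mutable specimen data object that carries its own provenance and error log, and a batch in which one invalid specimen does not stop the others.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class SpecimenDataObject:
    """Hypothetical stand-in for an SDO: mutable data plus its own logs."""
    data: dict[str, Any]
    provenance: list[dict[str, Any]] = field(default_factory=list)
    errors: list[str] = field(default_factory=list)


@dataclass
class Component:
    """Hypothetical component: declares required inputs, returns field updates."""
    name: str
    requires: tuple[str, ...]
    run: Callable[[dict[str, Any]], dict[str, Any]]


def apply_component(sdo: SpecimenDataObject, component: Component) -> bool:
    """Mutate the SDO in place; log every change, or log why it could not run."""
    missing = [f for f in component.requires if f not in sdo.data]
    if missing:
        sdo.errors.append(f"{component.name}: missing required field(s) {missing}")
        return False
    for key, value in component.run(sdo.data).items():
        sdo.provenance.append(
            {"component": component.name, "field": key,
             "old": sdo.data.get(key), "new": value}
        )
        sdo.data[key] = value
    return True


def run_batch(batch: list[SpecimenDataObject],
              pipeline: list[Component]) -> list[SpecimenDataObject]:
    """Run each SDO through the pipeline; a failure stops only that SDO."""
    for sdo in batch:
        for component in pipeline:
            if not apply_component(sdo, component):
                break
    return batch


# Placeholder components: real ones would analyse the image or call a service.
def read_label(data: dict[str, Any]) -> dict[str, Any]:
    # A predictive output is returned together with a confidence score (2.19).
    return {"label_text": "placeholder text", "label_text_confidence": 0.5}


def normalise_label(data: dict[str, Any]) -> dict[str, Any]:
    return {"label_text": data["label_text"].upper()}


PIPELINE = [
    Component("read_label", ("image_uri",), read_label),
    Component("normalise_label", ("label_text",), normalise_label),
]
```

Running `run_batch` on one SDO with an `image_uri` and one without leaves the first with three provenance entries and no errors, while the second keeps an empty provenance list and a single entry in its error log. The batch as a whole still completes, which is the behaviour asked for in the notes on 2.13.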
