Minimum Viable Product Review
Laurence Livermore edited this page Apr 6, 2022 · 3 revisions
| Number | Description | SYNTHESYS+ | DiSSCo | Notes |
|---|---|---|---|---|
| 2.1 | A user can submit an image (or images) for de novo digitisation, or for refinement of data post-initial digitisation. | Yes | ||
| 2.2 | Minimum requirements for specimen submission will depend on the selected workflow components. Minimal data input for de novo digitisation will be: image URI; material type [pinned insect, microscope slide, herbarium sheet]; image license. Strong recommendation: higher taxon [rank TBC]. | Yes | ||
| 2.3 | Additional data can be submitted with the initial object, with fields pre-filled with data that could otherwise be populated by workflow tasks. E.g. 4.4 Geocoding could use geospatial data submitted with the SDO. | Maybe | Separate workflow for populated entries | |
| 2.4 | Each component will input and output the specimen data object. | Yes | ||
| 2.5 | Specimen data object will be mutable. Modifications will be made directly, with changes tracked in the provenance data. | Yes* | *Dependent on core Galaxy team development work. | |
| 2.6 | Each component’s modification to the specimen data object will be logged as part of the specimen data object. | Yes* | *Dependent on core Galaxy team development work. | |
| 2.7 | Each component will be self-descriptive - API returning component description, required input fields and output fields. | Maybe | Considered a "could do" priority | |
| 2.8 | Users may construct arbitrary pipelines from a set of predefined components, which can be chosen from a component registry (Galaxy Toolshed). The minimum set of components is defined by 4) in this document. | Partial | Can contribute to Toolshed; don't worry too much about arbitrary pipelines. | |
| | Multiple specimen objects can be submitted in batches. | Yes | ||
| 2.9 | A user can input other, non-image data (e.g. text) from already digitised specimens for refining. In this use case, the workflow would not use components that require an image URI. | Maybe | Related to 2.3 | |
| 2.10 | SDR output (pre-adapter) must conform to the MIDS specification. If a MIDS required field isn’t populated by a workflow, it becomes a required field at submission (currently: Modified, midsLevel, physicalSpecimenId, Institution). | Maybe | Requires further discussion. Benefits of doing this are unclear. Not contractual but desirable for DiSSCo. | |
| 2.11 | A user will be able to export a specimen data object from a workflow that can be consumed by their collection management system (NHM and other able task partners will be used as an example of this). | Yes | A flattened CSV of outputs should be adequate. Requires testing with users and systems. | |
| 2.12 | When constructing workflows, the user interface will display data on which fields will be populated and required by each component. | Yes | Galaxy default? | |
| 2.13 | A per-specimen workflow iteration will fail if a required data item isn’t available for a downstream component. If a component is unable to identify data required by a downstream component, individual specimen validation failures will be attached to the specimen object error log. | Yes | Whole batch should not fail if individual SDOs are not fully valid - check status | |
| 2.14 | A specimen object can be submitted with predefined regions of interest (ROI) for the specimen image. | Maybe | ||
| 2.15 | Components may have additional (optional) inputs not specified in the SDO. | Yes | Must for MVP (allow arguments to blocks, e.g. a specific desired analysis). | |
| 2.16 | Validation will occur at point of submission, using i/o data requirements for each component. | Partial | ||
| 2.17 | SDOs can be submitted via web form and API. | Partial | Need to investigate Galaxy's API submission. | |
| 2.18 | Specimen data will be modelled on the openDS standard, with extensions for: ALTO (image ROIs); provenance (to include any mutations performed on the openDS object and labels of accuracy). | Yes* | *Without ALTO; capture minimal run provenance. | |
| 2.19 | Components with predictive data outputs will return a corresponding confidence score. | Yes | ||
| 3.1 | Each tool should be Dockerised and capable of running as a standalone service. | Yes | ||
| 3.2 | SDR will use Open Digital Specimen recommendation for image path data (TBC: IIIF). | Maybe | oDS Standard not yet finalised | |
| 3.3 | Images will be downloaded once and stored on a staging server. | Maybe | ||
| 3.4 | Image regions of interest (ROIs) will be described using ALTO standard. | No | DiSSCo and Teklia did not consider ALTO suitable. | |
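
Requirements 2.2, 2.13 and 2.16 could be sketched as follows. This is a minimal illustration only, assuming an SDO is a plain dictionary. The field names (`image_uri`, `material_type`, `image_license`, `error_log`) and the functions `validate_sdo` and `validate_batch` are hypothetical and are not taken from openDS or Galaxy.

```python
# Hypothetical minimal input fields for de novo digitisation (requirement 2.2).
REQUIRED_FIELDS = ("image_uri", "material_type", "image_license")


def validate_sdo(sdo, required=REQUIRED_FIELDS):
    """Return the names of required fields that are missing or empty in one SDO."""
    return [name for name in required if not sdo.get(name)]


def validate_batch(batch, required=REQUIRED_FIELDS):
    """Split a batch into valid and invalid SDOs without failing the whole batch.

    Each invalid SDO is returned as a copy with its validation failures
    appended to its own error log (requirement 2.13); inputs are not modified.
    """
    valid, invalid = [], []
    for sdo in batch:
        missing = validate_sdo(sdo, required)
        if missing:
            errors = list(sdo.get("error_log", []))
            errors.extend(f"missing required field: {name}" for name in missing)
            invalid.append({**sdo, "error_log": errors})
        else:
            valid.append(sdo)
    return valid, invalid


# Example batch: the second SDO lacks an image license.
batch = [
    {"image_uri": "https://example.org/1.jpg",
     "material_type": "herbarium sheet",
     "image_license": "CC0"},
    {"image_uri": "https://example.org/2.jpg",
     "material_type": "microscope slide"},
]
valid, invalid = validate_batch(batch)
```

Here the first SDO would proceed through the workflow, while the second is set aside with an error-log entry rather than causing the whole batch to fail.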
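
Requirements 2.5, 2.6 and 2.19 describe a mutable SDO whose changes are logged within the object itself, with confidence scores attached to predictive outputs. One possible shape for this, again assuming a dictionary-based SDO, is sketched below. The `provenance` key, the entry layout and the `apply_component` helper are assumptions for illustration; the actual mechanism depends on core Galaxy development work, as noted in the table.

```python
from datetime import datetime, timezone


def apply_component(sdo, component, updates, confidence=None):
    """Apply a component's output to the SDO in place and log each change.

    `updates` maps field names to new values. Every changed field is recorded
    in the SDO's own provenance list (requirements 2.5 and 2.6), together with
    an optional confidence score for predictive outputs (requirement 2.19).
    """
    provenance = sdo.setdefault("provenance", [])
    for field, new_value in updates.items():
        provenance.append({
            "component": component,
            "field": field,
            "old": sdo.get(field),
            "new": new_value,
            "confidence": confidence,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
        sdo[field] = new_value
    return sdo


# Example: a hypothetical classifier component fills in the higher taxon.
sdo = {"image_uri": "https://example.org/1.jpg", "higher_taxon": None}
apply_component(sdo, "taxon-classifier", {"higher_taxon": "Lepidoptera"}, confidence=0.93)
```

Because each entry keeps the previous value, a later step could in principle reconstruct earlier states of the object from the log alone.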
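
The note on requirement 2.11 suggests that a flattened CSV of outputs should be adequate for export to a collection management system. A minimal sketch of such a flattening, using only the Python standard library, might look like this. The dotted column naming and the function names `flatten` and `sdos_to_csv` are assumptions, not an agreed export format.

```python
import csv
import io


def flatten(obj, prefix=""):
    """Flatten nested dictionaries into one level with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def sdos_to_csv(sdos):
    """Serialise a list of SDO dictionaries to CSV text, one row per SDO."""
    rows = [flatten(sdo) for sdo in sdos]
    # Union of all columns, in first-seen order, so sparse SDOs still fit.
    columns = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Fields absent from a given SDO are written as empty cells. List values (such as a provenance log) are written using their Python string form in this sketch, so a real export would need an agreed convention for them.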