Best practice for using DRS and Data Connect together #394

@briandoconnor

Background

Feature Branch: https://github.com/ga4gh/data-repository-service-schemas/tree/feature/issue-394-drs-plus-connect-docs-v1

I'm opening this issue as a follow-up to the "DRS and Data Connect" session at the April 20th, 2023 GA4GH Connect meeting. The session explored how standards from the Cloud and Discovery work streams can be used together to address two needs:

  • Address the need to obtain additional data about a DRS object
  • Revisit how Data Connect handles the need for bundles

Some resources of interest:

Key Takeaways from GA4GH Connect

Metadata + DRS

We agreed that best practices for working with metadata were important, and largely agreed on two guiding principles:

    1. DRS doesn’t know about metadata, and shouldn’t. Instead, we should lean into the fact that systems that use DRS typically have some database-like component that does know about object metadata.
    2. No new APIs (or API changes for DRS) are needed. Instead, we should add an appendix to the DRS spec documenting best practices for building systems that use DRS and care about metadata.
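
As a rough illustration of the first principle, the sketch below assumes a metadata service that exposes a Data Connect `POST /search` endpoint and keeps a `drs_id` column as the join key to DRS. The table name `file_metadata`, its columns, and both base URLs are hypothetical; only the `/search` request shape and the `/ga4gh/drs/v1/objects/{object_id}` path come from the Data Connect and DRS specs, and support for positional `parameters` is assumed. It makes no network calls: it only builds the search request and turns one result page into DRS object URLs.

```python
import json
from urllib.parse import quote

# Hypothetical table name; a real deployment defines its own schema.
METADATA_TABLE = "file_metadata"


def build_search_request(data_connect_base, sample_id):
    """Build the URL and JSON body for a Data Connect POST /search call."""
    body = {
        "query": f"SELECT drs_id, file_type FROM {METADATA_TABLE} WHERE sample_id = ?",
        # Positional parameter; assumes the server supports "parameters".
        "parameters": [sample_id],
    }
    return data_connect_base.rstrip("/") + "/search", json.dumps(body)


def drs_object_urls(drs_base, result_page):
    """Map the rows of one Data Connect result page to DRS object URLs."""
    urls = []
    for row in result_page.get("data", []):
        object_id = quote(row["drs_id"], safe="")
        urls.append(f"{drs_base.rstrip('/')}/ga4gh/drs/v1/objects/{object_id}")
    return urls


# Example result page in the Data Connect shape ("data" rows plus "pagination").
page = {
    "data": [
        {"drs_id": "abc123", "file_type": "bam"},
        {"drs_id": "def456", "file_type": "vcf"},
    ],
    "pagination": {"next_page_url": None},
}

url, body = build_search_request("https://metadata.example.org", "sample-1")
print(url)
print(drs_object_urls("https://drs.example.org", page))
```

In this arrangement DRS itself stays metadata-free: the database-like component answers the metadata query, and the client then resolves each returned ID through the unchanged DRS API.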

Compound Objects

We agreed with the way the DRS 1.3.0 develop branch frames the need for compound object support:

  • Some content (e.g. DICOM images) is best represented as a compound object consisting of a structured collection of atomic DrsObjects.
  • Each compound object should have a DRS ID that clients can use to retrieve the object structure and its constituent atomic objects.

We discussed two possible ways to represent and retrieve compound object contents, but didn’t have time to discuss their tradeoffs:

    1. The approach documented in the develop branch (Best Practice: Manifests), where the compound object’s DRS ID provides access to a manifest file listing the object contents. The manifest format is datatype-specific and outside the scope of the DRS spec (but could, for example, be a JSON file).
    2. An alternate approach where the compound object’s DRS ID provides access to a Data Connect table describing the object contents. The table format is datatype-specific and outside the scope of the DRS spec.
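
To make the two options concrete, here is a minimal sketch of how a client might read the same compound object listing either way. Because both the manifest format and the table format are outside the scope of the DRS spec, the JSON layout (`contents`, `path`, `drs_uri`), the column names, and the example DRS URIs below are all invented for illustration; only the `data` key of the result page follows the Data Connect response shape.

```python
import json


def contents_from_manifest(manifest_text):
    """Option 1: parse a manifest file fetched via the compound object's DRS ID."""
    manifest = json.loads(manifest_text)
    return [(entry["path"], entry["drs_uri"]) for entry in manifest["contents"]]


def contents_from_table_page(result_page):
    """Option 2: read the same listing from one page of a Data Connect table."""
    return [(row["path"], row["drs_uri"]) for row in result_page["data"]]


# Hypothetical JSON manifest for a two-image DICOM series.
manifest_text = json.dumps({
    "contents": [
        {"path": "series1/image1.dcm", "drs_uri": "drs://drs.example.org/obj-1"},
        {"path": "series1/image2.dcm", "drs_uri": "drs://drs.example.org/obj-2"},
    ]
})

# The same listing as one page of Data Connect rows.
table_page = {
    "data": [
        {"path": "series1/image1.dcm", "drs_uri": "drs://drs.example.org/obj-1"},
        {"path": "series1/image2.dcm", "drs_uri": "drs://drs.example.org/obj-2"},
    ]
}

from_manifest = contents_from_manifest(manifest_text)
from_table = contents_from_table_page(table_page)
print(from_manifest == from_table)
```

Either way the client ends up with a list of atomic DRS objects to resolve. One difference the sketch hints at is that a table is read page by page and can be queried, whereas a manifest is fetched and parsed as a whole.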

Goal for this Issue

This issue gives us a place to discuss using Data Connect and DRS together, and to link PRs to. The immediate goal is a corresponding PR that documents the best practice of using Data Connect with DRS to provide 1) more metadata about DRS objects and 2) a scalable alternative to bundles. The intention is a documentation-only change: a best-practice appendix to the DRS spec.
