
DRS Write Support #415

@briandoconnor


About

DRS write was a feature highlighted during the Plenary in 2024 as well as the April 2025 Connect meeting. See DRS 1.6 priorities whiteboard.

This also came up during the July 2025 eLwazi GA4GH Hackathon. The primary use case is the eLwazi GIF project, which plans to run multiple imputation nodes in a federated network. Those nodes need to return their output files, and writing them back to a DRS server would be a convenient way to collect results across the network:

[Image: diagram of the federated network]

Having DRS write in 1.6, as an optional extension to DRS, would be really helpful in federated networks like this where analysis happens in a distributed fashion but outputs are consolidated back to a central DRS location.

Feature Branch

See https://github.com/ga4gh/data-repository-service-schemas/tree/feature/issue-415-write-support

For documentation, see docs built from the feature branch: https://ga4gh.github.io/data-repository-service-schemas/preview/feature/issue-415-write-support/docs/

Possible Solutions

All DRS server implementations to date rely on some out-of-band mechanism to create new DRS entries, and there has been no effort so far to standardize how objects are added to DRS servers. Several approaches could be taken. Here is some ideation from Claude Code:

Approach 1: RESTful CRUD Operations

Extend existing endpoints with HTTP methods:

  • POST /objects - Create new DRS object
  • PUT /objects/{object_id} - Update existing DRS object
  • DELETE /objects/{object_id} - Delete DRS object
  • PATCH /objects/{object_id} - Partial updates

Pros: Follows REST conventions, intuitive for developers
Cons: May require significant changes to existing client libraries
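
To make the method-to-operation mapping in Approach 1 concrete, here is a minimal in-memory sketch. `InMemoryDrsStore`, its `handle` method, and the status codes chosen are illustrative assumptions, not part of the DRS spec or the feature branch.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple
import uuid


@dataclass
class InMemoryDrsStore:
    """Toy stand-in for a DRS server's object table (hypothetical)."""
    objects: Dict[str, dict] = field(default_factory=dict)

    def handle(self, method: str, path: str, body: Optional[dict] = None) -> Tuple[int, dict]:
        """Dispatch (method, path) as Approach 1 proposes; returns (status, json)."""
        parts = [p for p in path.split("/") if p]
        if parts[:1] != ["objects"] or len(parts) > 2:
            return 404, {"msg": "unknown path"}
        object_id = parts[1] if len(parts) == 2 else None

        if method == "POST" and object_id is None:
            # Create: the server assigns the id in this sketch.
            new_id = str(uuid.uuid4())
            self.objects[new_id] = {**(body or {}), "id": new_id}
            return 201, self.objects[new_id]
        if object_id is None:
            return 405, {"msg": "method not allowed"}
        if object_id not in self.objects:
            return 404, {"msg": "object not found"}
        if method == "GET":
            return 200, self.objects[object_id]
        if method == "PUT":
            # Full replacement of the stored metadata.
            self.objects[object_id] = {**(body or {}), "id": object_id}
            return 200, self.objects[object_id]
        if method == "PATCH":
            # Partial update; the id itself cannot be changed.
            changes = {k: v for k, v in (body or {}).items() if k != "id"}
            self.objects[object_id].update(changes)
            return 200, self.objects[object_id]
        if method == "DELETE":
            del self.objects[object_id]
            return 204, {}
        return 405, {"msg": "method not allowed"}
```

The distinction between PUT (replace everything) and PATCH (merge fields) is the part most likely to need precise wording in a spec, since clients will depend on which fields survive an update.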

Approach 2: Separate Write API Namespace

Create dedicated write endpoints:

  • POST /write/objects - Create objects
  • PUT /write/objects/{object_id} - Update objects
  • DELETE /write/objects/{object_id} - Delete objects

Pros: Clear separation between read/write operations, easier to implement optional support
Cons: API surface area expansion, potential confusion about which endpoints to use

Approach 3: Multipart Upload with Metadata

Support data + metadata in a single operation:

  • POST /objects with multipart form data containing both file content and DRS metadata
  • Support for resumable uploads for large files
  • Automatic checksum generation and validation

Pros: Atomic operation, handles both data and metadata together
Cons: More complex implementation, may not fit all storage backends
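
As a rough illustration of Approach 3, a client could package the metadata and the file bytes into one multipart/form-data body, computing the checksum before sending. The part names `metadata` and `file` and the helper `build_multipart_body` are assumptions made for this sketch; only the `checksums` entry shape (`type` plus `checksum`) follows the existing DRS object model.

```python
import hashlib
import json
import uuid
from typing import Tuple


def build_multipart_body(metadata: dict, content: bytes, filename: str) -> Tuple[str, bytes]:
    """Return (Content-Type header value, request body) for a single-shot upload."""
    # Compute size and checksum client-side so the server can validate on receipt.
    meta = dict(metadata)
    meta["size"] = len(content)
    meta["checksums"] = [
        {"type": "sha-256", "checksum": hashlib.sha256(content).hexdigest()}
    ]

    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode()
    lines = [
        delimiter,
        b'Content-Disposition: form-data; name="metadata"',  # assumed part name
        b"Content-Type: application/json",
        b"",
        json.dumps(meta).encode(),
        delimiter,
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        content,
        delimiter + b"--",  # closing delimiter
        b"",
    ]
    return f"multipart/form-data; boundary={boundary}", b"\r\n".join(lines)
```

This holds the whole file in memory, which is exactly why the approach would need the resumable-upload support listed above for large files.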

Approach 4: Two-Phase Write Process

Separate metadata creation from data upload:

  1. POST /objects - Create DRS object metadata, returns upload URLs
  2. Client uploads data to returned URLs (could be cloud storage URLs)
  3. POST /objects/{object_id}/finalize - Mark upload complete

Pros: Flexible for different storage backends, supports large file uploads
Cons: More complex client flow, potential for orphaned metadata
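
The three steps above, plus one possible answer to the orphaned-metadata concern, can be sketched as server-side bookkeeping. Everything here is hypothetical: the class names, the `upload_url` field, the placeholder storage host, and the size check used at finalize time.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PendingObject:
    metadata: dict
    upload_url: str
    created_at: float
    finalized: bool = False


@dataclass
class TwoPhaseRegistry:
    """Tracks objects between metadata creation and finalize (illustrative only)."""
    pending: Dict[str, PendingObject] = field(default_factory=dict)

    def create(self, metadata: dict, now: Optional[float] = None) -> dict:
        # Step 1: register metadata and hand back an upload target.
        object_id = str(uuid.uuid4())
        url = f"https://storage.example.org/uploads/{object_id}"  # placeholder URL
        created_at = now if now is not None else time.time()
        self.pending[object_id] = PendingObject(metadata, url, created_at)
        return {"id": object_id, "upload_url": url}

    def finalize(self, object_id: str, uploaded_size: int) -> bool:
        # Step 3: mark complete only if the uploaded size matches the declared size.
        entry = self.pending.get(object_id)
        if entry is None or entry.metadata.get("size") != uploaded_size:
            return False
        entry.finalized = True
        return True

    def reap_orphans(self, max_age_seconds: float, now: Optional[float] = None) -> List[str]:
        # Drop metadata whose upload was never finalized within the allowed window.
        current = now if now is not None else time.time()
        stale = [
            oid for oid, entry in self.pending.items()
            if not entry.finalized and current - entry.created_at > max_age_seconds
        ]
        for oid in stale:
            del self.pending[oid]
        return stale
```

Step 2 (the client uploading bytes to the returned URL) happens outside the registry, which is what lets the storage backend vary independently of the DRS server.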

Approach 5: Bulk Operations Extension

Extend existing bulk operations pattern:

  • POST /bulkobjects/write - Create multiple objects
  • Support for batch metadata + file references
  • Leverage existing bulk patterns in DRS 1.4+

Pros: Consistent with existing bulk operations, efficient for federated scenarios
Cons: May be overkill for single object operations
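
One design question for a bulk write is whether a single bad entry fails the whole batch. The sketch below assumes per-item reporting, so valid entries are still registered. The function name, the required-field list, and the response shape are assumptions for illustration only.

```python
from typing import Dict, List


def bulk_create(store: Dict[str, dict], candidates: List[dict]) -> dict:
    """Register many objects in one call, reporting success or failure per item."""
    created: List[str] = []
    errors: List[dict] = []
    for index, candidate in enumerate(candidates):
        # Minimal validation; a real server would check far more than this.
        missing = [key for key in ("id", "size", "checksums") if key not in candidate]
        if missing:
            errors.append({"index": index, "msg": "missing fields: " + ", ".join(missing)})
        elif candidate["id"] in store:
            errors.append({"index": index, "msg": "id already exists"})
        else:
            store[candidate["id"]] = candidate
            created.append(candidate["id"])
    return {"created": created, "errors": errors}
```

Reporting errors by request index lets a federated node retry only the entries that failed instead of resubmitting the whole batch.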

Cross-Cutting Considerations

Authorization & Permissions:

  • How to handle write permissions differently from read permissions
  • Integration with existing passport/authorization mechanisms
  • Scope-based access control for write operations
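
A scope-based check could be as small as a lookup from HTTP method to required scope. The scope names below (`drs.read`, `drs.write`, `drs.delete`) are invented for this sketch and are not defined by DRS or GA4GH Passports.

```python
from typing import Iterable

# Hypothetical scope names; a real deployment would define its own.
REQUIRED_SCOPE = {
    "GET": "drs.read",
    "POST": "drs.write",
    "PUT": "drs.write",
    "PATCH": "drs.write",
    "DELETE": "drs.delete",
}


def is_allowed(method: str, granted_scopes: Iterable[str]) -> bool:
    """Return True if the caller's scopes cover the requested HTTP method."""
    required = REQUIRED_SCOPE.get(method.upper())
    # Unknown methods are denied rather than silently allowed.
    return required is not None and required in set(granted_scopes)
```

Keeping delete as its own scope is one way to let analysis nodes write results back without being able to remove anything.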

Versioning & Immutability:

  • Whether DRS objects should be immutable (create new versions) vs mutable
  • How to handle updates to existing objects that may be referenced elsewhere
  • Version history and rollback capabilities
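
If objects are treated as immutable, an "update" becomes an append of a new version while earlier versions stay retrievable. The sketch below assumes simple integer version strings and a hypothetical `VersionedStore` class; DRS objects do carry a `version` string, but how it is assigned is left to implementations.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VersionedStore:
    """Append-only history per object id (illustrative only)."""
    history: Dict[str, List[dict]] = field(default_factory=dict)

    def put(self, object_id: str, metadata: dict) -> dict:
        # Never overwrite: each write appends a new record with the next version.
        versions = self.history.setdefault(object_id, [])
        record = {**metadata, "id": object_id, "version": str(len(versions) + 1)}
        versions.append(record)
        return record

    def get(self, object_id: str, version: Optional[str] = None) -> dict:
        # Latest version by default; older versions remain addressable.
        versions = self.history[object_id]
        if version is None:
            return versions[-1]
        return versions[int(version) - 1]
```

Rollback in this model is just another `put` that repeats an older record's metadata, so references to any given version never change meaning.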

Storage Backend Flexibility:

  • Support for direct uploads vs proxy uploads
  • Cloud storage integration (S3, GCS, Azure presigned URLs)
  • On-premise storage considerations

Error Handling & Validation:

  • Checksum validation during upload
  • Rollback mechanisms for failed operations
  • Progress reporting for large uploads
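
Checksum validation during upload can be done while streaming, so the server never needs the whole file in memory. This is a minimal sketch assuming SHA-256 and a file-like input; `receive_upload` and `ChecksumMismatch` are hypothetical names.

```python
import hashlib
from typing import BinaryIO


class ChecksumMismatch(Exception):
    """Raised when the received bytes do not match the declared checksum."""


def receive_upload(stream: BinaryIO, expected_sha256: str, chunk_size: int = 1 << 20) -> int:
    """Consume the stream in chunks, verify its SHA-256, and return the byte count."""
    digest = hashlib.sha256()
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        total += len(chunk)  # running total could also feed progress reporting
    if digest.hexdigest() != expected_sha256.lower():
        # The caller is expected to discard any partially stored data here.
        raise ChecksumMismatch(f"expected {expected_sha256}, got {digest.hexdigest()}")
    return total
```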

Optional Extension Design:

  • How servers advertise write capability support
  • Graceful degradation for read-only implementations
  • Feature discovery mechanisms
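
One option for feature discovery is a flag in the service-info response that clients check before attempting a write. The `drs.writeSupported` field used below is purely an assumption for illustration; the actual field name and location would need to be settled in the spec.

```python
def supports_write(service_info: dict) -> bool:
    """Return True only if the server explicitly advertises write support.

    Assumes a hypothetical boolean at service_info["drs"]["writeSupported"].
    """
    drs_section = service_info.get("drs")
    if not isinstance(drs_section, dict):
        # Read-only or older servers simply omit the section.
        return False
    return drs_section.get("writeSupported") is True
```

Treating a missing flag as "no" gives graceful degradation: clients talking to read-only servers never need to probe a write endpoint to find out it is absent.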
