RFC: Deposit REST API - WIP #22

@lnielsen

Codimd link for RFC

The current Invenio-Deposit module has several design issues:

  • All records are stored and indexed twice:
    • Database: The primary record and deposit record are both stored in the
      records_metadata table.
    • Elasticsearch: Two indexes exist, one for records and one for
      deposits. Almost all records are indexed in both.
  • Records and deposits are mixed in the same database table.
  • Two buckets are used: one for the record and one for the draft. This is due
    to permissions, and to ensure that the preserved files are clearly separate
    from the uploaded files.
  • Unpublished deposits do not expire and stay in the system.
  • Two persistent identifiers exist, recid and depid, each pointing to its
    own record.
  • Double JSONSchemas/mappings: Because of slight differences between records
    and deposits, we need two JSONSchemas, two ES mappings and two marshmallow
    schemas.
  • The programmatic API is easily polluted and becomes hard to maintain and
    to extend with custom use cases.

New design principles:

  • Clear "physical" separation between records and deposits. Records and
    deposits should not be mixed in the same database table. Recovery of database
    tables is significantly easier if records and deposits are not mixed.
  • Work with a single JSONSchema and single ES mapping.
  • A single persistent identifier.
  • Deposits are drafts and disappear from the system after being published.
  • Support file upload via a third-party storage system such as S3.
  • Design for versioning support from the beginning.
  • Decouple workflow from submission.
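
A minimal sketch of how some of these principles could fit together: drafts and records held in separate stores, a single persistent identifier shared by both, and a draft that disappears once it is published. Everything here is hypothetical and for illustration only. `Draft`, `Record`, `DepositService` and the `version` counter are not part of Invenio-Deposit or any existing Invenio API, and the two dictionaries merely stand in for two separate database tables.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Draft:
    """Unpublished, editable metadata (hypothetical model)."""
    pid: str
    metadata: dict


@dataclass
class Record:
    """Published metadata (hypothetical model)."""
    pid: str
    metadata: dict
    version: int = 1  # assumption: a simple counter, not a real versioning scheme


class DepositService:
    """In-memory stand-in for two physically separate tables."""

    def __init__(self) -> None:
        self.drafts: Dict[str, Draft] = {}    # stands in for a drafts table
        self.records: Dict[str, Record] = {}  # stands in for a records table

    def create_draft(self, pid: str, metadata: dict) -> Draft:
        # The draft uses the same identifier the published record will have.
        draft = Draft(pid, dict(metadata))
        self.drafts[pid] = draft
        return draft

    def publish(self, pid: str) -> Record:
        # Remove the draft so it no longer exists after publishing.
        draft = self.drafts.pop(pid)
        previous = self.records.get(pid)
        version = previous.version + 1 if previous else 1
        record = Record(pid, draft.metadata, version)
        self.records[pid] = record
        return record


service = DepositService()
service.create_draft("1234", {"title": "Example"})
record = service.publish("1234")

# Editing a published record means creating a new draft under the same identifier.
service.create_draft("1234", {"title": "Example (revised)"})
revised = service.publish("1234")
```

Keeping the two stores separate means a draft can be dropped or expired without touching published records, which is the recovery property the first principle asks for.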
