Problem: Enduro database must be persisted forever for long term access #220

@joel-simpson

Description

Context: In the NHA implementation, the Access system uses the Enduro API to retrieve AIPs. This is because there are multiple pipelines; rather than have the Access system register each pipeline and track which pipeline created an AIP, we rely on Enduro (which already does this anyway).

This means that if the Enduro database is ever cleared out or lost, the Access system will no longer be able to retrieve AIPs. This is mitigated by taking backups of Enduro, but NHA would prefer a better solution for long-term preservation: the capability to re-create (or re-index) the Enduro database from the Storage service.

We already have similar scripts that allow the Storage service to be rebuilt / re-indexed from the actual AIPs, so this suggestion is basically the same idea but for Enduro.
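To make the idea concrete, here is a rough sketch of the re-index step, not a proposed implementation. It assumes the package records have already been fetched from the Storage service as dictionaries, and that they carry fields named `uuid`, `package_type`, `status`, `origin_pipeline` and `current_path`; those field names, the `AipIndexEntry` type and the `build_index` helper are all assumptions for illustration and would need checking against the real Storage service and Enduro schemas.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class AipIndexEntry:
    """Hypothetical minimal record Enduro would need to locate an AIP."""
    aip_uuid: str
    pipeline_uuid: Optional[str]
    current_path: Optional[str]


def pipeline_uuid_from_uri(uri: Optional[str]) -> Optional[str]:
    # Assumes the pipeline is given as a resource URI ending in its UUID,
    # e.g. "/api/v2/pipeline/<uuid>/". Returns None if no URI is present.
    if not uri:
        return None
    parts = [p for p in uri.split("/") if p]
    return parts[-1] if parts else None


def build_index(packages: Iterable[dict]) -> List[AipIndexEntry]:
    """Rebuild a minimal AIP index from Storage service package records."""
    entries = {}
    for pkg in packages:
        # Only stored AIPs are relevant for access; skip DIPs, deleted
        # packages, and so on (the status/type values are assumptions).
        if pkg.get("package_type") != "AIP" or pkg.get("status") != "UPLOADED":
            continue
        uuid = pkg["uuid"]
        # Keying by UUID makes the re-index safe to run more than once.
        entries[uuid] = AipIndexEntry(
            aip_uuid=uuid,
            pipeline_uuid=pipeline_uuid_from_uri(pkg.get("origin_pipeline")),
            current_path=pkg.get("current_path"),
        )
    return sorted(entries.values(), key=lambda e: e.aip_uuid)
```

The resulting entries would then be written into the Enduro database; how that write happens depends on what Enduro actually needs to store, which is the open question in the next paragraph.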

There may be a maintenance benefit if we also look at what data Enduro actually needs to retain to maintain access to AIPs. Right now, there is no way to clear out the Enduro database because access to AIPs is needed forever. Ideally we'd be able to retain only the core information about AIPs needed to support access, and the script or tool mentioned above could be used to re-index only that essential data. We could then run regular maintenance on Enduro to clean out (most of) the data in its database, which will be important as the db grows in size over time.
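As a sketch of what that maintenance could look like, the example below clears a bulky non-essential column on old rows while leaving the fields needed for access untouched. It uses an in-memory SQLite database purely as a stand-in; the `aip_record` table and its `aip_uuid`, `pipeline_uuid`, `completed_at` and `workflow_log` columns are hypothetical and do not reflect the real Enduro schema.

```python
import sqlite3


def create_demo_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical table: two essential columns plus one bulky,
    # non-essential column that maintenance is allowed to clear.
    conn.execute(
        """
        CREATE TABLE aip_record (
            aip_uuid      TEXT PRIMARY KEY,
            pipeline_uuid TEXT NOT NULL,
            completed_at  TEXT NOT NULL,  -- ISO 8601 date, e.g. 2021-03-01
            workflow_log  TEXT            -- non-essential detail
        )
        """
    )


def prune_non_essential(conn: sqlite3.Connection, cutoff_iso: str) -> int:
    """Clear non-essential data on rows completed before cutoff_iso.

    Rows are kept, so AIP lookups by UUID still work afterwards.
    Returns the number of rows that were actually changed.
    """
    cur = conn.execute(
        "UPDATE aip_record SET workflow_log = NULL "
        "WHERE completed_at < ? AND workflow_log IS NOT NULL",
        (cutoff_iso,),
    )
    conn.commit()
    return cur.rowcount
```

Because the rows themselves are never deleted, the Access system could keep resolving AIPs while the database stays small; which columns count as essential is exactly what would need to be decided first.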

(Note: this script could be a separate standalone tool.)

Metadata

Labels

Severity: low. An inconvenient situation where the software is usable but could be better.
