
Allow optional step in BackupClientInterface after a time periods substream is complete #22

Open
@mdedetrich

Description


After speaking with @mkoskinen, he made a valid point: there may be a use case where some kind of object storage (or more specifically, an implementation of BackupClientInterface) is so basic/simple that it doesn't support functionality such as resume. A way around this would be to use a BackupClientInterface that does support resuming (even flat file storage) and then, once each substream for that object/key/file is complete, upload it to S3/GCS/whatever as a whole.

This can also solve other problems. For example, we currently don't compress the .json file (using something like gz) while streaming, because we can't find a resume point in a compressed object/key/file. This approach would let you back up the stream to an initial storage and then, after it's finished, compress it and send it to S3/GCS/whatever.
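As a rough sketch of the compress-after-completion idea, assuming the initial storage is a local flat file: once the substream has been fully written, the file can be gzipped in one pass before it is uploaded. The names `CompressCompleted` and `gzip` are hypothetical and only here for illustration; the upload step is left out.

```scala
import java.nio.file.{Files, Path}
import java.util.zip.GZIPOutputStream

object CompressCompleted {

  /** Gzip a fully written backup file into a sibling `<name>.gz` file.
    *
    * Only call this once the substream is complete: the file being
    * compressed is the uncompressed one that resume logic works against.
    */
  def gzip(completed: Path): Path = {
    val target = completed.resolveSibling(completed.getFileName.toString + ".gz")
    val out    = new GZIPOutputStream(Files.newOutputStream(target))
    // Copy the whole uncompressed file through the gzip stream,
    // closing the stream even if the copy fails.
    try Files.copy(completed, out)
    finally out.close()
    target
  }
}
```

Because compression only happens after the file is complete, resuming still operates on the uncompressed data in the initial storage.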

On first impressions, the implementation could be a single method added to BackupClientInterface that returns an Option[Sink]; if it's defined, this sink gets executed after a backup is complete. One consideration is whether the sink should be run asynchronously or synchronously (ideally controlled by a parameter on the method itself), i.e.

def afterBackupSink(async: Boolean): Option[Sink]
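
To make the proposal a bit more concrete, here is a minimal sketch of how such a hook might be wired up. It rests on several assumptions that are not stated in this issue: that `Sink` means the Akka Streams `Sink` (Akka 2.6+), that the sink receives the key of the object that was just completed, and that it materializes to `Future[Done]`. The trait below is a trimmed-down stand-in, not the real BackupClientInterface, and `AfterBackup.run` is a hypothetical helper.

```scala
import akka.Done
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.Future

// Hypothetical, trimmed-down stand-in for the real BackupClientInterface.
trait BackupClientInterface {
  // Assumption: the sink is fed the key of the object that was just completed.
  // Defaulting to None means "no post-processing step" for existing clients.
  def afterBackupSink(async: Boolean): Option[Sink[String, Future[Done]]] = None
}

object AfterBackup {

  /** Run the optional step for a completed key.
    *
    * async = false: the returned Future completes when the sink has finished.
    * async = true:  the sink is started, but the returned Future completes
    *                immediately, so failures in the sink are not observed here.
    */
  def run(client: BackupClientInterface, key: String, async: Boolean)(implicit
      system: ActorSystem
  ): Future[Done] =
    client.afterBackupSink(async) match {
      case None => Future.successful(Done)
      case Some(sink) =>
        val finished = Source.single(key).runWith(sink)
        if (async) Future.successful(Done) else finished
    }
}
```

A client that wants the extra step would override `afterBackupSink`, for example returning `Some(Sink.foreach[String](key => println(s"uploading $key")))`. In the synchronous case the next substream would wait on the returned `Future`; in the asynchronous case error handling would need to live inside the sink itself.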
