NodeFSStorageAdapter does not preload any document #397

Open
@Daru13

Description

I am developing a server that stores the current state of multiple repositories on disk using NodeFSStorageAdapter and exposes their content, i.e., the documents they contain, to its clients. This works well as long as the documents are created while the server is running. However, whenever I stop the server and start it again, it seems to be unaware of the locally available content of the repositories: no existing document is (pre)loaded by NodeFSStorageAdapter unless I manually request it using Repo.find. This, in turn, requires knowing the IDs of all the documents of the repository that exist locally. Yet I did not find any good way to get that information using the current APIs of automerge-repo and the storage adapter I'm using.

My current workaround is to make some assumptions on how NodeFSStorageAdapter stores the data on the disk and to go through the two-level hierarchy of directories it uses to encode the document IDs immediately after creating the repository:

import type * as A from "@automerge/automerge-repo";
import * as fs from "node:fs";

function loadAutomergeDocumentsFromFiles(
    repo: A.Repo,
    pathToStorageDirectory: string
): void {
    // To retrieve all the documents IDs, we assume that NodeFSStorageAdapter
    // stores them in the given root directory under the following structure:
    // 
    // <root directory>
    //   └ <first two characters of the ID>         (1st level directories below)
    //       └ <remaining characters of the ID>     (2nd level directories below)
    //           └ ...
    // 
    // Moreover, all repositories written on disk with this utility seem to
    // store an entry called "storage-adapter-id" (under st/orage-adapter-id),
    // which must therefore NOT be treated as a document ID.

    const automergeDocumentIds = new Set<string>();

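    // Only directories can encode the first part of a document ID, so stray
    // files in the root directory are skipped (readdirSync would throw on them).
    const firstLevelDirectoryNames = fs
        .readdirSync(pathToStorageDirectory, { withFileTypes: true })
        .filter((entry) => entry.isDirectory())
        .map((entry) => entry.name);
    for (const firstLevelDirectoryName of firstLevelDirectoryNames) {
        const secondLevelDirectoryNames = fs.readdirSync(`${pathToStorageDirectory}/${firstLevelDirectoryName}`);
        for (const secondLevelDirectoryName of secondLevelDirectoryNames) {
            automergeDocumentIds.add(`${firstLevelDirectoryName}${secondLevelDirectoryName}`);
        }
    }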

    automergeDocumentIds.delete("storage-adapter-id");

    for (const id of automergeDocumentIds) {
        try {
            repo.find(id as A.DocumentId);
        }
        catch (error) {
            // Log the underlying error too, so failed preloads can be diagnosed.
            console.warn(`Error while preloading Automerge document ${id}.`, error);
        }
    }
}

...but it feels a bit hacky, since this code is relying on implementation details.

It would be great to have an "official" way to automatically preload all the documents of a repository that are locally available, or, at the very least, to get the list of IDs of all the documents that are stored on the disk.

For example, I imagine that it could take the form of a flag passed to the constructor of NodeFSStorageAdapter, and/or of a new (possibly optional) method to preload documents in the StorageAdapterInterface interface (which may, in turn, be exposed as a flag when creating a repository, regardless of the storage adapter that is actually being used).
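
To make the second option more concrete, here is a rough sketch of what an optional listing method and a generic preloading helper might look like. Everything in it is hypothetical and only meant as a basis for discussion: `DocumentListingAdapter`, `listDocumentIds`, `RepoLike`, and `preloadLocalDocuments` are made-up names that do not exist in automerge-repo, and `RepoLike` is a minimal stand-in for the only part of `Repo` the helper needs.

```typescript
// Hypothetical sketch: none of these names exist in automerge-repo today.

/** Minimal stand-in for the part of Repo that this sketch relies on. */
interface RepoLike {
    find(id: string): unknown;
}

/** Hypothetical optional extension that a storage adapter could implement. */
interface DocumentListingAdapter {
    /** Returns the IDs of all the documents available in local storage. */
    listDocumentIds?(): Promise<string[]>;
}

/**
 * Requests every locally stored document from the repository.
 * Resolves to the IDs that were requested without a synchronous error, or to
 * an empty list if the adapter does not implement the optional method.
 */
async function preloadLocalDocuments(
    repo: RepoLike,
    adapter: DocumentListingAdapter
): Promise<string[]> {
    if (!adapter.listDocumentIds) {
        return [];
    }

    const ids = await adapter.listDocumentIds();
    const requestedIds: string[] = [];

    for (const id of ids) {
        try {
            repo.find(id);
            requestedIds.push(id);
        } catch (error) {
            console.warn(`Error while preloading document ${id}.`, error);
        }
    }

    return requestedIds;
}
```

Keeping the method optional would let existing adapters stay valid, and a flag on the repository could then simply call such a helper when the adapter supports it.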

Has anyone else faced a similar problem/found a better solution? If we reach an agreement on a solution, I can look into it and create a PR :).
