Remove document from synchronizer when document is removed from repo #423

Open. Wants to merge 6 commits into base: main

@georgewsu (Contributor) commented May 16, 2025

Implement removal of a document from the synchronizer when the document is removed from the repo.

This can happen in two cases:
- when the document is deleted
- when the document is removed from the handle cache

Specifically, the following reference chain must be released so that the memory held by the document can be freed:
- repo references synchronizer: CollectionSynchronizer
- synchronizer has a record from documentId to docSynchronizer
- docSynchronizer has a reference to DocHandle
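
To make the reference chain concrete, here is a minimal sketch. All classes are simplified, hypothetical stand-ins (they are not the actual automerge-repo classes), and the helper `handlesRetainedBySynchronizer` is invented for illustration. It shows why evicting a handle from the cache alone does not release the document: the handle stays reachable through the synchronizer's record.

```typescript
// Hypothetical, simplified stand-ins; not the actual automerge-repo classes.
type DocumentId = string

class DocHandle {
  constructor(readonly documentId: DocumentId) {}
}

class DocSynchronizer {
  // docSynchronizer has a reference to DocHandle
  constructor(readonly handle: DocHandle) {}
}

class CollectionSynchronizer {
  // synchronizer has a record from documentId to docSynchronizer
  docSynchronizers: Record<DocumentId, DocSynchronizer> = {}

  addDocument(handle: DocHandle): void {
    this.docSynchronizers[handle.documentId] = new DocSynchronizer(handle)
  }
}

class Repo {
  handleCache: Record<DocumentId, DocHandle> = {}
  // repo references synchronizer
  synchronizer = new CollectionSynchronizer()

  create(documentId: DocumentId): DocHandle {
    const handle = new DocHandle(documentId)
    this.handleCache[documentId] = handle
    this.synchronizer.addDocument(handle)
    return handle
  }
}

// Lists ids of handles still reachable via repo -> synchronizer -> docSynchronizer.
function handlesRetainedBySynchronizer(repo: Repo): DocumentId[] {
  const record = repo.synchronizer.docSynchronizers
  return Object.keys(record).map(id => record[id].handle.documentId)
}

const repo = new Repo()
repo.create("doc-1")

// Evicting from the handle cache alone leaves the handle reachable.
delete repo.handleCache["doc-1"]
const retainedAfterCacheEviction = handlesRetainedBySynchronizer(repo)

// Dropping the synchronizer's record entry removes that remaining path.
delete repo.synchronizer.docSynchronizers["doc-1"]
const retainedAfterSynchronizerRemoval = handlesRetainedBySynchronizer(repo)
```

In this model, the garbage collector can reclaim the handle only after the second delete, because nothing reachable from the repo points at it any more.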

Issue:
#424

Related issues:
#149
#358
#330

log(`removing document ${documentId}`)
const docSynchronizer = this.docSynchronizers[documentId]
if (docSynchronizer !== undefined) {
  this.peers.forEach(peerId => docSynchronizer.endSync(peerId))
Contributor Author

Call endSync for each peer first so that the docSynchronizer can do any cleanup before it is removed.

Contributor Author

Note that in DocSynchronizer, endSync only removes each peer. This doesn't fully cover the delete protocol as described in issue #149.

However, the deletes below do address the cache eviction and retained memory usage issues:
#330
#358
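
A minimal sketch of the ordering discussed in these comments: end sync with every peer, then drop the record entry. The `Sketch*` classes, `addPeer`, `addDocument`, and `beginSync` are hypothetical stand-ins written for this example, and the sketch assumes the deletes include at least the `docSynchronizers` entry. Any other deletes in the PR are not shown in the excerpt above and are not modeled here.

```typescript
// Hypothetical, simplified stand-ins; not the actual automerge-repo implementation.
type PeerId = string
type DocId = string

class SketchDocSynchronizer {
  peers: PeerId[] = []

  beginSync(peerId: PeerId): void {
    if (this.peers.indexOf(peerId) === -1) this.peers.push(peerId)
  }

  // As noted above, endSync only forgets the peer; it is not a delete protocol.
  endSync(peerId: PeerId): void {
    this.peers = this.peers.filter(p => p !== peerId)
  }
}

class SketchCollectionSynchronizer {
  peers: PeerId[] = []
  docSynchronizers: Record<DocId, SketchDocSynchronizer> = {}

  addPeer(peerId: PeerId): void {
    if (this.peers.indexOf(peerId) === -1) this.peers.push(peerId)
  }

  addDocument(documentId: DocId): void {
    const docSynchronizer = new SketchDocSynchronizer()
    this.peers.forEach(peerId => docSynchronizer.beginSync(peerId))
    this.docSynchronizers[documentId] = docSynchronizer
  }

  removeDocument(documentId: DocId): void {
    const docSynchronizer = this.docSynchronizers[documentId]
    if (docSynchronizer !== undefined) {
      // End sync with every known peer first so per-peer cleanup can run.
      this.peers.forEach(peerId => docSynchronizer.endSync(peerId))
    }
    // Then drop the record entry so the synchronizer no longer references the document.
    delete this.docSynchronizers[documentId]
  }
}

const synchronizer = new SketchCollectionSynchronizer()
synchronizer.addPeer("peer-a")
synchronizer.addPeer("peer-b")
synchronizer.addDocument("doc-1")

const docSync = synchronizer.docSynchronizers["doc-1"]
const peersBeforeRemoval = docSync.peers.length
synchronizer.removeDocument("doc-1")
const peersAfterRemoval = docSync.peers.length
const stillTracked = "doc-1" in synchronizer.docSynchronizers
```

Removing an unknown documentId is a no-op in this sketch, which keeps the call safe to make from both the delete path and the cache-eviction path.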

@georgewsu marked this pull request as ready for review May 16, 2025 17:51
@pvh (Member) commented May 24, 2025

Hi @georgewsu, thanks for the PR! I was on holiday last week and local-first conf is coming up this week, so I'm going to be a bit time-crunched for reviews, but I've seen your patch and am looking forward to reviewing it soon.

@@ -75,4 +75,27 @@ describe("CollectionSynchronizer", () => {

setTimeout(done)
}))

it("removes document", () =>
Member

Thanks for including a test. @georgewsu can you think of a way we can test the memory usage, since that's a key motivation here?

@pvh (Member) commented Jun 1, 2025

Okay, I've given it a review & test and overall it seems good, but I think given that the primary motivation was freeing memory after delete I'd like to at least see if we can make some effort to demonstrate that it has the desired effect. If we can't automate it, @georgewsu, can you at least give me some steps I can run manually to validate that it's working?
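
One possible way to observe the effect manually is to sample the heap across repeated create/remove rounds, sketched below. `process.memoryUsage()` and the `--expose-gc` flag are existing Node.js features. Everything else (`heapUsedMB`, `sampleHeap`, and the fake document store) is hypothetical and only stands in for a real repo. Heap readings are noisy, so this is suited to comparing trends between a run with removal and a run without, not to exact assertions.

```typescript
interface HeapSample {
  iteration: number
  heapMB: number | undefined
}

// Hypothetical helper: JS heap in use (MB) under Node.js, or undefined elsewhere.
function heapUsedMB(): number | undefined {
  const proc = (globalThis as any).process
  if (!proc || typeof proc.memoryUsage !== "function") return undefined
  return proc.memoryUsage().heapUsed / (1024 * 1024)
}

// Requests a collection only if Node was started with --expose-gc; otherwise does nothing.
function collectGarbageIfExposed(): void {
  const gc = (globalThis as any).gc
  if (typeof gc === "function") gc()
}

// Runs `iterations` rounds of createDocument, optionally followed by removeDocument,
// and records the heap after each round.
function sampleHeap(
  iterations: number,
  createDocument: (iteration: number) => string,
  removeDocument?: (documentId: string) => void
): HeapSample[] {
  const samples: HeapSample[] = []
  for (let iteration = 0; iteration < iterations; iteration++) {
    const documentId = createDocument(iteration)
    if (removeDocument !== undefined) removeDocument(documentId)
    collectGarbageIfExposed()
    samples.push({ iteration, heapMB: heapUsedMB() })
  }
  return samples
}

// Stand-in for a repo: a plain record holding a large array per document.
const fakeDocuments: Record<string, number[]> = {}

const createFakeDocument = (iteration: number): string => {
  const documentId = "doc-" + iteration
  const data: number[] = []
  for (let n = 0; n < 100000; n++) data.push(n)
  fakeDocuments[documentId] = data
  return documentId
}

const removeFakeDocument = (documentId: string): void => {
  delete fakeDocuments[documentId]
}

const samplesWithRemoval = sampleHeap(3, createFakeDocument, removeFakeDocument)
const documentsLeftWithRemoval = Object.keys(fakeDocuments).length
const samplesWithoutRemoval = sampleHeap(3, createFakeDocument)
const documentsLeftWithoutRemoval = Object.keys(fakeDocuments).length
```

With a real repo in place of the fake store, the run without removal would be expected to show the heap growing with each iteration, while the run with removal should stay roughly flat if the references are actually released.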

@@ -798,8 +797,8 @@ export class Repo extends EventEmitter<RepoEvents> {
)
}
delete this.#handleCache[documentId]
// TODO: remove document from synchronizer when removeDocument is implemented
// this.synchronizer.removeDocument(documentId)
delete this.#progressCache[documentId]
Contributor Author

Added a call to delete the document's entry from progressCache.

@georgewsu (Contributor Author) commented Jun 3, 2025

> Okay, I've given it a review & test and overall it seems good, but I think given that the primary motivation was freeing memory after delete I'd like to at least see if we can make some effort to demonstrate that it has the desired effect. If we can't automate it, @georgewsu, can you at least give me some steps I can run manually to validate that it's working?

Sounds good. I updated https://github.com/georgewsu/automerge-client-test to reproduce memory usage before/after.
Steps:

  1. In automerge-client-test, update package.json to use a branch of automerge-repo patched with this PR.
  2. To see memory usage without removal from cache, run:
     pnpm client -i 25 -s 1 -r local -c false
  3. To see memory usage with removal from cache, run:
     pnpm client -i 25 -s 1 -r local -c true

Thanks!

@georgewsu (Contributor Author) commented

Confirmed steps to test from scratch:

git clone https://github.com/georgewsu/automerge-repo.git
cd automerge-repo
git checkout synchronizerRemoveDocument
pnpm install
pnpm build
cd ..
git clone https://github.com/georgewsu/automerge-client-test.git
cd automerge-client-test
git checkout local-automerge-repo
pnpm install
pnpm client -i 25 -s 1 -r local -c false
pnpm client -i 25 -s 1 -r local -c true
