Description
Hello again :)
I wanted to post some performance findings:
I have a roughly 10MB JSON file I'm parsing and creating a new automerge document from. Here's the size of the automerge document on disk:
- Using next's `Automerge.from`: 1.6MB
- Using next + `Automerge.RawString` for all strings: 744KB
- Using stable `Automerge.from` (which, from my understanding, uses `Automerge.Text` for strings): 744KB
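To make the `RawString` comparison concrete, here's a minimal sketch of the transform I mean: walking a parsed JSON value and wrapping every string leaf before handing it to `Automerge.from`. The `RawString` class below is a stand-in for `Automerge.RawString` from `@automerge/automerge` so the snippet is self-contained; swap in the real class when using the library.

```javascript
// Stand-in for Automerge.RawString from "@automerge/automerge" (assumption:
// replace with the real class in actual use).
class RawString {
  constructor(val) { this.val = val; }
}

// Recursively wrap every string leaf so Automerge stores it as a raw,
// unmergeable string rather than a (heavier) mergeable Text object.
function wrapStrings(value) {
  if (typeof value === "string") return new RawString(value);
  if (Array.isArray(value)) return value.map(wrapStrings);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, wrapStrings(v)])
    );
  }
  return value; // numbers, booleans, null pass through unchanged
}

// Usage with the real library would look like:
//   const doc = Automerge.from(wrapStrings(JSON.parse(bigJsonText)));
const wrapped = wrapStrings({ title: "hello", tags: ["a", "b"], n: 1 });
console.log(wrapped.title instanceof RawString); // true
```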
This is quite impressive!
Now, on to automerge-repo. Along with @georgewsu, I've been trying to stress test automerge-repo via the sync server.
What @georgewsu and I are trying to measure is how long it takes the sync server to receive such a large document, and how long it takes another client to fetch it immediately afterward. I've created a to reproduce the findings. I've shared an example JSON payload with @alexjg, but I can't post it publicly; it's a ton of nested maps containing a fair number of strings. If there's anything I can do to improve the signal we're getting from these tests, or if I've made any mistakes that would lead to abnormally poor performance, I'd love to hear suggestions. I'd also be happy to contribute these kinds of tests/benchmarks back in a structured fashion so that folks can improve on existing limitations.
That said, with the payload used, the sync server took roughly 46 seconds to receive the document. While receiving it, the server blocked the main thread, severely limiting concurrency (other clients hit socket timeouts). I pinpointed the bottleneck to the call to `A.receiveSyncMessage` in the doc synchronizer. A deeper analysis shows calls into wasm; notably, calls to `applyPatches` comprise most of the time spent. Here's a CPU profile you can load into Chrome dev tools if you'd like to look at my findings: CPU-20240411T110711.cpuprofile
The fact that the main thread is blocked is problematic for concurrency. I'm just brainstorming here, but would it make sense to move that call to `A.receiveSyncMessage` into a separate thread? I know automerge-repo is intended to work in both the browser and Node, and that isomorphic threading can be a pain (web workers vs. worker threads). The overhead of copying data to/from the worker without a SharedArrayBuffer might also eat into potential gains. Still, the benefit would be that, as long as not all CPU cores are busy receiving huge docs (which I imagine is an infrequent case), a few huge docs could be received while other cores happily synchronize smaller edits.