Description
Today I re-discovered all the problem sets and complexities that come up when thinking about how and where to multi-thread the client, and I wanted to give this some write-up so that it does not need to be re-discovered again and again and we can concentrate on broader or selective solutions.
Technological Options
There are basically three technological options to build upon: Web Workers are a web standard that has existed for a fairly long time (2009). Node.js does not support Web Workers (Bun e.g. supports a subset); there is a web-worker package which might be able to mitigate this somewhat.
Node.js comes with a Worker Threads API, which is I think somewhat close to, but not compatible with, Web Workers. Bun supports worker threads, browsers do not.
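For illustration, a minimal sketch of the worker_threads pattern, assuming an ESM setup; the worker file name and the trivial doubling task are just placeholders:

```ts
// main.ts: spawn a worker and exchange structured-cloneable messages
import { Worker } from 'worker_threads'

const worker = new Worker('./worker.js', { workerData: { n: 21 } })
worker.on('message', (result) => {
  console.log('result from worker:', result) // 42
})
```

```ts
// worker.js: receive the input via workerData and post a result back
import { parentPort, workerData } from 'worker_threads'

parentPort?.postMessage(workerData.n * 2)
```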
There is a very heavily used (7 million downloads per week) library workerpool, whose feature description sounds very promising (very small, works in Node + browser).
My current tendency would be to go and try with workerpool. As far as I understand it, though, the general design of these libraries plus the structural changes needed for multi-threading are pretty similar for all solutions. So I think it is not too dangerous, or too much of a loss, to choose one solution without being 100% sure and eventually switch later. 90% of the work will be able to stay as it is regardless.
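To give a feel for the workerpool pattern, a minimal sketch; the file name and the heavyTask function are placeholders, not actual client code:

```ts
// main.ts: create a pool backed by a dedicated worker script
import workerpool from 'workerpool'

const pool = workerpool.pool('./worker.js')
const result = await pool.exec('heavyTask', [1_000_000]) // call a method registered in the worker
console.log(result)
await pool.terminate()
```

```ts
// worker.ts: register the offloadable methods with the pool
import workerpool from 'workerpool'

function heavyTask(rounds: number): number {
  let acc = 0
  for (let i = 0; i < rounds; i++) acc = (acc + i) % 0xffffffff
  return acc
}

workerpool.worker({ heavyTask })
```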
(Potential) Technical Challenges / Limitations
Very Limited Data Exchange
There are strong limitations on what type of data can be exchanged between workers/threads: one basically needs to stick to basic/primitive data types (string), some built-in JS low-level types (ArrayBuffer), and objects without functions. Passing functions or objects with functions is not allowed, see the Node.js docs here or the respective Structured Clone Algorithm docs directly (basically everything which can be structured-cloned can be used).
This has strong implications for the design space. Sharing Common? Not allowed. Passing a tx object? Not allowed. Sharing the client config? Not allowed (in its current form).
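A small sketch (using worker_threads postMessage for illustration, worker file name is a placeholder) of what passes the structured-clone boundary and what does not:

```ts
import { Worker } from 'worker_threads'

const worker = new Worker('./worker.js')

// OK: primitives, plain objects/arrays, BigInt, ArrayBuffer/TypedArrays, ...
worker.postMessage({ chain: 'mainnet', number: 19000000n, payload: new Uint8Array(32) })

// Not OK: functions (or objects carrying function-valued properties) throw a
// DataCloneError; class instances like Common would arrive stripped of their
// prototype and methods, so they are effectively not usable either.
// worker.postMessage({ onDone: () => {} }) // -> DataCloneError
```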
Useful Only for CPU-intensive Tasks (not e.g. Networking)
Using threads seems to be useful only for CPU-intensive tasks. I've read this now from several sources; here is one Stack Overflow answer specifically on networking:

This is not a highly upvoted answer, but it is at least marked as correct (I couldn't quickly find a better source on this specific point). So for now I would assume that e.g. taking out our networking layer/devp2p will likely not have the desired effect.
Multi-threaded DB Access
Ok, I also wanted to write something limiting here, but this might be the more positive part: I think I misread this in previous rounds and did not make a proper process/thread distinction (I initially thought concurrent DB access from different workers is not possible). It seems I was wrong and LevelDB e.g. is thread-safe.
Potential Starting Points
Go Smaller
One way to approach this might be to think a lot smaller in a first round and try to micro-threadify very singular tasks where only a small amount of data is passed. The workerpool library has dedicated functionality for dynamically offloading functions, which might be well suited here (see the sketch below).
A particularly good starting point here are likely the cryptographic functions (so: the ones which run long), like signature verification, hashing, KZG, ...
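A sketch of the dynamic offloading mode; the cpuHeavy loop is only a stand-in for something like hashing or signature verification, since a dynamically offloaded function is serialized and therefore has to be self-contained (real crypto functions would rather be registered in a dedicated worker script):

```ts
import workerpool from 'workerpool'

// Pool without a dedicated worker script: functions are offloaded dynamically
const pool = workerpool.pool()

// Must be self-contained (no closures/imports), since it gets serialized and
// re-created inside the worker
function cpuHeavy(rounds: number): number {
  let acc = 0
  for (let i = 0; i < rounds; i++) acc = (acc + i * i) % 0xffffffff
  return acc
}

const result = await pool.exec(cpuHeavy, [10_000_000])
console.log(result)
await pool.terminate()
```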
Extract EVM Execution
While this might be possible, it is a significantly bigger task and needs several preparatory tasks/refactorings to become manageable. Basically the data entanglement between the vmexecution.ts code and the rest of the client must be significantly and sustainably reduced and simplified, likely in several refactoring rounds.
Before we start on this we should likely go through it step by step, write down a separate "sub" issue, check feasibility and how practical this remains, and identify the separate tasks necessary. Otherwise it might very well happen that we run into a dead end at step 9 of 12 and have already wasted a significant amount of resources.
So basically one needs to look closely into VMExecution, see what data needs to be passed in and out, and see if this can practically be changed to simple data not containing any functions.
Some steps which are likely possible:
- Encapsulate the client config data in a dict separate from the broader Config class, and pass only that
- Pass only the "core" Common settings (and maybe also: combine these in Common as a combined dict), like chain, hf, EIP, and use that (see the sketch after this list)
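A rough sketch of what such a plain-data dict could look like; the interface and field names are purely illustrative, not the actual Common/Config APIs:

```ts
import { Worker } from 'worker_threads'

// Hypothetical structured-cloneable "core" settings: no methods, only data
interface CommonData {
  chain: string    // e.g. 'mainnet'
  hardfork: string // e.g. 'cancun'
  eips: number[]   // additionally activated EIPs
}

const commonData: CommonData = { chain: 'mainnet', hardfork: 'cancun', eips: [4844] }

// Such a dict can be handed over to a worker directly
const worker = new Worker('./executionWorker.js', { workerData: { common: commonData } })
```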
Note 1: These kinds of goals are actually also strongly aligned with our tree-shaking goals, so making the libraries lighter in themselves. So if we go (in the extreme case) "pure data" for Common (the thing passed around is basically only the 5 important data points with no functions), the whole thing itself will get a lot lighter, which will help (e.g. performance-wise) in a lot of places and also will help here. The same goes (in the extreme case) e.g. for txs and blocks.
Note 2: Since the EVM execution is such a heavy thing to separate (with potentially very strong effects), it might (will likely) be worth it to add some new additional steps (e.g. re-serialize a block to pass it over to the worker if necessary), since what is saved will very much outweigh the additional costs.
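A sketch of the re-serialization idea from Note 2, assuming the block's serialize() RLP helper; the exact re-parsing constructor on the worker side differs between @ethereumjs/block releases, so it is only hinted at in a comment:

```ts
import { Worker } from 'worker_threads'
import { Block } from '@ethereumjs/block'

// Main thread: RLP bytes are a Uint8Array and therefore structured-cloneable
function sendBlockToWorker(worker: Worker, block: Block) {
  const serialized = block.serialize()
  worker.postMessage({ type: 'executeBlock', block: serialized })
}

// Worker side (in the worker script): re-create the Block from the RLP bytes
// before executing, via the fromRLP-style constructor of the installed version.
```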