perf: add parallelization #34
@cmdoret , @supermaxiste : the |
- fix: correct all clippy errors
- fix: add CLI argument to set up the thread pool
- fix: CLI handling
- fix: wrong argument
- docs(readme): add copyright notice (#35)
- fix: parallelize also second pass
- fix: correct output in `write` -> `write_all` - `write` may write only part of the buffer and return the number of bytes written, so it is the wrong function.
- fix: remove from first-pass since no benefit
|
@gabyx: This PR is a draft and should be revisited once parallelization is actually needed, i.e. once the pseudonymization function does more work per triple. |
|
We can revisit this idea with a different approach. We can break down tripsu's pipeline as follows:
The `crossbeam` crate can help implement the approach above by multithreading the pseudonymization of every triple that requires it. This approach seems scalable, and it keeps the copy-on-write performance gains by skipping the worker pool when it is not needed 👍 |
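A minimal sketch of the worker-pool idea from the comment above, under stated assumptions: it uses only the standard library (`std::sync::mpsc` plus scoped threads) as a stand-in for `crossbeam`, and `pseudonymize`, `needs_pseudonymization` and `process` are hypothetical names that are not taken from tripsu. Values that need no work stay borrowed (`Cow::Borrowed`) and never reach the pool.

```rust
use std::borrow::Cow;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the real pseudonymization function.
fn pseudonymize(value: &str) -> String {
    // Placeholder transformation, NOT a real hash.
    format!("pseudo:{}", value.len())
}

/// Hypothetical rule: only values starting with "secret:" need work.
fn needs_pseudonymization(value: &str) -> bool {
    value.starts_with("secret:")
}

/// Returns one output per input, in input order.
fn process<'a>(inputs: &'a [&'a str], n_workers: usize) -> Vec<Cow<'a, str>> {
    // Fast path: start with every value borrowed, no copy made.
    let mut out: Vec<Cow<'a, str>> = inputs.iter().map(|s| Cow::Borrowed(*s)).collect();

    let (job_tx, job_rx) = mpsc::channel::<(usize, &str)>();
    let (res_tx, res_rx) = mpsc::channel::<(usize, String)>();
    // std's receiver is single-consumer, so the workers share it behind a Mutex.
    let job_rx = Arc::new(Mutex::new(job_rx));

    thread::scope(|scope| {
        for _ in 0..n_workers.max(1) {
            let job_rx = Arc::clone(&job_rx);
            let res_tx = res_tx.clone();
            scope.spawn(move || loop {
                // The lock is released at the end of this statement.
                let job = job_rx.lock().unwrap().recv();
                match job {
                    Ok((idx, value)) => res_tx.send((idx, pseudonymize(value))).unwrap(),
                    Err(_) => break, // job channel closed: no more work
                }
            });
        }
        drop(res_tx); // only the workers keep result senders alive

        // Only values that require pseudonymization are sent to the pool.
        for (idx, value) in inputs.iter().enumerate() {
            if needs_pseudonymization(value) {
                job_tx.send((idx, *value)).unwrap();
            }
        }
        drop(job_tx); // signals the workers to stop

        // Results arrive in arbitrary order; the index restores input order.
        for (idx, value) in res_rx {
            out[idx] = Cow::Owned(value);
        }
    });
    out
}

fn main() {
    let inputs = ["plain", "secret:alice", "other", "secret:bob"];
    for value in process(&inputs, 3) {
        println!("{value}");
    }
}
```

With `crossbeam`, the shared `Arc<Mutex<Receiver>>` would presumably be replaced by a multi-consumer channel, which is the main convenience the crate adds here.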
Proposed Changes

- Add parallelization via `rayon`, which lets you convert an iterator into a `ParallelIterator` that is parallelized over `rayon`'s thread pool.
  Note: `rio`'s `into_iterator` is awkwardly written; I don't see the intention behind storing a full `Vec<T>`.
- Add a CLI flag `--parallel` which enables parallelization.

Whether the parallelization pays off has yet to be tested. The `Mutex` on the writer is probably not ideal. -> use `slog`'s async writer, which is buffered and is `Send + Sync`...

Types of Changes
What types of changes does your code introduce? Put an `x` in the boxes that apply.
Checklist
Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.
Further Comments
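One possible way around the `Mutex` on the writer mentioned under Proposed Changes, sketched with the standard library only: worker threads send finished lines over a channel to a single thread that owns a buffered writer and calls `write_all`. This is an illustration under assumptions, not the PR's code; it uses neither `rayon` nor `slog`, and `transform` and `write_parallel` are hypothetical names.

```rust
use std::io::{self, BufWriter, Write};
use std::sync::mpsc;
use std::thread;

/// Hypothetical per-line work standing in for the real processing.
fn transform(line: &str) -> String {
    line.to_uppercase()
}

/// Processes `lines` on several threads and writes the results to `sink`.
/// Output order across chunks is NOT guaranteed.
fn write_parallel<W: Write + Send>(lines: &[&str], n_workers: usize, sink: W) -> io::Result<W> {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();

    thread::scope(|scope| {
        // A single thread owns the sink, so no Mutex is needed around it.
        let writer = scope.spawn(move || -> io::Result<W> {
            let mut buf = BufWriter::new(sink);
            for bytes in rx {
                // write_all retries until the whole buffer is written.
                buf.write_all(&bytes)?;
            }
            // Flushes the buffer and hands the sink back.
            buf.into_inner().map_err(|e| e.into_error())
        });

        let n = n_workers.max(1);
        let chunk_size = ((lines.len() + n - 1) / n).max(1);
        for chunk in lines.chunks(chunk_size) {
            let tx = tx.clone();
            scope.spawn(move || {
                for line in chunk {
                    let mut bytes = transform(line).into_bytes();
                    bytes.push(b'\n');
                    // A send error means the writer stopped early; its
                    // io::Error is reported when the writer is joined.
                    let _ = tx.send(bytes);
                }
            });
        }
        drop(tx); // the writer loop ends once all workers are done

        writer.join().expect("writer thread panicked")
    })
}

fn main() -> io::Result<()> {
    let out = write_parallel(&["a", "b", "c", "d", "e"], 2, Vec::new())?;
    io::stdout().write_all(&out)
}
```

The trade-off is that lines from different chunks may interleave in any order, which is only acceptable if the output format does not depend on line order.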