Adding a mutex lock to set_range function #4207
Closed
Summary:
Context:
Because we expose KVTensor to external surfaces (i.e., checkpointing), those callers are free to invoke KVTensor functions concurrently.
For example,
https://www.internalfb.com/code/fbsource/[5b7b1eef7d69]/fbcode/aiplatform/modelstore/checkpointing/pyper/TensorLoaderCallback.h?lines=85-86
This function calls set_range on the same KVTensor multiple times, because it divides a large block of data into smaller chunks and tries to write them concurrently. This is bad practice because the SSD I/O path already uses multithreading internally to write data into the KVTensor.
Currently, we use 32 threads (one thread per shard) to write data. As a result, when set_range is called multiple times concurrently, the calls compete for the same write threads, which can lead to thread contention and increased synchronization overhead.
In this Diff:
We introduce a mutex lock on the set_range function. Each call holds the lock for its full duration, so concurrent calls are processed serially, which leads to more efficient use of the write threads.
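As a rough illustration of the locking pattern described above, here is a minimal, self-contained sketch. It is not the code from this diff: `ToyKVTensor`, `write_in_chunks`, and the in-memory `std::vector` backing store are hypothetical stand-ins, and the real set_range writes to SSD through per-shard threads rather than copying into a vector. The sketch only shows how a single mutex serializes whole set_range calls issued by a chunking caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for KVTensor (assumption: not the real class).
class ToyKVTensor {
 public:
  explicit ToyKVTensor(std::size_t size) : data_(size, 0.0f) {}

  // The lock is held for the whole call, so concurrent callers are
  // processed one at a time instead of competing inside the write path.
  void set_range(std::size_t start, const std::vector<float>& values) {
    std::lock_guard<std::mutex> guard(mtx_);
    assert(start + values.size() <= data_.size());
    std::copy(values.begin(), values.end(),
              data_.begin() + static_cast<std::ptrdiff_t>(start));
  }

  // Returns a copy of the stored values, taken under the same lock.
  std::vector<float> snapshot() {
    std::lock_guard<std::mutex> guard(mtx_);
    return data_;
  }

 private:
  std::mutex mtx_;
  std::vector<float> data_;
};

// Mimics an external caller that splits one large write into chunks
// and issues a set_range call per chunk from its own thread.
inline std::vector<float> write_in_chunks(std::size_t total, std::size_t chunk) {
  ToyKVTensor tensor(total);
  std::vector<std::thread> workers;
  for (std::size_t start = 0; start < total; start += chunk) {
    const std::size_t len = std::min(chunk, total - start);
    workers.emplace_back([&tensor, start, len] {
      std::vector<float> values(len);
      for (std::size_t i = 0; i < len; ++i) {
        values[i] = static_cast<float>(start + i);
      }
      tensor.set_range(start, values);
    });
  }
  for (auto& worker : workers) {
    worker.join();
  }
  return tensor.snapshot();
}
```

Calling `write_in_chunks(1000, 64)` spawns one caller thread per chunk; the calls may arrive in any order, but each runs to completion under the lock before the next one starts.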
Differential Revision: D75555658