[FSU] Inference FSU with Shared memory #2969
Conversation
Force-pushed from ec044b7 to d328785
Force-pushed from e41af8c to fb045da
Has the formatting style changed?
I notice some differences despite no code modifications.
```cpp
model_graph.LoadTensors(f);
model_graph.checkLoadComplete(f);
node->forwarding(training);
model_graph.UnloadTensors(f);
```
No need to unload now?
Yes, unload is no longer needed. The unload logic does two things: 1. deallocate, 2. write to file.
- We use shared memory that allocates a fixed amount of memory and reuses it, so the allocated memory is deallocated once, at the end of inference.
- For an inference workload there is no need to write back to the file.

Has the formatting style changed? --> No! Sorry for the confusion; there were some conflicts on my local machine.
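For illustration, a minimal sketch of that lifecycle, assuming a POSIX mmap-backed pool; the class name and structure are hypothetical, not the actual nntrainer implementation:

```cpp
#include <sys/mman.h>
#include <cstddef>

// Hypothetical fixed-size pool: mapped once, reused by every layer,
// and released only when inference finishes (no per-layer unload,
// and nothing is written back to the file).
class SharedMemPool {
public:
  explicit SharedMemPool(std::size_t bytes) : size_(bytes) {
    base_ = mmap(nullptr, size_, PROT_READ | PROT_WRITE,
                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  }
  void *get() const { return base_; } // same region reused each step
  ~SharedMemPool() {
    if (base_ != MAP_FAILED)
      munmap(base_, size_); // single deallocation at end of inference
  }

private:
  void *base_;
  std::size_t size_;
};
```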
Force-pushed from fb045da to ab147da
LGTM
Update FSU forwarding logic - FSU now handles the look-ahead tensors inside the pool, so we no longer need to call LoadTensors for f + i; a rough sketch follows below.
**Self evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <[email protected]>
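A minimal sketch of the control-flow change, with stub types standing in for the real classes; the commented-out look-ahead loop is paraphrased from the commit message, not copied from the diff:

```cpp
#include <iostream>

// Stub stand-ins for the real classes, just to show the control flow.
struct ModelGraph {
  void LoadTensors(unsigned f) { std::cout << "load " << f << '\n'; }
  void checkLoadComplete(unsigned f) { std::cout << "check " << f << '\n'; }
};
struct Node {
  void forwarding(bool) { std::cout << "forward\n"; }
};

int main() {
  ModelGraph model_graph;
  Node node;
  bool training = false;
  for (unsigned f = 0; f < 3; ++f) {
    // Before: forwarding also prefetched the look-ahead tensors, e.g.
    //   for (unsigned i = 1; i <= lookahead; ++i)
    //     model_graph.LoadTensors(f + i);
    // Now the pool handles look-ahead internally, so only f is loaded.
    model_graph.LoadTensors(f);
    model_graph.checkLoadComplete(f);
    node.forwarding(training);
    // No UnloadTensors(f): the shared pool is reused and freed once at
    // the end of inference (see the review thread above).
  }
}
```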
Add memory pointer for allocating shared memory - add mem_ptr - add unmap - add an array to manage unmapped pointers.
**Self evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <[email protected]>
I have changed the method from dynamic memory allocation to static memory allocation. To prevent multiple frees, I added a map that checks whether the mem_address has already been processed (see the sketch below). Previously, memory was allocated through buf, but now it is allocated directly.
**Self evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped
Co-authored-by: jijoong.moon <[email protected]>
Signed-off-by: Donghak PARK <[email protected]>
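A minimal sketch of such a guard, assuming mmap-backed allocations; UnmapGuard and its methods are hypothetical names for illustration:

```cpp
#include <sys/mman.h>
#include <cstddef>
#include <unordered_map>

// Hypothetical guard: remember which mapped addresses are still live so
// that a second release of the same mem_address becomes a no-op.
class UnmapGuard {
public:
  void track(void *ptr, std::size_t len) { mapped_[ptr] = len; }
  void release(void *ptr) {
    auto it = mapped_.find(ptr);
    if (it == mapped_.end())
      return; // already unmapped (or never tracked): avoid a double free
    munmap(it->first, it->second);
    mapped_.erase(it);
  }

private:
  std::unordered_map<void *, std::size_t> mapped_;
};
```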
Make neuralnet able to pass the path to the swap_device and the weight offset (file offset), so the weight file's offsets can be calculated; see the sketch below.
**Self evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped
Co-authored-by: hyeonseok <[email protected]>
Signed-off-by: Donghak PARK <[email protected]>
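A minimal sketch of how such file offsets could be computed; WeightEntry and assignOffsets are hypothetical names, not the PR's actual API:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical layout pass: given each weight tensor's size, compute
// its byte offset within the weight .bin file.
struct WeightEntry {
  std::string name;
  std::size_t bytes;
  std::size_t file_offset; // filled in below
};

void assignOffsets(std::vector<WeightEntry> &weights) {
  std::size_t offset = 0; // running offset into the weight file
  for (auto &w : weights) {
    w.file_offset = offset;
    offset += w.bytes;
  }
}
```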
Apply Shared mem & FSU - when inference mode : read from weight bin ( weight offset ) - when train mode : same logic with swap **Self evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped Signed-off-by: Donghak PARK <[email protected]>
Fix unit test failure bug in the training-case swap - there was an issue in PutBuffer where it could not free the pointer.
**Self evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <[email protected]>
Apply clang-format to the changed files.
**Self evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <[email protected]>
Force-pushed from ab147da to ef465db
Update FSU unit test - For now, we must set our weight & input sizes to pagesize * N - Later I will add a page-align algorithm (a sketch of the alignment follows below).
**Self evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <[email protected]>
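A minimal sketch of the kind of page-align rounding the commit refers to; alignToPage is a hypothetical helper:

```cpp
#include <unistd.h>
#include <cstddef>

// Round a size up to the next multiple of the system page size, so
// that mmap offsets computed from these sizes stay page-aligned.
std::size_t alignToPage(std::size_t bytes) {
  std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  return (bytes + page - 1) / page * page;
}
// e.g. with 4096-byte pages, alignToPage(5000) == 8192
```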
Force-pushed from ef465db to 69b8783
This PR is included in #3004. --> close
[FSU] Inference FSU with Shared memory
To reduce memory usage during inference by utilizing FSU, and to minimize speed degradation by performing loading during forwarding, this PR changes FSU to use shared memory. It also ensures that the existing swap in training mode still works normally.
Commit 1 : [FSU] Update FSU Forwarding (Load) Logic
Commit 2 : [FSU] Update swap device & cache element
Commit 3 : [FSU] Update FSU mem allocate Logic
Commit 4 : [FSU] add FSU file offset info
Commit 5 : [FSU] Apply Shared Mem & FSU
This PR includes #2957, #2927, and #2949, so I will close the previous PRs.