
Conversation

@DonghakPark
Member

[FSU] Inference FSU with Shared memory

To reduce memory usage during inference by utilizing FSU, and to minimize speed degradation by performing loading during forwarding, this PR changes the implementation to use shared memory. It also ensures that the existing swap in training mode still works as before.
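
For orientation, here is a minimal sketch of the inference flow described above: weights for upcoming layers are loaded into a fixed, reused shared-memory region while the current layer runs. All names (`FsuPool`, `load`, `wait_load`) are hypothetical illustrations, not nntrainer's actual API.

```cpp
// Minimal sketch of FSU-style look-ahead loading during inference.
// FsuPool, load, and wait_load are hypothetical names.
#include <cstddef>
#include <iostream>
#include <vector>

struct Layer {
  void forward() { /* compute on weights already resident in shared memory */ }
};

struct FsuPool {
  explicit FsuPool(std::size_t lookahead) : lookahead(lookahead) {}
  // Kick off loads for layers [f, f + lookahead] into the shared pool.
  void load(std::size_t f, std::size_t total) {
    for (std::size_t i = f; i <= f + lookahead && i < total; ++i)
      std::cout << "loading weights of layer " << i << "\n";
  }
  // Block until layer f's weights are fully resident.
  void wait_load(std::size_t f) { std::cout << "layer " << f << " ready\n"; }
  std::size_t lookahead;
};

int main() {
  std::vector<Layer> layers(4);
  FsuPool pool(/*lookahead=*/2);
  for (std::size_t f = 0; f < layers.size(); ++f) {
    pool.load(f, layers.size()); // a real pool would skip already-loaded layers
    pool.wait_load(f);           // checkLoadComplete equivalent
    layers[f].forward();
    // No unload here: the fixed shared-memory region is reused across
    // layers and released once, at the end of inference.
  }
}
```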

Commit 1 : [FSU] Update FSU Forwarding (Load) Logic

  • Change FSU forwarding logic (load weights with look-ahead)

Commit 2 : [FSU] Update swap device & cache element

  • Update Swap Device's function to Support FSU (Inference)

Commit 3 : [FSU] Update FSU mem allocate Logic

  • Update Memory Allocation to Shared Mem

Commit 4 : [FSU] add FSU file offset info

  • Add weight bin file offset that can be passed to the swap device

Commit 5 : [FSU] Apply Shared Mem & FSU

  • Update Logic to support both Inference Mode & Training Mode

This PR includes #2957, #2927, and #2949, so I will close those previous PRs.

Member

@SeoHyungjun SeoHyungjun left a comment


Has the formatting style changed?
I notice some differences despite no code modifications.

```cpp
model_graph.LoadTensors(f);
model_graph.checkLoadComplete(f);
node->forwarding(training);
model_graph.UnloadTensors(f);
```

No need to unload now?

Member Author


Yes, unload is no longer needed. The unload logic does two things: 1. deallocation, 2. writing back to file.

  1. We use shared memory that allocates a fixed amount of memory and reuses it, so the allocated memory is deallocated once, at the end of inference.
  2. For an inference workload, there is no need to write back to the file.
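
To make the two points above concrete, a hedged sketch (hypothetical names, not the actual nntrainer code) of why unload can be a no-op in inference:

```cpp
// Hypothetical sketch: why UnloadTensors can be a no-op for inference.
#include <cstdio>

enum class ExecMode { TRAIN, INFERENCE };

struct TensorPool {
  ExecMode mode;
  void unload(int f) {
    if (mode == ExecMode::INFERENCE) {
      // 1. No dealloc: the fixed shared-memory region is reused across
      //    layers and freed once, at the end of inference.
      // 2. No write-back: inference never modifies weights, so there is
      //    nothing to flush to the swap file.
      return;
    }
    // Training keeps the original swap behavior: write tensors back to
    // the swap file, then release the memory.
    std::printf("write back + dealloc for step %d\n", f);
  }
};

int main() {
  TensorPool pool{ExecMode::INFERENCE};
  pool.unload(0); // no-op in inference mode
}
```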

@DonghakPark
Member Author

> Has the formatting style changed? I notice some differences despite no code modifications.

Has the formatting style changed? --> No! Sorry for the confusion; there were some conflicts on my local machine.
I will fix the formatting issues.

@DonghakPark DonghakPark force-pushed the FSU_with_shared_Mem branch from fb045da to ab147da on March 4, 2025 05:22
@DonghakPark DonghakPark self-assigned this Mar 4, 2025
Collaborator

@dkjung dkjung left a comment


LGTM


DonghakPark and others added 7 commits March 11, 2025 10:09
Update FSU forwarding logic
- FSU now handles the look-ahead tensors inside the pool
- so we no longer need to call LoadTensors for f + i

**Self evaluation:**
1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
2. Run test:	 [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <[email protected]>
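A rough before/after sketch of the caller's loop this commit describes; the shapes and names are illustrative, not the real nntrainer signatures:

```cpp
// Illustrative before/after of the caller's forwarding loop.
#include <cstddef>

struct Graph {
  // After this commit the pool schedules f .. f + lookahead internally,
  // so callers pass only the current step f.
  void LoadTensors(std::size_t /*f*/) {}
  void checkLoadComplete(std::size_t /*f*/) {}
  void forward(std::size_t /*f*/) {}
};

int main() {
  Graph g;
  const std::size_t steps = 4;
  for (std::size_t f = 0; f < steps; ++f) {
    // Before: callers drove the look-ahead themselves, e.g.
    //   for (i = 0; i <= lookahead; ++i) g.LoadTensors(f + i);
    g.LoadTensors(f); // after: one call; look-ahead lives in the pool
    g.checkLoadComplete(f);
    g.forward(f);
  }
}
```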
Add memory pointer for allocating shared memory
- add mem_ptr
- add an unmap array to manage unmapped pointers

**Self evaluation:**
1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
2. Run test:	 [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <[email protected]>
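A minimal sketch of the bookkeeping this commit describes, assuming POSIX `mmap`/`munmap`: every mapped pointer is recorded in an unmap list so each region is released exactly once. `SharedMemPool` and its members are hypothetical names:

```cpp
// Sketch of the mem_ptr / unmap bookkeeping, assuming POSIX mmap.
#include <sys/mman.h>
#include <cstddef>
#include <utility>
#include <vector>

class SharedMemPool {
public:
  void *map(std::size_t len) {
    void *p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
      return nullptr;
    unmap_list.emplace_back(p, len); // remember the region for release
    return p;
  }
  ~SharedMemPool() {
    for (auto &[ptr, len] : unmap_list)
      munmap(ptr, len); // each mapped region is released exactly once
  }

private:
  std::vector<std::pair<void *, std::size_t>> unmap_list;
};

int main() {
  SharedMemPool pool;
  void *w = pool.map(4096); // one page for a weight slot
  (void)w;
} // destructor unmaps everything in unmap_list
```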
I have changed the method from dynamic memory allocation to static memory allocation.
To prevent multiple frees, I added a map that checks whether the mem_address has already been processed. Previously, memory was allocated through buf, but now it is allocated directly.

**Self evaluation:**
1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
2. Run test:	 [X]Passed [ ]Failed [ ]Skipped

Co-authored-by: jijoong.moon <[email protected]>
Signed-off-by: Donghak PARK <[email protected]>
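A small sketch of the double-free guard described above: a map records whether an address has already been handled, so a second release is a no-op. Names are illustrative, not nntrainer's:

```cpp
// Sketch of a map guarding against multiple frees of one address.
#include <cstdlib>
#include <unordered_map>

class FreeGuard {
public:
  void track(void *p) { freed[p] = false; }
  void release(void *p) {
    auto it = freed.find(p);
    if (it == freed.end() || it->second)
      return;          // unknown or already freed: skip
    std::free(p);
    it->second = true; // mark so a second release() is a no-op
  }

private:
  std::unordered_map<void *, bool> freed;
};

int main() {
  FreeGuard guard;
  void *p = std::malloc(64);
  guard.track(p);
  guard.release(p); // frees the block
  guard.release(p); // safe no-op instead of a double free
}
```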
Make neuralnet pass the path and the weight offset (file offset) to the swap_device,
so that the weight file's offset can be calculated.

**Self evaluation:**
1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
2. Run test:	 [X]Passed [ ]Failed [ ]Skipped

Co-authored-by: hyeonseok <[email protected]>
Signed-off-by: Donghak PARK <[email protected]>
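An illustrative sketch of deriving per-weight offsets into the weight .bin file for the swap device; the cumulative, back-to-back layout here is an assumption, not the verified file format:

```cpp
// Illustrative cumulative offsets into a weight .bin file.
#include <cstddef>
#include <iostream>
#include <vector>

int main() {
  // Byte sizes of each weight tensor, in file order (example values).
  std::vector<std::size_t> weight_bytes = {4096, 8192, 4096};
  std::vector<std::size_t> offsets;
  std::size_t off = 0;
  for (std::size_t sz : weight_bytes) {
    offsets.push_back(off); // where this tensor starts in the .bin file
    off += sz;              // the next tensor begins right after it
  }
  for (std::size_t i = 0; i < offsets.size(); ++i)
    std::cout << "weight " << i << " @ offset " << offsets[i] << "\n";
}
```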
Apply Shared mem & FSU
- in inference mode: read from the weight bin (weight offset)
- in training mode: same logic as swap

**Self evaluation:**
1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
2. Run test:	 [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <[email protected]>
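A hedged sketch of the mode split this commit describes: inference maps weights read-only from the weight .bin at a known (page-aligned) offset, while training keeps the existing swap path. `get_weights`, `ExecMode`, and the file name are hypothetical:

```cpp
// Hypothetical sketch of the inference/training read-path split.
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>
#include <cstddef>

enum class ExecMode { TRAIN, INFERENCE };

void *get_weights(ExecMode mode, const char *weight_bin, off_t offset,
                  std::size_t len) {
  if (mode == ExecMode::INFERENCE) {
    // Read-only mapping of pretrained weights at their file offset.
    // Note: mmap requires the offset to be page-aligned.
    int fd = open(weight_bin, O_RDONLY);
    if (fd < 0)
      return nullptr;
    void *p = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, offset);
    close(fd); // the mapping stays valid after closing the fd
    return p == MAP_FAILED ? nullptr : p;
  }
  // TRAIN: the pre-existing swap-device logic applies (unchanged by
  // this commit; omitted here).
  return nullptr;
}

int main() {
  void *w = get_weights(ExecMode::INFERENCE, "model.bin", 0, 4096);
  if (w != nullptr)
    munmap(w, 4096);
}
```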
Fix unit test failure in the training-case swap
- There was an issue in PutBuffer that prevented freeing the pointer

**Self evaluation:**
1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
2. Run test:	 [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <[email protected]>
Apply clang-format to changed files

**Self evaluation:**
1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
2. Run test:	 [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <[email protected]>
@DonghakPark DonghakPark force-pushed the FSU_with_shared_Mem branch from ab147da to ef465db on March 11, 2025 01:10
Update FSU unit test
- For now, we must set our weight & input sizes to pagesize * N
- Later, I will add a page-alignment algorithm

**Self evaluation:**
1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
2. Run test:	 [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <[email protected]>
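A small sketch of the pagesize * N constraint mentioned above: round a byte size up to a whole number of pages (POSIX `sysconf`); the requirement stems from `mmap`'s page-aligned offsets:

```cpp
// Round a tensor's byte size up to a whole number of pages.
#include <unistd.h>
#include <cstddef>
#include <iostream>

std::size_t align_to_page(std::size_t bytes) {
  const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  return ((bytes + page - 1) / page) * page; // round up to pagesize * N
}

int main() {
  std::cout << align_to_page(5000) << "\n"; // e.g. 8192 on 4 KiB pages
}
```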
@DonghakPark
Member Author

This PR is included in #3004, so I am closing this one.

@DonghakPark DonghakPark deleted the FSU_with_shared_Mem branch March 20, 2025 11:38
