Open
Labels
enhancement: New feature or request
Description
Is your feature request related to a problem? Please describe.
- Currently, the LMCache P2PBackend uses the CPU allocator as a staging buffer for P2P transfers, which can incur an extra copy step on both hosts. Perhaps we could make the allocator configurable and use the GPU allocator so that an on-device buffer serves as staging, removing the extra copy on the remote host at the very least.
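To make the copy count concrete, here is a minimal sketch that models the transfer path for each staging choice. It is a toy model only: the function names and the "cpu"/"device" labels are hypothetical and are not LMCache APIs, and it ignores any on-device copy that may still be needed to fill a device staging buffer.

```python
from __future__ import annotations


def transfer_path(sender_staging: str, receiver_staging: str) -> list[str]:
    """List the hops a KV chunk takes for a given staging choice.

    Both arguments are hypothetical labels, "cpu" or "device", naming
    where each side keeps its staging buffer.
    """
    for side in (sender_staging, receiver_staging):
        if side not in ("cpu", "device"):
            raise ValueError(f"unknown staging location: {side!r}")

    hops = []
    if sender_staging == "cpu":
        # The KV data starts on the device, so CPU staging needs a bounce.
        hops.append("D2H copy: sender device KV -> sender host staging buffer")
    hops.append(
        f"network transfer: sender {sender_staging} buffer "
        f"-> receiver {receiver_staging} buffer"
    )
    if receiver_staging == "cpu":
        # The KV data must end up on the device again.
        hops.append("H2D copy: receiver host staging buffer -> receiver device KV")
    return hops


def host_bounce_copies(sender_staging: str, receiver_staging: str) -> int:
    """Count only the extra host<->device copies on the path."""
    return sum(
        1
        for hop in transfer_path(sender_staging, receiver_staging)
        if hop.startswith(("D2H", "H2D"))
    )


if __name__ == "__main__":
    for sender, receiver in [("cpu", "cpu"), ("cpu", "device"), ("device", "device")]:
        print(sender, receiver, host_bounce_copies(sender, receiver))
```

In this model, CPU staging on both sides costs two host bounce copies, device staging on the remote side alone removes one of them, and device staging on both sides leaves only the network transfer.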
Describe the solution you'd like
- For Ascend, we should first test whether using the PagedCPUGPUMemoryAllocator in the P2PBackend allows the HCCL transfer channel to execute the D2D transport directly and successfully.
- For the LMCache modification, we should create an appropriate PR that accommodates this feature.
Describe alternatives you've considered
- N/A
Additional context
- Using an on-device buffer consumes HBM, so we should give advice on how large a buffer is desirable.
- This is only the first step toward direct D2D RDMA transfer.
- During development, we should also check whether an independent parallelism strategy can be supported.
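As a starting point for the buffer-size advice, a rough estimate can be derived from the KV cache footprint per token. The sketch below assumes a standard attention layout (one key and one value tensor per layer) and no tensor-parallel sharding. The function names, the chunk size, and the number of chunks in flight are illustrative assumptions, not LMCache settings.

```python
def kv_bytes_per_token(num_layers: int, num_kv_heads: int,
                       head_dim: int, dtype_bytes: int) -> int:
    """Bytes of KV cache one token occupies across all layers.

    The factor 2 accounts for storing both the key and the value tensor.
    """
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes


def staging_buffer_bytes(tokens_per_chunk: int, chunks_in_flight: int,
                         num_layers: int, num_kv_heads: int,
                         head_dim: int, dtype_bytes: int) -> int:
    """HBM needed to hold `chunks_in_flight` chunks in the staging buffer."""
    per_token = kv_bytes_per_token(num_layers, num_kv_heads, head_dim, dtype_bytes)
    return per_token * tokens_per_chunk * chunks_in_flight


if __name__ == "__main__":
    # Illustrative model shape: 32 layers, 8 KV heads, head_dim 128, 16-bit dtype.
    size = staging_buffer_bytes(
        tokens_per_chunk=256,   # assumed chunk size
        chunks_in_flight=8,     # assumed transfer concurrency
        num_layers=32, num_kv_heads=8, head_dim=128, dtype_bytes=2,
    )
    print(f"{size / 2**20:.0f} MiB")
```

For these illustrative values the estimate comes to 256 MiB of HBM. With tensor parallelism each rank would hold only its share of the KV heads, so the per-device figure would be smaller.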