Read back data written in cuda shared memory in python #816
Description
I am developing a microservices app that relies on Triton Inference Server. I send data to Triton by writing it directly into a CUDA shared memory region, and now I would like to read it back in another process, exactly the way Triton itself retrieves it.
Here is an example of how I write my data:
import os
import time

import torch
from tritonclient.utils.cuda_shared_memory import (
    create_shared_memory_region,
    get_raw_handle,
    set_shared_memory_region_from_dlpack,
)

if __name__ == "__main__":
    shm_name = "tensor_shm"
    device_id = 0

    # Example tensor
    tensor = torch.randn((3, 3), dtype=torch.float32).cuda(device_id)
    byte_size = tensor.element_size() * tensor.numel()

    # Allocate the CUDA shared memory region and copy the tensor into it via DLPack
    handle = create_shared_memory_region(shm_name, byte_size=byte_size, device_id=device_id)
    set_shared_memory_region_from_dlpack(handle, [tensor])

    # Serialize the raw cudaIpc handle and write it where the other process can find it
    ser = get_raw_handle(handle)
    with open(os.path.join("/shm-dir", "address"), "wb") as f:
        f.write(ser)

    # Keep the process (and the CUDA allocation) alive while the other process reads it
    time.sleep(100)
Is there an example somewhere that I might have missed which explains how to read this handle back in another process? Thanks a lot!
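For reference, here is a minimal sketch of how the handle could be opened from a second process. It is not an official Triton API: it assumes that get_raw_handle returns the cudaIpcMemHandle in base64 encoding (as the tritonclient docstring suggests), uses CuPy's runtime bindings to open the IPC handle, and hard-codes the shape, dtype, and /shm-dir/address path from the writer above.

import base64

import cupy as cp

if __name__ == "__main__":
    device_id = 0
    shape = (3, 3)          # must match what the writer stored
    byte_size = 3 * 3 * 4   # 3x3 float32 = 36 bytes

    # Read the serialized handle written by the producer process
    with open("/shm-dir/address", "rb") as f:
        ser = f.read()

    # Assumption: get_raw_handle() returns the cudaIpcMemHandle in base64
    # encoding, so decode it back to the raw handle bytes before opening it.
    raw_handle = base64.b64decode(ser)

    with cp.cuda.Device(device_id):
        # Map the writer's allocation into this process via CUDA IPC
        dev_ptr = cp.cuda.runtime.ipcOpenMemHandle(raw_handle)

        # Wrap the foreign device pointer in a CuPy array without taking ownership
        mem = cp.cuda.UnownedMemory(dev_ptr, byte_size, None, device_id)
        arr = cp.ndarray(shape, dtype=cp.float32, memptr=cp.cuda.MemoryPointer(mem, 0))

        print(arr)  # same values the writer stored

        # Unmap when done; the writer process still owns the memory
        cp.cuda.runtime.ipcCloseMemHandle(dev_ptr)

If the reader needs a torch tensor, the CuPy array can be handed over zero-copy with torch.from_dlpack(arr); just make sure the IPC mapping is closed before the writer destroys the shared memory region.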