
Read back data written in cuda shared memory in python #816

Open
@MaximeDebarbat

Description


I am developing a microservices app that relies on Triton Inference Server. I send data to Triton by writing it directly into a CUDA shared memory region, but now I want to read it back in another process, exactly the way Triton itself would retrieve it.
Here is an example of how I write my data:

import os
import time

import torch
from tritonclient.utils.cuda_shared_memory import (
    create_shared_memory_region,
    get_raw_handle,
    set_shared_memory_region_from_dlpack,
)


if __name__ == "__main__":
    shm_name = "tensor_shm"
    device_id = 0

    # Example tensor
    tensor = torch.randn((3, 3), dtype=torch.float32).cuda(device_id)
    byte_size = tensor.element_size() * tensor.numel()

    # Allocate a CUDA shared memory region and copy the tensor into it
    handle = create_shared_memory_region(shm_name, byte_size=byte_size, device_id=device_id)
    set_shared_memory_region_from_dlpack(handle, [tensor])

    # Serialize the raw cudaIPC handle so another process can open the region
    ser = get_raw_handle(handle)

    # Pass the serialized handle to the other process through a shared volume
    with open(os.path.join("/shm-dir", "address"), "wb") as f:
        f.write(ser)

    # Keep this process alive so the CUDA allocation stays valid
    time.sleep(100)

Is there an example somewhere that I may have missed explaining how to read this handle back in another process? Thanks a lot!
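
For reference, here is a minimal sketch of what a reader process could look like. It rests on a few assumptions that should be checked against your tritonclient version: that get_raw_handle() returns the cudaIpcMemHandle in base64 encoding (so the reader must decode it before opening), that the producer above is still running (the CUDA allocation dies with it), and that the shape, dtype, and /shm-dir/address path match the producer. It uses CuPy's CUDA runtime bindings rather than any Triton API, since tritonclient does not expose a public "open from raw handle" helper.

import base64

import cupy as cp


if __name__ == "__main__":
    device_id = 0
    shape = (3, 3)            # shape of the tensor written by the producer
    byte_size = 3 * 3 * 4     # 3x3 float32

    # Read the serialized handle written by the producer process.
    with open("/shm-dir/address", "rb") as f:
        raw = f.read()

    # Assumption: the raw handle is base64-encoded; decode it back to the
    # 64-byte cudaIpcMemHandle before opening it.
    ipc_handle = base64.b64decode(raw)

    with cp.cuda.Device(device_id):
        # Map the producer's CUDA allocation into this process via CUDA IPC.
        dev_ptr = cp.cuda.runtime.ipcOpenMemHandle(ipc_handle)
        try:
            # Wrap the raw device pointer in a CuPy array without copying.
            mem = cp.cuda.UnownedMemory(dev_ptr, byte_size, None, device_id)
            arr = cp.ndarray(shape, dtype=cp.float32,
                             memptr=cp.cuda.MemoryPointer(mem, 0))
            print(arr.get())  # copy back to host and print
        finally:
            cp.cuda.runtime.ipcCloseMemHandle(dev_ptr)

As far as I can tell, this mirrors what Triton does on the server side when a region is registered (decode the raw handle, then cudaIpcOpenMemHandle it), but treat it as a sketch rather than a confirmed recipe.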
