tritonclient CUDA shared memory: set_shared_memory_region_from_dlpack fails with unset stream for Torch tensor #789
Open
Description
Code sample:
import tritonshmutils.cuda_shared_memory as cshm
import torch
shm_op0_handle = cshm.create_shared_memory_region("dummy_data", 8, 0)
data = torch.tensor([1, 2], dtype=torch.float32).cuda()
dlpack_data = torch.to_dlpack(data)
cshm.set_shared_memory_region_from_dlpack(
    shm_op0_handle, [dlpack_data]
)
Log:
/usr/local/lib/python3.10/dist-packages/tritonshmutils/__init__.py:33: DeprecationWarning: The package `tritonshmutils` is deprecated and will be removed in a future version. Please use instead `tritonclient.utils`
warnings.warn(
/usr/local/lib/python3.10/dist-packages/tritonshmutils/cuda_shared_memory.py:33: DeprecationWarning: The package `tritonshmutils.cuda_shared_memory` is deprecated and will be removed in a future version. Please use instead `tritonclient.utils.cuda_shared_memory`
warnings.warn(
/usr/local/lib/python3.10/dist-packages/tritonclient/utils/cuda_shared_memory/__init__.py:45: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
import pkg_resources
/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py:3138: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('zope')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py:3138: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('zope')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
---------------------------------------------------------------------------
UnboundLocalError Traceback (most recent call last)
Cell In[1], line 7
5 data = torch.tensor([1, 2], dtype=torch.float32).cuda()
6 dlpack_data = torch.to_dlpack(data)
----> 7 cshm.set_shared_memory_region_from_dlpack(
8 shm_op0_handle, [dlpack_data]
9 )
File /usr/local/lib/python3.10/dist-packages/tritonclient/utils/cuda_shared_memory/__init__.py:348, in set_shared_memory_region_from_dlpack(cuda_shm_handle, input_values)
343 stream = _get_or_create_global_cuda_stream(0)
344 # Knowing the implementation detail of how shared memory region is
345 # set (cudaMemcpy). There is no need to transfer ownership of
346 # 'dl_managed_tensor': the data has been copied out when dlpack
347 # capsule is out of scope.
--> 348 dlcapsule = _dlpack.get_dlpack_capsule(input_value, stream.getPtr())
349 dmt = _dlpack.get_managed_tensor(dlcapsule)
350 if not _dlpack.is_contiguous_data(
351 dmt.dl_tensor.ndim, dmt.dl_tensor.shape, dmt.dl_tensor.strides
352 ):
UnboundLocalError: local variable 'stream' referenced before assignment
Version of client:
tritonclient 2.49.0
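Note on the failure mode: the traceback shows `stream` assigned on line 343 yet unbound on line 348. That would be consistent with the assignment sitting inside a branch that is skipped for this input, though the enclosing condition is not visible in the log. Below is a minimal sketch of that pattern in plain Python. `FakeStream`, `use_stream_buggy`, `use_stream_fixed`, and the `takes_branch` flag are hypothetical names for illustration only and are not part of tritonclient.

```python
class FakeStream:
    """Hypothetical stand-in for a CUDA stream wrapper (not tritonclient's)."""

    def get_ptr(self):
        return 0


def use_stream_buggy(takes_branch):
    # 'stream' is bound only when the branch is taken.
    if takes_branch:
        stream = FakeStream()
    # If the branch was skipped, this line raises UnboundLocalError.
    return stream.get_ptr()


def use_stream_fixed(takes_branch):
    # Bind a default first so the name exists on every code path.
    stream = None
    if takes_branch:
        stream = FakeStream()
    # Callers must then handle the "no stream" case explicitly.
    return stream.get_ptr() if stream is not None else None


if __name__ == "__main__":
    try:
        use_stream_buggy(False)
    except UnboundLocalError as exc:
        print("buggy variant failed:", exc)
    print("fixed variant returned:", use_stream_fixed(False))
```

Whether the library should fall back to a default stream or pass no stream at all in that case is a design decision for the maintainers; the sketch only shows why the name ends up unbound.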