Problem
With #1069, the C++ BufferResource lifetime is now maintained at the C++ level. Any downstream consumer holding a deep-copied any_resource<> from br.device_mr keeps the underlying C++ BufferResource alive.
However, this is still insufficient for Python-created BufferResources. The C++ object only borrows its stream pool, device MR, pinned MR, and statistics objects. Those resources are actually owned by Python and kept alive through self._<thing> attributes on the Python wrapper.
If the Python BufferResource wrapper is garbage collected while a downstream object still holds device_mr, the C++ BufferResource survives, but its borrowed dependencies are destroyed underneath it. The next allocation can therefore trigger a use-after-free.
This is the lifetime issue tracked in #641.
Proposed approach
1. Stream pool: move ownership into C++
The C++ constructor already supports owning the stream pool through a normal
shared_ptr deleter, for example the default:
std::make_shared<rmm::cuda_stream_pool>(16, ...)
from:
cpp/include/rapidsmpf/memory/buffer_resource.hpp:91
We should therefore remove the Python-side borrowed stream pool entirely.
Replace the Python constructor argument:
stream_pool: CudaStreamPool
with:
and expose streams through:
BufferResource.get_stream()
implemented as:
Stream._from_cudaStream_t(view.value(), owner=self)
The owner=self reference ensures the Python BufferResource wrapper stays alive
for the lifetime of any handed-out stream.
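As a rough illustration of the ownership shape described above, here is a minimal C++ sketch. All names in it (StreamPool, Resource, StreamHandle, get_stream) are hypothetical stand-ins, not the rapidsmpf or rmm API: the resource owns its pool through a shared_ptr, and each handed-out stream handle holds a strong reference to its owner, which is the role owner=self plays on the Python side.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical stand-in for rmm::cuda_stream_pool.
struct StreamPool {
    explicit StreamPool(std::size_t n) : size(n) {}
    std::size_t size;
};

// Hypothetical stand-in for the C++ BufferResource: it owns the pool
// through a shared_ptr, defaulting to a pool of 16 streams.
class Resource {
  public:
    explicit Resource(
        std::shared_ptr<StreamPool> pool = std::make_shared<StreamPool>(16))
        : pool_(std::move(pool)) {}

    StreamPool const& pool() const { return *pool_; }

  private:
    std::shared_ptr<StreamPool> pool_;
};

// Hypothetical handle for a handed-out stream. Holding the owner keeps the
// resource (and therefore its pool) alive while the handle exists.
struct StreamHandle {
    std::shared_ptr<Resource> owner;
};

inline StreamHandle get_stream(std::shared_ptr<Resource> const& br) {
    return StreamHandle{br};
}
```

With this shape, dropping the last external reference to the resource does not destroy the pool while a stream handle is still outstanding.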
2. Device MR, pinned MR, and statistics: Python-aware deleter
These resources are user-supplied, so ownership must remain on the Python side.
Wrap each borrowed resource in a std::shared_ptr whose deleter holds a strong
Python reference:
- Py_INCREF when constructing the control block
- Py_DECREF under the GIL when destroying it
The resulting C++ control block then transitively pins the Python wrapper for as
long as any downstream object still holds the associated resource.
Apply this pattern to all three handles passed into the C++ BufferResource:
- device MR
- pinned MR
- statistics
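The deleter pattern above can be sketched in plain C++ as follows. This is a minimal sketch under stated assumptions, not the proposed implementation: FakePyObject, BorrowedResource, and make_pinned are hypothetical names, and a plain integer counter stands in for the Python reference count so the example compiles without Python headers. Comments mark where real code would call Py_INCREF, and Py_DECREF bracketed by PyGILState_Ensure / PyGILState_Release.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a reference-counted Python wrapper object.
// Real code would hold a PyObject* instead.
struct FakePyObject {
    int refcount = 1;
};

// Hypothetical stand-in for a user-supplied resource
// (device MR, pinned MR, or statistics).
struct BorrowedResource {
    int id;
};

// Wrap a borrowed resource in a shared_ptr whose deleter does NOT free the
// resource; it only releases the strong reference taken on the owner.
inline std::shared_ptr<BorrowedResource> make_pinned(BorrowedResource* borrowed,
                                                     FakePyObject* owner) {
    ++owner->refcount;  // real code: Py_INCREF(owner), with the GIL held
    return std::shared_ptr<BorrowedResource>(
        borrowed, [owner](BorrowedResource*) {
            // real code: PyGILState_Ensure(), Py_DECREF(owner),
            // PyGILState_Release()
            --owner->refcount;
        });
}
```

Copies of the returned shared_ptr share one control block, so the owner's count is raised once and dropped once, when the last copy goes away.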
For some more context, see @pentschev's comment: #1069 (comment)