Adds view_ptr, view_ptr_mut, chunks_exact, chunks_exact_mut#577
Conversation
chelsea0x3b
commented
May 15, 2026
- CudaView::view_ptr is similar to DevicePtr::device_ptr, except it returns a SyncOnDrop with the CudaView's lifetime
- CudaViewMut::view_ptr_mut is similar to DevicePtrMut::device_ptr_mut, except it returns a SyncOnDrop with the CudaViewMut's lifetime
- Also adding CudaView::chunks_exact and CudaViewMut::chunks_exact_mut
There was a problem hiding this comment.
Code Review
This pull request introduces view_ptr and view_ptr_mut methods for CudaView and CudaViewMut to facilitate raw device pointer retrieval with integrated synchronization. It also adds chunks_exact and chunks_exact_mut methods for segmenting CUDA views into iterators. The review feedback highlights that the chunks_exact implementations are overly restrictive due to a modulo assertion that deviates from Rust's standard library conventions. Additionally, it is recommended to refine the lifetime bounds on the chunks_exact iterator to improve API flexibility by capturing fields by value.
| pub fn chunks_exact(&self, chunk_size: usize) -> impl Iterator<Item = CudaView<'a, T>> + '_ { | ||
| assert!(self.len % chunk_size == 0); | ||
| let num_chunks = self.len / chunk_size; | ||
| (0..num_chunks).map(move |i| self.resize(i * chunk_size, (i + 1) * chunk_size)) | ||
| } |
There was a problem hiding this comment.
The assert!(self.len % chunk_size == 0) makes this method significantly more restrictive than std::slice::ChunksExact, which yields chunks of the requested size and ignores the remainder. This deviation from standard library behavior can lead to unexpected panics. It is recommended to remove this assertion to align with Rust's std conventions.
Additionally, the + '_ lifetime bound on the return type unnecessarily ties the iterator to the lifetime of the CudaView struct. By capturing the fields of self by value, the iterator can yield CudaView<'a, T> independently of the local view's lifetime, improving API flexibility.
pub fn chunks_exact(&self, chunk_size: usize) -> impl Iterator<Item = CudaView<'a, T>> {
let num_chunks = self.len / chunk_size;
let ptr = self.ptr;
let read = self.read;
let write = self.write;
let stream = self.stream;
(0..num_chunks).map(move |i| CudaView {
ptr: ptr + (i * chunk_size * std::mem::size_of::<T>()) as u64,
len: chunk_size,
read,
write,
stream,
marker: PhantomData,
})
}| pub fn chunks_exact_mut(self, chunk_size: usize) -> impl Iterator<Item = CudaViewMut<'a, T>> { | ||
| assert!(self.len % chunk_size == 0); | ||
| let num_chunks = self.len / chunk_size; | ||
| (0..num_chunks).map(move |i| CudaViewMut { | ||
| ptr: self.ptr + (i * chunk_size * std::mem::size_of::<T>()) as u64, | ||
| len: chunk_size, | ||
| read: self.read, | ||
| write: self.write, | ||
| stream: self.stream, | ||
| marker: PhantomData, | ||
| }) | ||
| } |
There was a problem hiding this comment.
Similar to chunks_exact, the assert!(self.len % chunk_size == 0) is unnecessarily restrictive and deviates from the behavior of std::slice::ChunksExactMut. Removing this assertion would make the API more robust and consistent with Rust's standard library expectations.
pub fn chunks_exact_mut(self, chunk_size: usize) -> impl Iterator<Item = CudaViewMut<'a, T>> {
let num_chunks = self.len / chunk_size;
(0..num_chunks).map(move |i| CudaViewMut {
ptr: self.ptr + (i * chunk_size * std::mem::size_of::<T>()) as u64,
len: chunk_size,
read: self.read,
write: self.write,
stream: self.stream,
marker: PhantomData,
})
}