Commit 68f320f

makortel authored and fwyzard committed
Move mutex unlock after the cudaEventRecord() in DeviceFree() (#240)
This commit fixes a race for the CUDA event between DeviceFree() and DeviceAllocate(). If the mutex is unlocked before the cudaEventRecord(), there is a short period of time when
- the memory block is already in the free list (cached_blocks), and
- the CUDA event status is not yet cudaErrorNotReady,
and DeviceAllocate() may consider that memory block to be free to be used for another CUDA stream.
1 parent 096e0e3 commit 68f320f

1 file changed: +5 -4 lines changed

HeterogeneousCore/CUDAServices/src/CachingDeviceAllocator.h

@@ -586,9 +586,6 @@ struct CachingDeviceAllocator
             }
         }
 
-        // Unlock
-        mutex.Unlock();
-
         // First set to specified device (entrypoint may not be set)
         if (device != entrypoint_device)
         {
@@ -601,7 +598,11 @@ struct CachingDeviceAllocator
             // Insert the ready event in the associated stream (must have current device set properly)
             if (CubDebug(error = cudaEventRecord(search_key.ready_event, search_key.associated_stream))) return error;
         }
-        else
+
+        // Unlock
+        mutex.Unlock();
+
+        if (!recached)
         {
             // Free the allocation from the runtime and cleanup the event.
             if (CubDebug(error = cudaFree(d_ptr))) return error;
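
The ordering that this change enforces can be illustrated with a small host-only sketch. It is not the allocator's code and uses no CUDA calls: `FakeEvent`, `Block`, `ToyAllocator`, `free_block()` and `try_allocate()` are hypothetical names made up for illustration, and the `pending` flag only stands in for an event whose query would return cudaErrorNotReady. The point it tries to show is that the block is inserted into the free list and its event is recorded under the same lock, so an allocating thread should never observe a cached block whose event has not been recorded yet.

```cpp
#include <cassert>
#include <mutex>
#include <set>

// Hypothetical stand-in for a CUDA event (NOT the CUDA API).
// pending == true models an event whose query would report cudaErrorNotReady.
struct FakeEvent {
    bool pending = false;
    void record() { pending = true; }     // models cudaEventRecord()
    void complete() { pending = false; }  // models the stream finishing its work
};

// Hypothetical cached memory block, identified by an integer id.
struct Block {
    int id;
    FakeEvent* ready_event;
    bool operator<(const Block& other) const { return id < other.id; }
};

class ToyAllocator {
public:
    // Ordering after the fix: publish the block in the free list and record
    // its event while still holding the mutex; the lock is released only
    // when both steps are done.
    void free_block(Block block) {
        std::lock_guard<std::mutex> lock(mutex_);
        cached_blocks_.insert(block);
        block.ready_event->record();
    }

    // Hand out a cached block only if its event is no longer pending.
    bool try_allocate(Block& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        for (auto it = cached_blocks_.begin(); it != cached_blocks_.end(); ++it) {
            if (!it->ready_event->pending) {
                out = *it;
                cached_blocks_.erase(it);
                return true;
            }
        }
        return false;
    }

private:
    std::mutex mutex_;
    std::set<Block> cached_blocks_;
};
```

In this sketch, a block passed to `free_block()` cannot be returned by `try_allocate()` until `complete()` has been called on its event. With the pre-fix ordering (unlock first, record afterwards), another thread could take the lock between the two steps and see the block with `pending` still false.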
