We have an application that uses CullingThreadpool, and we are seeing a crash when we resize our window and then call SetResolution(), usually many times in quick succession. Based on debugging and logging, I suspect a race condition between Flush()/mRenderQueue->IsPipelineEmpty() and the worker threads, where a worker picks up a job after the call to Flush() but before any new work is queued. When this happens, I observe in IsPipelineEmpty() that the result of GetMinRenderPtr() is equal to mWritePtr, and the subsequent call to Reset() resets mWritePtr to zero. The debugger then shows a crash in a worker thread that still has an active job and is using memory that has since been deleted. The crash is always on an access of mMaskedHiZBuffer.

I tried a few mitigations, such as converting the volatile fields in RenderJobQueue to std::atomic_uint, but could not nail down a root cause. The crash happens regardless of which algorithm we use (SSE, AVX), and even if we recreate the MOC object before every SetResolution() call.

The only fix we found that works is to call SuspendThreads() before SetResolution() and WakeThreads() afterwards. This kicks the workers out of their work loop and allows the queues to be cleared before they start up again, guaranteeing a fresh start.
I am posting this as a warning to other developers, and hopefully to start a discussion with anyone who has run into the same issue or has better insight or a better solution.