
Possible race condition in CullingThreadpool #27

@kurtzmarc


We have an application that uses CullingThreadpool, and we are observing a crash when we resize our window and subsequently call SetResolution(), usually many times in quick succession. Based on debugging and logging, I suspect a race condition between Flush()/mRenderQueue->IsPipelineEmpty() and the worker threads, where a worker picks up a job after the call to Flush() but before any new work is queued. When this happens, I observe in IsPipelineEmpty() that the result of GetMinRenderPtr() is equal to mWritePtr, so the subsequent call to Reset() resets mWritePtr to zero. I then get a crash in the debugger where the worker thread has an active job that is using now-deleted memory. The crash is always on an access of mMaskedHiZBuffer.

I tried a few mitigations, such as converting the volatile fields in RenderJobQueue to std::atomic_uint, but could not nail down a root cause. The crash happens regardless of which algorithm we use (SSE, AVX) and regardless of whether we recreate the MOC object before every SetResolution() call.

The only fix we found that works is to call SuspendThreads() before SetResolution() and WakeThreads() afterwards. This kicks the workers out of their work loop and allows the queues to be cleared before starting up again, guaranteeing a fresh start.
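For anyone who wants to try the same workaround, here is a minimal sketch of one way to wrap the call so that the wake-up cannot be forgotten. It is not code from our application or from the library. `ScopedThreadSuspend`, `SafeSetResolution`, and `FakePool` are hypothetical names introduced only for illustration. The helpers are templates that assume only that the pool type exposes `SuspendThreads()`, `WakeThreads()`, and `SetResolution(width, height)`; `FakePool` is a stand-in that records the call order so the sketch compiles without the library.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical RAII guard (not part of the library): suspends the pool's
// worker threads on construction and wakes them again on destruction.
template <typename Pool>
class ScopedThreadSuspend {
public:
    explicit ScopedThreadSuspend(Pool& pool) : mPool(pool) { mPool.SuspendThreads(); }
    ~ScopedThreadSuspend() { mPool.WakeThreads(); }

    // Non-copyable, so the threads are woken exactly once per guard.
    ScopedThreadSuspend(const ScopedThreadSuspend&) = delete;
    ScopedThreadSuspend& operator=(const ScopedThreadSuspend&) = delete;

private:
    Pool& mPool;
};

// Hypothetical helper: performs the resize only while the workers are suspended.
template <typename Pool>
void SafeSetResolution(Pool& pool, unsigned int width, unsigned int height) {
    ScopedThreadSuspend<Pool> guard(pool);  // calls SuspendThreads()
    pool.SetResolution(width, height);      // no worker should be inside its work loop here
}                                           // guard destructor calls WakeThreads()

// Stand-in for the real thread pool (assumption: same three method names).
// It only records the order of calls.
struct FakePool {
    std::vector<std::string> calls;
    void SuspendThreads() { calls.push_back("suspend"); }
    void WakeThreads() { calls.push_back("wake"); }
    void SetResolution(unsigned int, unsigned int) { calls.push_back("resize"); }
};
```

With the real pool, the resize handler would call something like `SafeSetResolution(pool, newWidth, newHeight)` instead of calling `SetResolution()` directly. This only packages the workaround described above; it does not address whatever the underlying race is.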

I am posting this as a warning to other developers, and also in the hope of generating some discussion from anyone who may have run into the same issue or who has better insight or a better solution.
