Description
I have observed that halide_*_initialize_kernels() always acquires a lock, even if the object already exists. When multiple threads call into the runtime concurrently, initialize_kernels() is called every time, even when the void** state_ptr has already been initialized. This turns out to be a fairly non-obvious contention point, which can be eliminated by atomically loading state_ptr first and checking whether it is null (and thus actually requires initialization) before taking the lock.
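
To make the proposal concrete, here is a minimal sketch of the check-before-lock pattern, not the runtime's actual code. The names `KernelState`, `init_mutex`, `slow_path_entries`, and `get_or_init_state` are hypothetical, and the sketch assumes `state_ptr` can be treated as a `std::atomic` pointer; the real runtime may need a different atomic primitive.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-in for whatever state_ptr really points to.
struct KernelState {
    int module_id;
};

static std::mutex init_mutex;       // hypothetical lock guarding initialization
static int slow_path_entries = 0;   // guarded by init_mutex; illustration only

// Returns the initialized state, taking the lock only if *state_ptr is null.
KernelState *get_or_init_state(std::atomic<KernelState *> *state_ptr) {
    // Fast path: the acquire load pairs with the release store below, so a
    // non-null result means the pointed-to state is fully constructed.
    KernelState *state = state_ptr->load(std::memory_order_acquire);
    if (state != nullptr) {
        return state;
    }

    // Slow path: serialize initialization and re-check under the lock, since
    // another thread may have finished initializing in the meantime.
    std::lock_guard<std::mutex> guard(init_mutex);
    ++slow_path_entries;
    state = state_ptr->load(std::memory_order_relaxed);
    if (state == nullptr) {
        state = new KernelState{42};  // placeholder for the real initialization
        state_ptr->store(state, std::memory_order_release);
    }
    return state;
}
```

With this shape, once initialization has completed, concurrent callers return from the fast path without touching the mutex, so the lock is contended only during the first call.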