I'm using TBLIS in a system that supports running multiple OpenMP runtimes within the same process, which is somewhat unusual. I'm tracking down some weird performance issues (StanfordLegion/legion#1266) when using TBLIS in this situation, and I'm wondering whether there are architectural issues within TBLIS (such as global state or locks) that could cause interference between independent TBLIS calls made from these different OpenMP runtimes.