
Fix thread-safety issue in ONNX to Linalg conversion#3371

Open
kimm240 wants to merge 7 commits into onnx:main from kimm240:feature/multithread

Conversation

Contributor

@kimm240 kimm240 commented Jan 23, 2026

Problem

The convert-onnx-to-linalg pass was crashing intermittently. Running the
command below ("Command to check") should always produce the expected
output shown in "Log which should be made", but it sometimes produced the
crash shown in "Error Log" instead.

This happens because the shouldConvertToLinalg() function used static variables:

  • static std::string cachedLinalgOps
  • static std::unique_ptr<EnableByRegexOption> linalgOpsMatcher

When MLIR runs the pass over multiple functions in parallel (each worker invoking applyPatternsGreedily, as the backtrace below shows), multiple threads enter shouldConvertToLinalg() simultaneously and race on these static variables.
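The racy shape can be sketched in isolation. This is a minimal stand-in, not the actual onnx-mlir code: EnableByRegexOption is simplified to a set of names parsed from a comma-separated list (the real class matches regexes), and the function signatures are reduced to plain strings.

```cpp
#include <memory>
#include <set>
#include <sstream>
#include <string>

// Stand-in for onnx_mlir::EnableByRegexOption: just a set of enabled op
// names parsed from a comma-separated list.
struct EnableByRegexOption {
  std::set<std::string> enabled;
  explicit EnableByRegexOption(const std::string &csv) {
    std::stringstream ss(csv);
    std::string tok;
    while (std::getline(ss, tok, ','))
      enabled.insert(tok);
  }
  bool isEnabled(const std::string &name) const {
    // Walks the std::set red-black tree internally -- the
    // _Rb_tree_increment frames seen in the crash backtrace.
    return enabled.count(name) != 0;
  }
};

// Racy shape of the original shouldConvertToLinalg(): the statics are
// shared by all threads. One thread can reset `matcher` while another is
// still inside isEnabled(), walking freed tree nodes.
bool shouldConvertToLinalg(
    const std::string &opName, const std::string &linalgOps) {
  static std::string cachedLinalgOps;                  // shared, unguarded
  static std::unique_ptr<EnableByRegexOption> matcher; // shared, unguarded
  if (!matcher || cachedLinalgOps != linalgOps) {
    cachedLinalgOps = linalgOps;
    matcher = std::make_unique<EnableByRegexOption>(linalgOps); // write ...
  }
  return matcher->isEnabled(opName); // ... races with this read
}
```

Note that C++ guarantees thread-safe *initialization* of function-local statics, but not thread-safe *reassignment*, which is exactly what the cache-refresh branch does.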

Command to check

./build/Release/bin/onnx-mlir-opt --convert-onnx-to-linalg='linalg-ops=MatMul' test/mlir/conversion/onnx_to_linalg/Math/MatMul.mlir

Log which should be made

module {
  func.func @test_matmul_2d(%arg0: tensor<2x3xf32>, %arg1: tensor<3x4xf32>) -> tensor<2x4xf32> {
    %cst = arith.constant 0.000000e+00 : f32
    %0 = tensor.empty() : tensor<2x4xf32>
    %1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<2x4xf32>) -> tensor<2x4xf32>
    %2 = linalg.matmul ins(%arg0, %arg1 : tensor<2x3xf32>, tensor<3x4xf32>) outs(%1 : tensor<2x4xf32>) -> tensor<2x4xf32>
    return %2 : tensor<2x4xf32>
  }
  func.func @test_matmul_different_sizes(%arg0: tensor<5x10xf32>, %arg1: tensor<10x3xf32>) -> tensor<5x3xf32> {
    %cst = arith.constant 0.000000e+00 : f32
    %0 = tensor.empty() : tensor<5x3xf32>
    %1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<5x3xf32>) -> tensor<5x3xf32>
    %2 = linalg.matmul ins(%arg0, %arg1 : tensor<5x10xf32>, tensor<10x3xf32>) outs(%1 : tensor<5x3xf32>) -> tensor<5x3xf32>
    return %2 : tensor<5x3xf32>
  }
  func.func @test_matmul_square(%arg0: tensor<4x4xf32>, %arg1: tensor<4x4xf32>) -> tensor<4x4xf32> {
    %cst = arith.constant 0.000000e+00 : f32
    %0 = tensor.empty() : tensor<4x4xf32>
    %1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<4x4xf32>) -> tensor<4x4xf32>
    %2 = linalg.matmul ins(%arg0, %arg1 : tensor<4x4xf32>, tensor<4x4xf32>) outs(%1 : tensor<4x4xf32>) -> tensor<4x4xf32>
    return %2 : tensor<4x4xf32>
  }
  func.func @test_matmul_3d_batch_not_lowered(%arg0: tensor<2x3x4xf32>, %arg1: tensor<2x4x5xf32>) -> tensor<2x3x5xf32> {
    %0 = "onnx.MatMul"(%arg0, %arg1) : (tensor<2x3x4xf32>, tensor<2x4x5xf32>) -> tensor<2x3x5xf32>
    return %0 : tensor<2x3x5xf32>
  }
  func.func @test_matmul_1d_2d_not_lowered(%arg0: tensor<3xf32>, %arg1: tensor<3x4xf32>) -> tensor<4xf32> {
    %0 = "onnx.MatMul"(%arg0, %arg1) : (tensor<3xf32>, tensor<3x4xf32>) -> tensor<4xf32>
    return %0 : tensor<4xf32>
  }
  func.func @test_matmul_2d_1d_not_lowered(%arg0: tensor<2x3xf32>, %arg1: tensor<3xf32>) -> tensor<2xf32> {
    %0 = "onnx.MatMul"(%arg0, %arg1) : (tensor<2x3xf32>, tensor<3xf32>) -> tensor<2xf32>
    return %0 : tensor<2xf32>
  }
}

Error Log

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace and instructions to reproduce the bug.
 #0 0x00005f82a9454c30 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (./build/Release/bin/onnx-mlir-opt+0x38f6c30)
 #1 0x00005f82a94518ef llvm::sys::RunSignalHandlers() (./build/Release/bin/onnx-mlir-opt+0x38f38ef)
 #2 0x00005f82a9451a42 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
 #3 0x00007a090ce42520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #4 0x00007a090d2d8c13 local_Rb_tree_increment /home/conda/feedstock_root/build_artifacts/gcc_compilers_1765252935691/work/build/x86_64-conda-linux-gnu/libstdc++-v3/src/c++98/../../../../../libstdc++-v3/src/c++98/tree.cc:65:21
 #5 0x00007a090d2d8c13 std::_Rb_tree_increment(std::_Rb_tree_node_base const*) /home/conda/feedstock_root/build_artifacts/gcc_compilers_1765252935691/work/build/x86_64-conda-linux-gnu/libstdc++-v3/src/c++98/../../../../../libstdc++-v3/src/c++98/tree.cc:91:35
 #6 0x00005f82a6e71c30 onnx_mlir::EnableByRegexOption::isEnabled(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (./build/Release/bin/onnx-mlir-opt+0x1313c30)
 #7 0x00005f82a6719f4d onnx_mlir::shouldConvertToLinalg(mlir::Operation*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) (./build/Release/bin/onnx-mlir-opt+0xbbbf4d)
 #8 0x00005f82a671a271 onnx_mlir::ONNXMatMulOpLoweringToLinalg::matchAndRewrite(mlir::ONNXMatMulOp, mlir::PatternRewriter&) const (./build/Release/bin/onnx-mlir-opt+0xbbc271)
 #9 0x00005f82a8077831 mlir::PatternApplicator::matchAndRewrite(mlir::Operation*, mlir::PatternRewriter&, llvm::function_ref<bool (mlir::Pattern const&)>, llvm::function_ref<void (mlir::Pattern const&)>, llvm::function_ref<llvm::LogicalResult (mlir::Pattern const&)>)::'lambda'()::operator()() const PatternApplicator.cpp:0:0
#10 0x00005f82a8078c27 mlir::PatternApplicator::matchAndRewrite(mlir::Operation*, mlir::PatternRewriter&, llvm::function_ref<bool (mlir::Pattern const&)>, llvm::function_ref<void (mlir::Pattern const&)>, llvm::function_ref<llvm::LogicalResult (mlir::Pattern const&)>) (./build/Release/bin/onnx-mlir-opt+0x251ac27)
#11 0x00005f82a8028e60 (anonymous namespace)::GreedyPatternRewriteDriver::processWorklist() GreedyPatternRewriteDriver.cpp:0:0
#12 0x00005f82a802dadb mlir::applyPatternsGreedily(mlir::Region&, mlir::FrozenRewritePatternSet const&, mlir::GreedyRewriteConfig, bool*) (./build/Release/bin/onnx-mlir-opt+0x24cfadb)
#13 0x00005f82a6717f9c onnx_mlir::(anonymous namespace)::ConvertONNXToLinalgPass::runOnOperation() ConvertONNXToLinalg.cpp:0:0
#14 0x00005f82a8d12669 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) (./build/Release/bin/onnx-mlir-opt+0x31b4669)
#15 0x00005f82a8d12ab4 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) (./build/Release/bin/onnx-mlir-opt+0x31b4ab4)
#16 0x00005f82a8d13246 mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::'lambda12'(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&)::operator()(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&) const Pass.cpp:0:0
#17 0x00005f82a8d1342f std::_Function_handler<void (), llvm::LogicalResult mlir::failableParallelForEach<__gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, void mlir::parallelForEach<__gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::'lambda12'(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&)>(mlir::MLIRContext*, __gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, __gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::'lambda12'(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&)&&)::'lambda'(__gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >&&)>(mlir::MLIRContext*, 
__gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, __gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::'lambda12'(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&)&&)::'lambda'()>::_M_invoke(std::_Any_data const&) Pass.cpp:0:0
#18 0x00005f82a5e46291 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<std::function<void ()> > >, void> >::_M_invoke(std::_Any_data const&) (./build/Release/bin/onnx-mlir-opt+0x2e8291)
#19 0x00005f82a5e47e1d std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*) (./build/Release/bin/onnx-mlir-opt+0x2e9e1d)
#20 0x00007a090ce99ee8 __pthread_once_slow ./nptl/./nptl/pthread_once.c:118:7
#21 0x00005f82a5e4818b std::__future_base::_Deferred_state<std::thread::_Invoker<std::tuple<std::function<void ()> > >, void>::_M_complete_async() (./build/Release/bin/onnx-mlir-opt+0x2ea18b)
#22 0x00005f82a5e6a397 std::_Function_handler<void (), std::shared_future<void> llvm::ThreadPoolInterface::asyncImpl<void>(std::function<void ()>, llvm::ThreadPoolTaskGroup*)::'lambda'()>::_M_invoke(std::_Any_data const&) (./build/Release/bin/onnx-mlir-opt+0x30c397)
#23 0x00005f82a93e3cae llvm::StdThreadPool::processTasks(llvm::ThreadPoolTaskGroup*) (./build/Release/bin/onnx-mlir-opt+0x3885cae)
#24 0x00005f82a93e4b17 void* llvm::thread::ThreadProxy<std::tuple<llvm::StdThreadPool::grow(int)::'lambda'()> >(void*) ThreadPool.cpp:0:0
#25 0x00007a090ce94ac3 start_thread ./nptl/./nptl/pthread_create.c:442:8
#26 0x00007a090cf268c0 ./misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:83:0
Segmentation fault (core dumped)

Change

  1. Removed the static variables from shouldConvertToLinalg():

    • Eliminated the cachedLinalgOps and linalgOpsMatcher static variables
    • Changed the function signature to accept an EnableByRegexOption* parameter
  2. Stored the matcher as a pattern class member:

    • Added mutable std::unique_ptr<EnableByRegexOption> linalgOpsMatcher to
      ONNXMatMulOpLoweringToLinalg
    • Initialized the matcher in the pattern constructor
    • Each pattern instance now owns its own matcher, eliminating shared state
  3. Made the matcher mutable:

    • Declared linalgOpsMatcher as mutable so the cache can be updated inside
      the const matchAndRewrite() method
    • Updated the function signature to accept a non-const EnableByRegexOption*
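Taken together, the fixed shape might look roughly like the sketch below. The types are stand-ins, not the actual onnx-mlir/MLIR classes: EnableByRegexOption is simplified to a name set, and the pattern class is reduced to the ownership structure the PR describes.

```cpp
#include <memory>
#include <set>
#include <sstream>
#include <string>

// Stand-in for onnx_mlir::EnableByRegexOption (the real class matches op
// names against regexes).
struct EnableByRegexOption {
  std::set<std::string> enabled;
  explicit EnableByRegexOption(const std::string &csv) {
    std::stringstream ss(csv);
    std::string tok;
    while (std::getline(ss, tok, ','))
      enabled.insert(tok);
  }
  bool isEnabled(const std::string &name) { return enabled.count(name) != 0; }
};

// New signature: the matcher is passed in rather than living in a static.
bool shouldConvertToLinalg(
    const std::string &opName, EnableByRegexOption *matcher) {
  return matcher && matcher->isEnabled(opName);
}

// Stand-in for the ONNXMatMulOpLoweringToLinalg pattern: each instance owns
// its own matcher, so there is no mutable state shared through a static.
class ONNXMatMulOpLoweringToLinalg {
public:
  explicit ONNXMatMulOpLoweringToLinalg(const std::string &linalgOps)
      : linalgOpsMatcher(std::make_unique<EnableByRegexOption>(linalgOps)) {}

  // matchAndRewrite() is const in mlir::OpRewritePattern; the `mutable`
  // member below lets a const method still update the matcher's internal
  // caches through a non-const EnableByRegexOption*.
  bool matchAndRewrite(const std::string &opName) const {
    return shouldConvertToLinalg(opName, linalgOpsMatcher.get());
  }

private:
  mutable std::unique_ptr<EnableByRegexOption> linalgOpsMatcher;
};
```

Since the matcher is built once in the constructor and only read afterwards, the cross-thread write/read interleaving from the crash can no longer occur.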

Hyun Gyu Kim added 2 commits January 23, 2026 10:57
@jenkins-droid
Collaborator

Can one of the admins verify this patch?



4 participants