
[Performance] Performance Bottleneck due to intra_op_num_threads being set globally #24101

Open
@spgoswami1

Description


Describe the issue

Here is what I observe:

  1. With intra_op_num_threads = 0 (the default, which uses all available threads), the convolution time drops, but the other operators' times increase, so the total time ends up roughly the same as, or sometimes higher than, with intra_op_num_threads = 1.
  2. With intra_op_num_threads = 1, Conv takes 62 % of the total time.
  3. With intra_op_num_threads = 0, Conv takes 28 % of the total time.

I could not find any setting that enables multi-threading (intra_op_num_threads = 0) for Conv only while keeping intra_op_num_threads = 1 for all other operators.

Is there a setting in the code that would let me do this, so that I can build ONNX Runtime locally myself?
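For context, a minimal sketch of how I am setting the thread count today: intra-op threading is configured per session via SessionOptions, and as far as I can tell the public API exposes no per-operator thread count, so the value below applies to every node in the graph (the model path here is a placeholder, not my actual model):

```python
import onnxruntime as ort

so = ort.SessionOptions()
# Applies globally to the whole session:
# 0 = use all available cores, 1 = single-threaded.
so.intra_op_num_threads = 1   # switch between 0 and 1 to compare the two runs
so.enable_profiling = True    # writes a per-node timing JSON for the comparison

# "model.onnx" is a placeholder path
sess = ort.InferenceSession("model.onnx", sess_options=so)
# ... run inference, then collect the per-node timings:
# profile_file = sess.end_profiling()
```

The per-node percentages above come from the profiling JSON produced by enable_profiling.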

To reproduce

I am facing this issue with a proprietary model, so I won't be able to share it, but I can create a dummy model containing Conv and Tanh that shows the same trend.

Urgency

No response

Platform

Windows

OS Version

10.0.26100 Build 26100

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.21.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

No


Labels

performance (issues related to performance regressions)
