[Improvement] Unnecessary Thread Pool Creation even when using only GPU in model graph nodes #27150

@nimdrak

Describe the issue

Problem Description

When using execution providers that don't require CPU thread pools (e.g., TensorRT EP), thread pools are still created during session construction, even though they won't be used during graph execution.

Current Behavior and Root Cause

  • Thread pools (intra-op and inter-op) are created in ConstructorCommon() at session construction time
  • This happens before execution provider assignment, which occurs later in Initialize() → TransformGraph() → GraphPartitioner::Partition()
  • For GPU-only EPs with ORT_SEQUENTIAL mode, thread pools are never actually used during execution

Impact

  • Unnecessary resource allocation (thread creation, memory)
  • Thread pools sit idle for the entire session lifetime (their threads are blocked, but they still exist)
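
One rough way to observe this on Linux is to count the process's OS threads before and after constructing a session. The sketch below is only an illustration: `count_os_threads` and `threads_created_by` are hypothetical helpers, not part of ONNX Runtime, and the measured delta also includes any threads started by the execution provider's own libraries, so it is an upper bound rather than an exact thread pool size.

```python
import os
import threading
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def count_os_threads() -> int:
    """Return the number of OS threads in this process.

    On Linux, every thread (including native ones created by C++ libraries)
    has an entry under /proc/self/task. Elsewhere we fall back to
    threading.active_count(), which only sees Python-level threads.
    """
    task_dir = "/proc/self/task"
    if os.path.isdir(task_dir):
        return len(os.listdir(task_dir))
    return threading.active_count()


def threads_created_by(factory: Callable[[], T]) -> Tuple[T, int]:
    """Call factory() and return (result, change in OS thread count)."""
    before = count_os_threads()
    result = factory()
    after = count_os_threads()
    return result, after - before


# Hypothetical usage, assuming onnxruntime with the TensorRT EP is installed
# and "model.onnx" is a placeholder path to a model of your own:
#
#   import onnxruntime as ort
#   session, delta = threads_created_by(
#       lambda: ort.InferenceSession(
#           "model.onnx", providers=["TensorrtExecutionProvider"]
#       )
#   )
#   print(f"threads added during session creation: {delta}")
```

Comparing the delta across different `intra_op_num_threads` settings in `SessionOptions` should help separate the session's own thread pool from threads created elsewhere.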

Proposed Solution

Implement lazy initialization for thread pools:

  1. Defer thread pool creation until after EP assignment in Initialize()
  2. During Initialize(), determine whether thread pools are needed based on:
    • Execution mode (ORT_PARALLEL requires inter-op thread pool)
    • Assigned execution providers (CPU EP nodes require thread pools)
    • Graph structure (presence of CPU nodes)
  3. Create thread pools only when needed
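
The decision in step 2 can be sketched as a small pure function. This is a simplified model of the proposal, not ONNX Runtime code: `ExecutionMode` here is a stand-in enum, and `ThreadPoolPlan` and `plan_thread_pools` are hypothetical names. It assumes, as the list above states, that CPU EP nodes need the intra-op pool and that ORT_PARALLEL needs the inter-op pool.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class ExecutionMode(Enum):
    """Stand-in for the session's execution mode (not the real ORT enum)."""
    ORT_SEQUENTIAL = 0
    ORT_PARALLEL = 1


CPU_EP = "CPUExecutionProvider"


@dataclass(frozen=True)
class ThreadPoolPlan:
    """Which thread pools the session would actually need (hypothetical)."""
    intra_op: bool
    inter_op: bool


def plan_thread_pools(
    mode: ExecutionMode, node_providers: Iterable[str]
) -> ThreadPoolPlan:
    """Decide which pools to create once every node has an assigned EP.

    node_providers holds the EP name assigned to each graph node, i.e. the
    information that only becomes available after graph partitioning.
    """
    has_cpu_nodes = any(ep == CPU_EP for ep in node_providers)
    return ThreadPoolPlan(
        intra_op=has_cpu_nodes,
        inter_op=(mode is ExecutionMode.ORT_PARALLEL),
    )


# A graph fully assigned to TensorRT in sequential mode needs no pools.
print(plan_thread_pools(
    ExecutionMode.ORT_SEQUENTIAL,
    ["TensorrtExecutionProvider", "TensorrtExecutionProvider"],
))

# A single CPU fallback node is enough to require the intra-op pool.
print(plan_thread_pools(
    ExecutionMode.ORT_SEQUENTIAL,
    ["TensorrtExecutionProvider", CPU_EP],
))
```

Keeping the decision separate from pool construction means it can run at the end of partitioning, and step 3 then creates only the pools the plan asks for.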

To reproduce

None

Urgency

Not urgent

Platform

Linux

OS Version

Ubuntu 24.04

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.22.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

TensorRT

Execution Provider Library Version

No response
