Describe the issue
Problem Description
When using execution providers that don't require CPU thread pools (e.g., TensorRT EP), thread pools are still created during session construction, even though they won't be used during graph execution.
Current Behavior and Root Cause
- Thread pools (intra-op and inter-op) are created in ConstructorCommon() at session construction time
- This happens before execution provider assignment, which occurs later in Initialize() → TransformGraph() → GraphPartitioner::Partition()
- For GPU-only EPs with ORT_SEQUENTIAL mode, thread pools are never actually used during execution
Impact
- Unnecessary resource allocation (thread creation, memory)
- Thread pools remain idle for the entire session lifetime (the worker threads are blocked, but they still exist)
Proposed Solution
Implement lazy initialization for thread pools:
- Defer thread pool creation until after EP assignment in Initialize()
- Determine necessity during Initialize(), based on:
  - Execution mode (ORT_PARALLEL requires inter-op thread pool)
  - Assigned execution providers (CPU EP nodes require thread pools)
  - Graph structure (presence of CPU nodes)
- Create thread pools only when needed
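
To make the proposal concrete, here is a minimal Python sketch of the decision logic. It is a toy model, not ONNX Runtime code: `LazySession`, `decide_pool_needs`, and `PoolNeeds` are hypothetical names, and the policy (CPU EP nodes need the intra-op pool, ORT_PARALLEL needs the inter-op pool) is my reading of the criteria above, not confirmed behavior of the C++ implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Set


class ExecutionMode(Enum):
    ORT_SEQUENTIAL = 0
    ORT_PARALLEL = 1


CPU_EP = "CPUExecutionProvider"


@dataclass
class PoolNeeds:
    intra_op: bool
    inter_op: bool


def decide_pool_needs(mode: ExecutionMode, assigned_eps: Set[str]) -> PoolNeeds:
    """Hypothetical policy, evaluated only after EP assignment is known."""
    # Assumption: only nodes assigned to the CPU EP use the intra-op pool.
    has_cpu_nodes = CPU_EP in assigned_eps
    return PoolNeeds(
        intra_op=has_cpu_nodes,
        # Per the proposal, ORT_PARALLEL requires the inter-op pool.
        inter_op=mode is ExecutionMode.ORT_PARALLEL,
    )


class LazySession:
    """Toy stand-in for a session; it only models when pools get created."""

    def __init__(self, mode: ExecutionMode, num_threads: int = 4) -> None:
        # Construction records settings but creates no thread pools.
        self.mode = mode
        self.num_threads = num_threads
        self.intra_op_pool: Optional[ThreadPoolExecutor] = None
        self.inter_op_pool: Optional[ThreadPoolExecutor] = None

    def initialize(self, assigned_eps: Set[str]) -> None:
        # assigned_eps stands in for the result of graph partitioning.
        needs = decide_pool_needs(self.mode, assigned_eps)
        if needs.intra_op:
            self.intra_op_pool = ThreadPoolExecutor(max_workers=self.num_threads)
        if needs.inter_op:
            self.inter_op_pool = ThreadPoolExecutor(max_workers=self.num_threads)

    def close(self) -> None:
        # Shut down whichever pools were actually created.
        for pool in (self.intra_op_pool, self.inter_op_pool):
            if pool is not None:
                pool.shutdown()


# A TensorRT-only graph in sequential mode ends up with no pools at all.
session = LazySession(ExecutionMode.ORT_SEQUENTIAL)
session.initialize({"TensorrtExecutionProvider"})
print(session.intra_op_pool, session.inter_op_pool)
session.close()
```

In this sketch, a graph that falls back to the CPU EP for even one node would still get an intra-op pool, so the change would only affect sessions where every node is assigned to a non-CPU EP.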
To reproduce
None
Urgency
Not urgent
Platform
Linux
OS Version
Ubuntu 24.04
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.22.0
ONNX Runtime API
Python
Architecture
X64
Execution Provider
TensorRT
Execution Provider Library Version
No response