Open
Labels
performance: issues related to performance regressions
Describe the issue
Description
We observed a performance regression in the MatMul operator with int64 inputs between ONNX Runtime v1.20.0 and v1.21.0. The regression is specific to int64; other data types (e.g., float32, int32) are not affected.
Affected Operator
MatMul
- Opset Version: 13
- Data Type: int64 (regressed)
- Regression: +72.7% to +144.4% kernel slowdown across the test cases below
Test Case Details
Test Case 1: matmul_13_v2_matmul_int64_mixed_rank_broadcast
Inputs:
- input_0 tensor:
  - Data type: int64 (type=7)
  - Shape: [3, 128, 256]
- input_1 tensor:
  - Data type: int64 (type=7)
  - Shape: [256, 64]
Output:
- Data type: int64
- Shape: [3, 128, 64]
- Matrix multiplication with broadcast
Performance:
- v1.20.0: 4.54 ms (kernel time)
- v1.21.0: 11.11 ms (kernel time)
- Kernel regression: +144.4% slowdown
- Total time regression: +144.7% slowdown
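The output shape above follows from MatMul's NumPy-style broadcasting: the last two dimensions are multiplied as matrices and any leading batch dimensions are broadcast. A minimal pure-Python sketch of that shape rule is shown here; the function name `matmul_output_shape` is hypothetical and is not part of ONNX or ONNX Runtime.

```python
def matmul_output_shape(a_shape, b_shape):
    """Return the output shape of a NumPy-style matmul (illustrative helper)."""
    a, b = list(a_shape), list(b_shape)
    if not a or not b:
        raise ValueError("matmul does not accept scalars")

    # 1-D operands are temporarily promoted to matrices.
    a_vec, b_vec = len(a) == 1, len(b) == 1
    if a_vec:
        a = [1] + a
    if b_vec:
        b = b + [1]

    if a[-1] != b[-2]:
        raise ValueError(f"inner dimensions differ: {a[-1]} vs {b[-2]}")

    # Right-align the batch dimensions and broadcast them pairwise.
    batch_a, batch_b = a[:-2], b[:-2]
    rank = max(len(batch_a), len(batch_b))
    batch_a = [1] * (rank - len(batch_a)) + batch_a
    batch_b = [1] * (rank - len(batch_b)) + batch_b
    batch = []
    for x, y in zip(batch_a, batch_b):
        if x != y and x != 1 and y != 1:
            raise ValueError(f"batch dimensions not broadcastable: {x} vs {y}")
        batch.append(y if x == 1 else x)

    out = batch + [a[-2], b[-1]]
    # Drop the dimensions that were added for 1-D operands.
    if a_vec:
        del out[-2]
    if b_vec:
        del out[-1]
    return out


print(matmul_output_shape([3, 128, 256], [256, 64]))
```

In this test case the 2-D right operand has no batch dimensions, so the same [256, 64] matrix is applied to each of the 3 batches of the left operand.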
Test Case 2: matmul_13_v3_test_matmul_2d_int64
Inputs:
- A tensor:
  - Data type: int64 (type=7)
  - Shape: [32, 24]
- B tensor:
  - Data type: int64 (type=7)
  - Shape: [24, 10]
Performance:
- v1.20.0: 0.011 ms (kernel time)
- v1.21.0: 0.019 ms (kernel time)
- Kernel regression: +72.7% slowdown
Test Case 3: matmul_13_v3_test_matmul_single_batch_edge_int64
Inputs:
- A tensor:
  - Data type: int64 (type=7)
  - Shape: [1, 32, 64]
- B tensor:
  - Data type: int64 (type=7)
  - Shape: [1, 64, 16]
Performance:
- v1.20.0: 0.034 ms (kernel time)
- v1.21.0: 0.060 ms (kernel time)
- Kernel regression: +75.0% slowdown
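For reference, the slowdown percentages can be recomputed from the kernel times as (new / old - 1) × 100. The sketch below uses only the rounded timings quoted in this report, so its results may differ by a point or two from the reported figures, which were presumably derived from unrounded measurements. The names `slowdown_pct` and `KERNEL_TIMES_MS` are illustrative.

```python
def slowdown_pct(before_ms, after_ms):
    """Relative slowdown of after_ms versus before_ms, in percent."""
    return (after_ms / before_ms - 1.0) * 100.0


# Rounded kernel times (ms) as quoted in the three test cases: (v1.20.0, v1.21.0).
KERNEL_TIMES_MS = {
    "matmul_13_v2_matmul_int64_mixed_rank_broadcast": (4.54, 11.11),
    "matmul_13_v3_test_matmul_2d_int64": (0.011, 0.019),
    "matmul_13_v3_test_matmul_single_batch_edge_int64": (0.034, 0.060),
}

for name, (old, new) in KERNEL_TIMES_MS.items():
    print(f"{name}: {slowdown_pct(old, new):+.1f}%")
```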
Regression Characteristics
Type-Specific Regression
REGRESSED (int64):
- matmul_13_v2_matmul_int64_mixed_rank_broadcast: +144.4% slowdown
- matmul_13_v3_test_matmul_2d_int64: +72.7% slowdown
- matmul_13_v3_test_matmul_single_batch_edge_int64: +75.0% slowdown
NOT REGRESSED (float32):
- matmul_13_v2_matmul_float32_large_2d: +1.5% (stable)
- matmul_13_v2_matmul_float32_batched_3d: No regression
NOT REGRESSED (int32):
- matmul_13_v2_matmul_int32_batched: No regression
To reproduce
- Download zip file
- Run benchmark using the provided script:
python script_profiling.py matmul_13_v2_matmul_int64_mixed_rank_broadcast 1.20.0 1.21.0
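If the zip file is not at hand, a standalone approximation of the benchmark might look like the sketch below. It is not the `script_profiling.py` from the attachment: it builds a single-node int64 MatMul model (opset 13) from the shapes in this report and measures end-to-end wall-clock latency of `InferenceSession.run`, not the kernel time reported above. The names `CASES`, `build_matmul_model`, and `time_case`, the random input range, and the IR version setting are assumptions. The `onnx`, `numpy`, and `onnxruntime` packages are imported lazily inside the functions.

```python
import time

# Input shapes taken from the three test cases in this report.
CASES = {
    "matmul_int64_mixed_rank_broadcast": ([3, 128, 256], [256, 64]),
    "matmul_2d_int64": ([32, 24], [24, 10]),
    "matmul_single_batch_edge_int64": ([1, 32, 64], [1, 64, 16]),
}


def build_matmul_model(shape_a, shape_b):
    """Build a single-node int64 MatMul model and return it serialized."""
    from onnx import TensorProto, helper  # third-party, imported lazily

    a = helper.make_tensor_value_info("A", TensorProto.INT64, shape_a)
    b = helper.make_tensor_value_info("B", TensorProto.INT64, shape_b)
    # Output shape is left unspecified so the runtime infers it.
    y = helper.make_tensor_value_info("Y", TensorProto.INT64, None)
    node = helper.make_node("MatMul", ["A", "B"], ["Y"])
    graph = helper.make_graph([node], "matmul_int64", [a, b], [y])
    model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 13)])
    model.ir_version = 7  # assumption: a conservative IR version both releases load
    return model.SerializeToString()


def time_case(name, repeats=100, warmup=10):
    """Return the median wall-clock latency in milliseconds for one case."""
    import numpy as np
    import onnxruntime as ort

    shape_a, shape_b = CASES[name]
    session = ort.InferenceSession(
        build_matmul_model(shape_a, shape_b),
        providers=["CPUExecutionProvider"],
    )
    rng = np.random.default_rng(0)
    feeds = {
        "A": rng.integers(-10, 10, size=shape_a, dtype=np.int64),
        "B": rng.integers(-10, 10, size=shape_b, dtype=np.int64),
    }
    for _ in range(warmup):
        session.run(None, feeds)

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        session.run(None, feeds)
        samples.append((time.perf_counter() - start) * 1e3)
    samples.sort()
    return samples[len(samples) // 2]
```

Calling `time_case("matmul_int64_mixed_rank_broadcast")` once in an environment with `onnxruntime==1.20.0` and once with `onnxruntime==1.21.0` gives two medians that can be compared directly.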
Urgency
No response
Platform
Linux
OS Version
Ubuntu 24.04.3 LTS
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.21
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
Model File
No response
Is this a quantized model?
Yes