Describe the issue
Description
We observed a performance regression in the Hardmax operator with an explicit axis=1 configuration between ONNX Runtime v1.18.0 and v1.19.0. The regression depends on the axis configuration: the default-axis test cases show either no regression or an improvement.
Affected Operator
Hardmax
- Opset Version: 13
- Data Type: float32
- Configuration: explicit axis=1
- Regression: +8.85% kernel slowdown (+14.42% total time)
Test Case Details
Test Case: hardmax_13_v3_test_hardmax_float32_axis1
Input:
- input tensor:
- Data type: float32
- Shape: [2, 64, 56, 56]
- Total elements: 401,408
Attributes:
- axis: 1 (explicit, non-last axis)
Output:
- Data type: float32
- Shape: [2, 64, 56, 56]
Operation:
Computes hardmax (one-hot encoding of argmax) along axis 1.
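For reference, the operation can be sketched in NumPy as below. This is an illustrative reference only, not the ONNX Runtime kernel; the function name `hardmax_reference` and the random input are assumptions made for the example. It follows the opset-13 definition, where the first maximum along the axis is set to 1 and everything else to 0.

```python
import numpy as np

def hardmax_reference(x, axis=-1):
    """Hypothetical NumPy reference: one-hot encoding of argmax along `axis`."""
    # np.argmax returns the first occurrence on ties, matching the
    # "first maximum value" wording of the opset-13 definition.
    idx = np.argmax(x, axis=axis)
    out = np.zeros_like(x)
    # Write a 1 at the argmax position along the chosen axis.
    np.put_along_axis(out, np.expand_dims(idx, axis), 1, axis=axis)
    return out

# Same shape, dtype, and axis as the regressed test case (values are random).
x = np.random.default_rng(0).standard_normal((2, 64, 56, 56)).astype(np.float32)
y = hardmax_reference(x, axis=1)
print(y.shape, y.dtype)
```

Each of the 2 x 56 x 56 positions reduces over the 64 entries of axis 1, so the output contains exactly one 1 per position.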
Performance:
- v1.18.0: 17.05 ms (kernel time)
- v1.19.0: 18.56 ms (kernel time)
- Kernel regression: +8.85% slowdown
- Total time regression: +14.42% slowdown
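As a quick check, the kernel slowdown can be recomputed from the two rounded timings above. The helper name `percent_change` is an assumption for this sketch. Using the rounded values gives roughly 8.86%, so the reported +8.85% was presumably computed from unrounded measurements.

```python
def percent_change(old_ms, new_ms):
    """Relative change from old_ms to new_ms, in percent (positive = slower)."""
    return (new_ms - old_ms) / old_ms * 100.0

# Kernel times reported for v1.18.0 and v1.19.0.
kernel_change = percent_change(17.05, 18.56)
print(f"{kernel_change:+.2f}%")
```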
Regression Characteristics
Axis-Specific Regression
REGRESSED (explicit axis=1):
hardmax_13_v3_test_hardmax_float32_axis1: +8.85% slowdown
- Shape: [2, 64, 56, 56], axis=1
REGRESSED (negative axis):
hardmax_13_v3_test_hardmax_float32_negative_axis: +6.24% slowdown
NOT REGRESSED (default axis):
hardmax_13_v3_test_hardmax_basic_float32_default_axis: +0.12% (stable)
- Shape: [2, 3, 32, 32], default axis
IMPROVED (default axis, larger tensor):
hardmax_hardmax_13_hardmax_default_axis_float32_4d: -65.33% (improvement)
- Shape: [2, 64, 28, 28], default axis
To reproduce
- Download zip file
- Run the benchmark using the provided script:
```bash
python script_profiling.py hardmax_13_v3_test_hardmax_float32_axis1 1.18.0 1.19.0
```
Urgency
No response
Platform
Linux
OS Version
Ubuntu 24.04.3 LTS
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.19
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
Model File
No response
Is this a quantized model?
Yes