
[Performance] Performance regression in Hardmax operator with axis=1 between v1.18.0 and v1.19.0 #27173

@junghyunpark2001

Describe the issue

Description

We observed a performance regression in the Hardmax operator between ONNX Runtime v1.18.0 and v1.19.0 when the axis is set explicitly to axis=1. The regression depends on the axis configuration: the explicit axis=1 and negative-axis test cases are slower in v1.19.0, while the default-axis test cases are either stable or faster.

Affected Operator

Hardmax

  • Opset Version: 13
  • Data Type: float32
  • Configuration: explicit axis=1
  • Regression: +8.85% kernel slowdown (+14.42% total time)

Test Case Details

Test Case: hardmax_13_v3_test_hardmax_float32_axis1

Input:

  • input tensor:
    • Data type: float32
    • Shape: [2, 64, 56, 56]
    • Total elements: 401,408

Attributes:

  • axis: 1 (explicit, non-last axis)

Output:

  • Data type: float32
  • Shape: [2, 64, 56, 56]

Operation:
Computes hardmax (one-hot encoding of argmax) along axis 1.
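
For reference, the operation described above can be sketched in NumPy. This is a minimal illustration of the Hardmax semantics (1 at the first maximum along the axis, 0 elsewhere), not the ONNX Runtime kernel; the function name `hardmax` and the random input are assumptions made for the example.

```python
import numpy as np


def hardmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Reference sketch: one-hot encoding of argmax along `axis`."""
    # np.argmax returns the first index on ties, matching "first maximum".
    idx = np.argmax(x, axis=axis)
    out = np.zeros_like(x)
    # Write 1 at the argmax position along the chosen axis.
    np.put_along_axis(out, np.expand_dims(idx, axis), 1, axis=axis)
    return out


# Same shape, dtype, and axis as the reported test case (input values are arbitrary).
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 64, 56, 56)).astype(np.float32)
y = hardmax(x, axis=1)
print(y.shape, y.dtype)
```

With axis=1 on a [2, 64, 56, 56] tensor, the reduction runs over the 64-element dimension, which is not the innermost one in memory; the default axis reduces over the last dimension instead.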

Performance:

  • v1.18.0: 17.05 ms (kernel time)
  • v1.19.0: 18.56 ms (kernel time)
  • Kernel regression: +8.85% slowdown
  • Total time regression: +14.42% slowdown
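
The kernel slowdown can be cross-checked from the two timings listed above. The helper name `slowdown_pct` is hypothetical and is not part of the attached script. With the rounded values shown here the result is about 8.86%, so the reported +8.85% was presumably computed from unrounded measurements.

```python
def slowdown_pct(baseline_ms: float, candidate_ms: float) -> float:
    """Relative change of candidate vs. baseline, in percent (positive = slower)."""
    return (candidate_ms - baseline_ms) / baseline_ms * 100.0


# Kernel times reported for v1.18.0 and v1.19.0.
kernel_regression = slowdown_pct(17.05, 18.56)
print(f"{kernel_regression:+.2f}%")
```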

Regression Characteristics

Axis-Specific Regression

REGRESSED (explicit axis=1):

  • hardmax_13_v3_test_hardmax_float32_axis1: +8.85% slowdown
    • Shape: [2, 64, 56, 56], axis=1

REGRESSED (negative axis):

  • hardmax_13_v3_test_hardmax_float32_negative_axis: +6.24% slowdown

NOT REGRESSED (default axis):

  • hardmax_13_v3_test_hardmax_basic_float32_default_axis: +0.12% (stable)
    • Shape: [2, 3, 32, 32], default axis

IMPROVED (default axis, larger tensor):

  • hardmax_hardmax_13_hardmax_default_axis_float32_4d: -65.33% improvement
    • Shape: [2, 64, 28, 28], default axis

To reproduce

  1. Download the attached zip file: Archive.zip

  2. Run the benchmark using the provided script:

```bash
python script_profiling.py hardmax_13_v3_test_hardmax_float32_axis1 1.18.0 1.19.0
```

Urgency

No response

Platform

Linux

OS Version

Ubuntu 24.04.3 LTS

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.19

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

Yes


Labels

performance (issues related to performance regressions)
