[Performance] Performance regression in Div operator for scalar-to-array int64 broadcast between v1.18.0 and v1.19.0 #27182

@junghyunpark2001

Describe the issue

Description

We observed a performance regression in the Div operator for scalar-to-array division with the int64 data type between ONNX Runtime v1.18.0 and v1.19.0. The regression is specific to the int64 scalar-broadcast configuration.

Affected Operator

Div

  • Opset Version: 14
  • Data Type: int64
  • Configuration: Scalar [1] broadcasting to array [16, 32]
  • Regression: +23.0% kernel slowdown

Test Case Details

Test Case: div_div_14_div_scalar_int64

Inputs:

  • A tensor (dividend):

    • Data type: int64 (type=7)
    • Shape: [1] (scalar)
  • B tensor (divisor):

    • Data type: int64 (type=7)
    • Shape: [16, 32] (512 elements)

Output:

  • Data type: int64
  • Shape: [16, 32] (broadcast from scalar)
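For readers without the archive, the configuration above can be sketched as a one-node model. This is an illustration, not the test case shipped in Archive.zip: it assumes the `onnx`, `onnxruntime`, and `numpy` packages are installed, and the tensor names (`A`, `B`, `C`), the input values, and the function names are made up for the example.

```python
# Sketch only: rebuilds a model matching the shapes listed above.
# Tensor names, input values, and function names are illustrative assumptions.

A_SHAPE = [1]        # dividend, broadcast across B
B_SHAPE = [16, 32]   # divisor, 512 elements


def element_count(shape):
    """Number of elements in a tensor of the given shape."""
    n = 1
    for dim in shape:
        n *= dim
    return n


def build_div_model():
    """Return the serialized bytes of a one-node Div model (opset 14, int64)."""
    from onnx import TensorProto, helper

    node = helper.make_node("Div", inputs=["A", "B"], outputs=["C"])
    graph = helper.make_graph(
        [node],
        "div_scalar_int64",
        inputs=[
            helper.make_tensor_value_info("A", TensorProto.INT64, A_SHAPE),
            helper.make_tensor_value_info("B", TensorProto.INT64, B_SHAPE),
        ],
        outputs=[helper.make_tensor_value_info("C", TensorProto.INT64, B_SHAPE)],
    )
    model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 14)])
    model.ir_version = 8  # conservative choice so older runtimes accept the model
    return model.SerializeToString()


def run_once(model_bytes):
    """Run the model on the default CPU provider and return the output array."""
    import numpy as np
    import onnxruntime as ort

    sess = ort.InferenceSession(model_bytes, providers=["CPUExecutionProvider"])
    feeds = {
        "A": np.array([1000], dtype=np.int64),
        # Divisor values start at 1 to avoid division by zero.
        "B": np.arange(1, element_count(B_SHAPE) + 1, dtype=np.int64).reshape(B_SHAPE),
    }
    return sess.run(None, feeds)[0]
```

Calling `run_once(build_div_model())` under each installed ONNX Runtime version should exercise the same Div kernel path as the reported test case, assuming the archive's model has the same structure.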

Performance:

  • v1.18.0: 0.0027 ms (kernel time)
  • v1.19.0: 0.0033 ms (kernel time)
  • Kernel regression: +23.0% slowdown
  • Confirmation: 4/10 validation runs confirmed
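As a quick check on the figures above: the two rounded kernel times give a slowdown of roughly 22.2%, slightly below the reported +23.0%. The sketch below assumes the 23.0% was computed from unrounded timings, and shows that it falls inside the range allowed by rounding each value to four decimal places. The function names are illustrative.

```python
def slowdown_percent(old_ms, new_ms):
    """Relative slowdown of new_ms versus old_ms, in percent."""
    return (new_ms - old_ms) / old_ms * 100.0


def slowdown_bounds(old_ms, new_ms, decimals=4):
    """Smallest and largest slowdown consistent with values rounded to `decimals` places."""
    half = 0.5 * 10 ** (-decimals)  # maximum rounding error on each timing
    low = slowdown_percent(old_ms + half, new_ms - half)
    high = slowdown_percent(old_ms - half, new_ms + half)
    return low, high


point = slowdown_percent(0.0027, 0.0033)
low, high = slowdown_bounds(0.0027, 0.0033)
print(f"point estimate: {point:.1f}%, rounding range: {low:.1f}% to {high:.1f}%")
```

The difference between the two timings is about 0.6 microseconds, so run-to-run noise may matter here, which would be consistent with only 4 of 10 validation runs confirming the regression.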

Regression Characteristics

Affected Configuration (Confirmed)

  • Data type: int64
  • Broadcast pattern: Scalar to array
  • Input shapes: [1] / [16, 32]

Key Characteristics

  • Type-specific: int64 scalar broadcast affected
  • Broadcast pattern: Scalar-to-array division
  • Opset version: 14
  • Confirmed: 4 of 10 validation runs confirmed the regression

To reproduce

  1. Download the attached zip file:

Archive.zip

  2. Run the benchmark using the provided script:
    python script_profiling.py div_div_14_div_scalar_int64 1.18.0 1.19.0
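The contents of `script_profiling.py` are not shown here. As a hedged sketch of how a per-kernel time like the ones reported might be obtained, the code below assumes the built-in ONNX Runtime profiler was used (`SessionOptions.enable_profiling = True`, then `InferenceSession.end_profiling()` to get the trace file path). It also assumes the trace is a JSON array whose node events carry `cat`, `name`, `dur` (microseconds), and `args.op_name` fields; those field names reflect my understanding of the trace format and should be checked against an actual profile. The function names are illustrative.

```python
import json


def kernel_times_us(profile_events, op_name="Div"):
    """Collect kernel durations in microseconds for one operator type."""
    times = []
    for event in profile_events:
        if event.get("cat") != "Node":
            continue  # skip session-level and other non-node events
        if not event.get("name", "").endswith("_kernel_time"):
            continue  # skip non-kernel node events
        if event.get("args", {}).get("op_name") == op_name:
            times.append(event["dur"])
    return times


def mean_kernel_time_ms(profile_path, op_name="Div"):
    """Average kernel time in milliseconds across all matching events in a trace file."""
    with open(profile_path) as f:
        events = json.load(f)
    times = kernel_times_us(events, op_name)
    if not times:
        raise ValueError(f"no kernel events found for op {op_name!r}")
    return sum(times) / len(times) / 1000.0
```

Because the reported kernel times are only a few microseconds, averaging over many runs per version is likely needed before comparing the two means.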

Urgency

No response

Platform

Linux

OS Version

Ubuntu 24.04.3 LTS

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.19.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

Yes

Labels

performance (issues related to performance regressions)
