Describe the issue
I'm using ORT with WebGPU for inference. After profiling I found that four AveragePool operations account for almost half of the model's total inference time.
Here is an example of the behavior: AvgPool inference time depends linearly on the kernel size.
| Kernel shape | Elements | Time |
| --- | --- | --- |
| (20, 32) | 640 | 0.2 ms |
| (40, 64) | 2560 (x4) | 0.8 ms (x4) |
| (80, 128) | 10240 (x16) | 3.2 ms (x16) |
The problem is an edge case: the kernel covers the whole spatial extent, so the AveragePool is literally a ReduceMean over dims=(2,3). I replaced the operation and got a giant speed boost.
Since AvgPool == ReduceMean in this situation, a special handler could be added for it; the graph rewrite I used is sketched below.
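For reference, this is roughly the rewrite I applied (a minimal sketch with the Python `onnx` package; file names are hypothetical, and it assumes an opset <= 17, where ReduceMean takes `axes` as an attribute rather than an input):

```python
import onnx
from onnx import helper

# Hypothetical file names; adjust to the real model.
model = onnx.load("model.onnx")
graph = model.graph

for i, node in enumerate(list(graph.node)):
    if node.op_type == "AveragePool":
        # Only valid when kernel_shape equals the input's spatial size
        # (no padding, default strides), i.e. the pool is a global average.
        replacement = helper.make_node(
            "ReduceMean",
            inputs=list(node.input),
            outputs=list(node.output),
            name=(node.name or f"avgpool_{i}") + "_as_reducemean",
            axes=[2, 3],  # spatial dims of an NCHW tensor
            keepdims=1,   # keep the 1x1 spatial dims, matching AvgPool's output
        )
        graph.node.remove(node)
        graph.node.insert(i, replacement)

onnx.save(model, "model_reducemean.onnx")
```

The rewrite is shape-preserving because `keepdims=1` leaves the 1x1 spatial dims in place, so downstream nodes are unaffected.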
To reproduce
Create an AveragePool node with a large kernel_shape; a minimal model is sketched below.
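A minimal repro model (a sketch with the Python `onnx` helper; shapes mirror the largest case in the table above, and the channel count is arbitrary):

```python
import onnx
from onnx import TensorProto, helper

# Input/output shapes: the kernel covers the whole 80x128 spatial extent,
# i.e. exactly the global-average edge case described above.
inp = helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 32, 80, 128])
out = helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 32, 1, 1])

pool = helper.make_node("AveragePool", ["x"], ["y"], kernel_shape=[80, 128])

graph = helper.make_graph([pool], "big_avgpool", [inp], [out])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 17)])
onnx.checker.check_model(model)
onnx.save(model, "big_avgpool.onnx")
```

Running this model with onnxruntime-web on the WebGPU execution provider and profiling should show the AveragePool time growing with kernel area, as in the table above.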
Urgency
Not urgent; a workaround was found.
Platform
Mac
OS Version
15.2
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.20.1
ONNX Runtime API
JavaScript
Architecture
ARM64
Execution Provider
Other / Unknown
Execution Provider Library Version
No response
Model File
No response
Is this a quantized model?
No