MaxPool2d - investigate memory layout performance

While working on a fix for `test_large_max_pool_contig` (issue https://github.com/intel/torch-xpu-ops/issues/2366), we found that the memory layout of the input tensor was implicitly changed from `Contiguous` to `ChannelsLast`. This produced wrong output because of an indexing mismatch between the input and output tensors. A fix is proposed in https://github.com/intel/torch-xpu-ops/pull/2763. As a follow-up task, we should compare the performance of the different MaxPool2d kernel variants, in particular whether the channels-last kernel is faster than the general one.
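
To illustrate why an implicit layout change can lead to this kind of indexing mismatch, here is a minimal pure-Python sketch. It is not code from torch-xpu-ops: the helper names (`contiguous_strides`, `channels_last_strides`, `offset`) are hypothetical, and the tiny shape is chosen only for readability. The stride formulas are the ones PyTorch uses for 4-D tensors in `torch.contiguous_format` and `torch.channels_last`.

```python
def contiguous_strides(n, c, h, w):
    """Element strides of a dense NCHW (contiguous) tensor."""
    return (c * h * w, h * w, w, 1)


def channels_last_strides(n, c, h, w):
    """Element strides of the same logical NCHW shape stored as NHWC."""
    return (h * w * c, 1, w * c, c)


def offset(index, strides):
    """Linear offset of a logical (n, c, h, w) index for the given strides."""
    return sum(i * s for i, s in zip(index, strides))


shape = (1, 2, 2, 3)  # N, C, H, W (illustrative only)
n, c, h, w = shape
numel = n * c * h * w

# Fill a flat buffer in channels-last order. Each element's value is its
# contiguous offset, so it acts as a unique id for the logical index.
cl_buffer = [None] * numel
for ni in range(n):
    for ci in range(c):
        for hi in range(h):
            for wi in range(w):
                idx = (ni, ci, hi, wi)
                value = offset(idx, contiguous_strides(*shape))
                cl_buffer[offset(idx, channels_last_strides(*shape))] = value

idx = (0, 1, 0, 2)
off_contig = offset(idx, contiguous_strides(*shape))
off_cl = offset(idx, channels_last_strides(*shape))

expected = off_contig             # id of the element at idx
right_value = cl_buffer[off_cl]   # read with the matching strides
wrong_value = cl_buffer[off_contig]  # read as if the buffer were contiguous
```

If a kernel writes or reads one tensor with channels-last strides while the other side assumes contiguous strides, the same logical index resolves to different buffer positions, as `wrong_value` shows. The sketch says nothing about which kernel variant is faster; that still has to be measured on the device.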

CC: @EikanWang 

MaxPool2d - investigate memory layout performance #2766

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

MaxPool2d - investigate memory layout performance #2766

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions