[GPU] Add permute kernel for 4D X<->Y swap (order {0,1,3,2}) by zlma7001 · Pull Request #36315 · openvinotoolkit/openvino

zlma7001 · 2026-06-09T06:17:41Z

Introduces PermuteKernel_xy_swap, which transposes the last two axes of 4D bfyx tensors via a TILE_SIZE x TILE_SIZE SLM tile (TILE_SIZE in {32, 16}, WG = 16 x 16) to make both reads and writes coalesced. Supports F16/F32/INT8/UINT8/INT32 with fused ops on static, tile-aligned shapes; registered ahead of the reference permute kernel via FORCE_PRIORITY_2.
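
For readers unfamiliar with the tiling idea, here is a minimal CPU-side sketch of it in Python. It is not the kernel code from this PR. The function name `permute_xy_tiled`, its parameters, and the flat-list buffer layout are hypothetical and chosen only for illustration; the `local` list stands in for the SLM tile. Like the kernel's static, tile-aligned constraint, the sketch assumes Y and X are multiples of the tile size.

```python
def permute_xy_tiled(src, b, f, y, x, tile=16):
    """Swap the last two axes of a flat bfyx buffer (order {0,1,3,2}).

    Illustrative sketch only: each tile x tile block is staged in a small
    local buffer (a stand-in for SLM), so loads walk the input along X and
    stores walk the output along Y, its innermost axis after the swap.
    """
    if y % tile or x % tile:
        raise ValueError("sketch assumes tile-aligned Y and X")
    dst = [None] * len(src)
    for bf in range(b * f):
        base = bf * y * x  # offset of this (batch, feature) plane
        for ty in range(0, y, tile):
            for tx in range(0, x, tile):
                # Load phase: read tile rows, contiguous along input X.
                local = [
                    [src[base + (ty + i) * x + (tx + j)] for j in range(tile)]
                    for i in range(tile)
                ]
                # Store phase: write transposed rows, contiguous along
                # output Y (the output plane has shape [x, y]).
                for j in range(tile):
                    for i in range(tile):
                        dst[base + (tx + j) * y + (ty + i)] = local[i][j]
    return dst
```

On a GPU, the load and store phases would be separated by a work-group barrier, and each work-item would handle one or more elements of the tile rather than looping over it.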

Tickets:

CVS-187336


p-durandin · 2026-06-09T11:14:53Z

build_jenkins

zlma7001 requested review from a team as code owners June 9, 2026 06:17

github-actions Bot added the category: GPU OpenVINO GPU plugin label Jun 9, 2026

sys-openvino-ci added the ExternalIntelPR External contributor from Intel label Jun 9, 2026

zlma7001 wants to merge 1 commit into openvinotoolkit:master from zlma7001:permute_xy
