[GPU] Add permute kernel for 4D X<->Y swap (order {0,1,3,2})#36315

Open
zlma7001 wants to merge 1 commit into openvinotoolkit:master from zlma7001:permute_xy

@zlma7001 zlma7001 commented Jun 9, 2026

Contributor

Introduces PermuteKernel_xy_swap, which transposes the last two axes of 4D bfyx tensors via a TILE_SIZE x TILE_SIZE SLM tile (TILE_SIZE in {32, 16}, WG = 16 x 16) to make both reads and writes coalesced. Supports F16/F32/INT8/UINT8/INT32 with fused ops on static, tile-aligned shapes; registered ahead of the reference permute kernel via FORCE_PRIORITY_2.
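
For readers unfamiliar with the tiling scheme, the following is a minimal CPU-side sketch of the idea in Python, not the kernel added by this PR. It assumes a flat buffer in bfyx layout and tile-aligned Y and X; the function name `permute_xy_swap_sketch` and the nested-list `tile_buf` standing in for the SLM tile are hypothetical and chosen only for illustration. Each tile is read row by row from the input and written row by row to the output, which is the property the SLM staging is meant to provide on the GPU.

```python
TILE_SIZE = 16  # the description mentions 32 or 16; 16 keeps the sketch small


def permute_xy_swap_sketch(src, b, f, y, x, tile=TILE_SIZE):
    """Swap the last two axes of a flat bfyx buffer, one tile at a time.

    Hypothetical illustration only: input shape is [b, f, y, x],
    output shape is [b, f, x, y], both stored as flat Python lists.
    """
    if y % tile or x % tile:
        raise ValueError("this sketch handles tile-aligned shapes only")
    dst = [None] * len(src)
    for bf in range(b * f):
        base = bf * y * x  # start of the current (batch, feature) plane
        for ty in range(0, y, tile):
            for tx in range(0, x, tile):
                # Stage 1: contiguous (row-wise) reads from the input plane
                # into a local tile, standing in for the SLM buffer.
                tile_buf = [
                    [src[base + (ty + i) * x + (tx + j)] for j in range(tile)]
                    for i in range(tile)
                ]
                # Stage 2: contiguous (row-wise) writes to the output plane;
                # the transposition happens when indexing the local tile.
                for j in range(tile):
                    for i in range(tile):
                        dst[base + (tx + j) * y + (ty + i)] = tile_buf[i][j]
    return dst
```

With a 2 x 2 tile, a 1 x 1 x 2 x 4 input `[0, 1, 2, 3, 4, 5, 6, 7]` becomes `[0, 4, 1, 5, 2, 6, 3, 7]`, i.e. the 4 x 2 transpose of the original plane.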

Tickets:

CVS-187336

@zlma7001 zlma7001 requested review from a team as code owners June 9, 2026 06:17
@github-actions github-actions Bot added the category: GPU OpenVINO GPU plugin label Jun 9, 2026
@sys-openvino-ci sys-openvino-ci added the ExternalIntelPR External contributor from Intel label Jun 9, 2026
@p-durandin

Contributor

build_jenkins

3 participants