
zero copy permute opt for conv #33827

Open
AshutoshSinghIntel wants to merge 2 commits into openvinotoolkit:master from AshutoshSinghIntel:feature/optimize-permute-conv-pattern
AshutoshSinghIntel commented Jan 27, 2026

Summary:

  • This change optimizes the Permute -> Convolution subgraph that is typical in vision transformers. Aligning the Permute output with the Convolution's channels-last format gives two performance wins:
  1. Elimination of Reorder nodes: expensive data-copy operations are removed.
  2. Zero-copy "virtual transpose": physical data movement is converted into a logical layout cast.

Implementation Details:

  • Currently, a Reorder is added to convert the planar format to the blocked format for the Convolution.
  • The existing Permute -> Convolution optimization optimizes out the Permute operation and leaves the physical data in byxf. Since the input and output physical layouts are identical, the Permute becomes a zero-copy layout cast.
  • Enforcing the byxf format for the Convolution eliminates the Reorder node and improves performance.
  • For more details, please refer to CVS-177918.

Graph:

image

Performance Improvement:

image

Test details:

  • EBGAN TF FP32 showed ~50% degradation; it needs a rerun to rule out a temporary issue with the machine during the run. EBGAN TF FP16 did not show such a significant degradation.
  • The geomean for the rest of the models is fine.
  • The full validation benchmark report is here
image

Tickets:

github-actions bot added the "category: GPU" (OpenVINO GPU plugin) label Jan 27, 2026
sys-openvino-ci added the "ExternalPR" (External contributor) label Jan 27, 2026
AshutoshSinghIntel marked this pull request as ready for review January 27, 2026 15:54
AshutoshSinghIntel requested review from a team as code owners January 27, 2026 15:54
AshutoshSinghIntel marked this pull request as draft January 27, 2026 15:59
AshutoshSinghIntel marked this pull request as ready for review February 5, 2026 11:55
AshutoshSinghIntel marked this pull request as draft February 5, 2026 11:57
AshutoshSinghIntel marked this pull request as ready for review February 12, 2026 18:52
@AshutoshSinghIntel

build_jenkins

p-durandin added this to the 2026.1 milestone Feb 19, 2026

// Dependency must be a Permute node (not network output)
if (!dep.is_type<permute>() || dep.is_output())
    return;
// if (dep.get_users().size() != 2)

Please check that the node has only one user, to avoid breaking data for other nodes.


Hi @p-durandin,
In this case the dependency node, pnode (the permute), has two users: permute -> conv and permute -> permute,
while node (the conv) has one user: conv -> permute.

For which node do you want to add the check?
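
To make the question concrete, here is a toy model of the topology described in the reply. It is only a sketch: `ToyNode`, `has_single_user`, and `check_described_topology` are hypothetical names and do not mirror the plugin's real node classes. It also assumes that the permute consuming the conv output and the second user of pnode are distinct nodes, which the thread does not state.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph node; the real node type in the GPU
// plugin carries far more state.
struct ToyNode {
    std::string type;
    std::vector<const ToyNode*> users;
};

// The kind of guard the review comment asks about.
inline bool has_single_user(const ToyNode& node) {
    return node.users.size() == 1;
}

struct GuardResult {
    bool permute_passes;  // guard applied to the dependency (pnode)
    bool conv_passes;     // guard applied to the conv node
};

// Builds permute -> conv, permute -> permute, conv -> permute and reports
// which node would pass a strict single-user check.
inline GuardResult check_described_topology() {
    ToyNode tail_permute{"permute", {}};               // user of the conv
    ToyNode side_permute{"permute", {}};               // second user of pnode (assumed distinct)
    ToyNode conv{"convolution", {&tail_permute}};
    ToyNode pnode{"permute", {&conv, &side_permute}};
    return GuardResult{has_single_user(pnode), has_single_user(conv)};
}
```

In this model a strict single-user check on the dependency permute fails (it has two users), while the same check on the conv passes, so the choice of node decides whether the optimization still applies to the pattern in this PR.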
