
[GPU] Fix int8 cache in onednn convolution #30606


Open

davidsnam-intel wants to merge 5 commits into master from david/fix-convolution-onednn--int8-cache

Conversation

davidsnam-intel (Contributor)

Details:

  • Fixed garbage values that appeared when using the cache in the oneDNN convolution with the YOLO11 model.

Tickets:

  • 162501

@davidsnam-intel davidsnam-intel requested review from a team as code owners May 18, 2025 22:47
@github-actions github-actions bot added the category: GPU (OpenVINO GPU plugin) label May 18, 2025
```cpp
strides, dilates, padding_l, padding_r,
*_attrs.get());
_pd = *prim_desc;
dnnl::memory::desc bias_md = nullptr;
```
Contributor:

nit: `dnnl::memory::desc bias_md;` would be more explicit for a zero memory descriptor, according to the oneDNN API.
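
For reference, the two spellings side by side. The oneDNN C++ API documents a default-constructed `memory::desc` as a zero (empty) descriptor that indicates an absent argument; the `= nullptr` form presumably goes through the underlying handle constructor instead (an assumption about oneDNN 3.x internals, where `memory::desc` is a handle type):

```cpp
dnnl::memory::desc bias_md_suggested;          // documented zero md, i.e. "no bias"
dnnl::memory::desc bias_md_current = nullptr;  // form used in this PR
```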

davidsnam-intel (Contributor, Author) commented on May 19, 2025:

For convolution_onednn, there is a constructor A that does not take a bias parameter and a constructor B that does. A eventually calls B, and at that point it explicitly passes nullptr for the bias. Given that, what do you think of this approach?
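
A minimal sketch of the delegation being described (illustrative names and signatures, not the actual OpenVINO source):

```cpp
#include <oneapi/dnnl/dnnl.hpp>

// Hypothetical simplification of the pattern described above.
struct convolution_onednn {
    // B: full constructor that receives a bias descriptor.
    convolution_onednn(const dnnl::memory::desc& input_md,
                       const dnnl::memory::desc& weights_md,
                       const dnnl::memory::desc& bias_md,
                       const dnnl::memory::desc& output_md) {
        // ... builds the oneDNN convolution primitive descriptor ...
    }

    // A: bias-less constructor; it delegates to B and explicitly passes
    // nullptr in the bias slot, which is the line under discussion.
    convolution_onednn(const dnnl::memory::desc& input_md,
                       const dnnl::memory::desc& weights_md,
                       const dnnl::memory::desc& output_md)
        : convolution_onednn(input_md, weights_md,
                             dnnl::memory::desc(nullptr), output_md) {}
};
```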

Contributor:

My point is to replace `dnnl::memory::desc bias_md = nullptr;` with `dnnl::memory::desc bias_md;` for explicit use of the API. The two seem to be basically the same.

davidsnam-intel (Contributor, Author):

Actually, I understood and tried what you suggested. I couldn't prove exactly why, but the cache works properly only when nullptr is explicitly assigned. I didn't compare the memory state of the two variants with any specific tool, but I confirmed through many tests that the cache behaves correctly only with the explicit nullptr assignment.
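
Not a proof of the root cause, but a quick probe along these lines (assuming a oneDNN 3.x build, where `memory::desc` is a reference-counted handle) could show whether the two initializations even produce the same underlying object:

```cpp
#include <oneapi/dnnl/dnnl.hpp>
#include <iostream>

int main() {
    dnnl::memory::desc by_default;            // reviewer's suggested form
    dnnl::memory::desc by_nullptr = nullptr;  // form used in this PR

    // handle::get(true) returns the raw dnnl_memory_desc_t pointer without
    // throwing when it is null, so both variants can be inspected safely.
    std::cout << std::boolalpha
              << "default-constructed handle is null: "
              << (by_default.get(true) == nullptr) << "\n"
              << "nullptr-initialized handle is null: "
              << (by_nullptr.get(true) == nullptr) << "\n";
}
```

If the two differ at this level, anything that keys a cache on the descriptor (or serializes it) could legitimately see them as different.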

@p-durandin p-durandin added this to the 2025.2 milestone May 19, 2025
```cpp
dnnl::prop_kind::forward_inference, dnnl::algorithm::convolution_direct,
input_md, weights_md, bias_md, output_md,
strides, dilates, padding_l, padding_r,
*_attrs.get());
```
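
For context, a self-contained sketch of the call this hunk belongs to; the engine, shapes, and data types here are assumed for illustration and are not the OpenVINO code:

```cpp
#include <oneapi/dnnl/dnnl.hpp>

int main() {
    // CPU engine just to keep the example self-contained; the PR targets GPU.
    dnnl::engine engine(dnnl::engine::kind::cpu, 0);

    using dt = dnnl::memory::data_type;
    using tag = dnnl::memory::format_tag;

    // Assumed int8 shapes, purely for illustration; a given build may not
    // support this exact configuration.
    dnnl::memory::desc input_md({1, 8, 16, 16}, dt::s8, tag::any);
    dnnl::memory::desc weights_md({8, 8, 3, 3}, dt::s8, tag::any);
    dnnl::memory::desc output_md({1, 8, 16, 16}, dt::s8, tag::any);
    dnnl::memory::desc bias_md;  // zero md: convolution without bias

    dnnl::memory::dims strides{1, 1}, dilates{0, 0};
    dnnl::memory::dims padding_l{1, 1}, padding_r{1, 1};

    dnnl::primitive_attr attrs;
    auto prim_desc = dnnl::convolution_forward::primitive_desc(
            engine,
            dnnl::prop_kind::forward_inference,
            dnnl::algorithm::convolution_direct,
            input_md, weights_md, bias_md, output_md,
            strides, dilates, padding_l, padding_r,
            attrs);
    (void)prim_desc;
    return 0;
}
```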
Contributor:

The behavior seems exactly the same as below. Am I missing something?

Contributor:

I suspect the issue is merely hidden by a change in the compilation result, but the real root cause is still unknown.

@davidsnam-intel davidsnam-intel force-pushed the david/fix-convolution-onednn--int8-cache branch from 226b3f2 to 0278c1d May 20, 2025 22:35
@mlukasze mlukasze requested review from isanghao and yeonbok May 21, 2025 03:06
@geunhwan geunhwan removed this from the 2025.2 milestone May 22, 2025
Labels: category: GPU (OpenVINO GPU plugin)

5 participants