Keep dequantization subgraph output as inference precision for GPU plugin #30685

Open
wants to merge 4 commits into master

Conversation

z71258847

Details:

  • Insert a Convert op as the subgraph endpoint for identified u16, i32, and u32 dequantization subgraphs, so that constant folding still computes in fp32 but the folded result is emitted in the current inference precision (fp16) for the GPU plugin.
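The motivation can be illustrated numerically. Below is a hypothetical sketch (not the PR's actual transformation code) of why the dequantization arithmetic should be folded in fp32 and only the final value converted to fp16: quantized values near fp16's rounding granularity cancel out entirely if the whole subgraph is evaluated in fp16.

```python
import numpy as np

# Hypothetical u16 dequantization: value = (q - zero_point) * scale.
# Near 40000, fp16 spacing is 32, so q and zero_point both round to
# the same fp16 value and their difference vanishes.
q, zero_point, scale = 40001, 40000, 0.5

# Entire subgraph evaluated in fp16 (the failure mode being avoided):
fp16_path = (np.float16(q) - np.float16(zero_point)) * np.float16(scale)

# Constant folding in fp32, then a trailing Convert to fp16
# (the effect of inserting the Convert endpoint):
fp32_path = np.float16((np.float32(q) - np.float32(zero_point)) * np.float32(scale))

print(fp16_path)  # 0.0 -- the dequantized value is lost
print(fp32_path)  # 0.5 -- exact result, and representable in fp16
```

The trailing Convert keeps the intermediate subtraction and multiplication at fp32 precision, while the network downstream still sees an fp16 tensor.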

Tickets:

@z71258847 z71258847 requested a review from a team as a code owner May 23, 2025 05:08
@z71258847 z71258847 requested review from itikhono and removed request for a team May 23, 2025 05:08
@github-actions github-actions bot added the category: transformations OpenVINO Runtime library - Transformations label May 23, 2025
@sys-openvino-ci sys-openvino-ci added the ExternalIntelPR External contributor from Intel label May 23, 2025
@sshlyapn
Contributor

@z71258847 please move this extra logic to GPU-specific transformations

@z71258847 z71258847 closed this May 23, 2025
@z71258847 z71258847 reopened this May 23, 2025
@z71258847 z71258847 requested review from a team as code owners May 23, 2025 08:11
@github-actions github-actions bot added category: GPU OpenVINO GPU plugin and removed category: transformations OpenVINO Runtime library - Transformations labels May 23, 2025
@p-durandin
Contributor

build_jenkins

Labels: category: GPU (OpenVINO GPU plugin), ExternalIntelPR (External contributor from Intel)

4 participants