Aanuf/sdpa v fp8 #3485

Open
wants to merge 22 commits into
base: develop
from

Conversation

andreyanufr
Collaborator

@andreyanufr andreyanufr commented May 8, 2025

Changes

FakeConvert for the V tensor of the SDPA layer in the case of FP8 quantization, for NPU performance.
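
For readers unfamiliar with the idea, here is a minimal pure-Python sketch of what fake-converting the V input of scaled dot-product attention to FP8 could look like numerically. It is not the code from this PR and not the OpenVINO operation itself: the helper names (`fake_convert_e4m3`, `sdpa`, `sdpa_fp8_v`) are hypothetical, the e4m3 format with a maximum of 448 is assumed, no scale or shift is applied, and saturating out-of-range values is an assumption.

```python
import math

E4M3_MAX = 448.0      # largest finite value in the assumed e4m3 format
E4M3_MIN_EXP = -6     # exponent of the smallest normal e4m3 value
E4M3_MANT_BITS = 3    # explicit mantissa bits


def fake_convert_e4m3(x: float) -> float:
    """Round x to the nearest e4m3-representable value, kept as a Python float.

    Assumption: out-of-range inputs saturate at +/-448 instead of overflowing.
    """
    x = max(-E4M3_MAX, min(E4M3_MAX, x))
    _, e = math.frexp(abs(x))              # abs(x) == m * 2**e with 0.5 <= m < 1
    exp = max(e - 1, E4M3_MIN_EXP)         # subnormals all share the smallest step
    step = math.ldexp(1.0, exp - E4M3_MANT_BITS)
    return round(x / step) * step          # round() rounds half to even


def sdpa(q, k, v):
    """softmax(q @ k^T / sqrt(d)) @ v, with q, k, v given as lists of row vectors."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        top = max(scores)                  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w / total * vj[c] for w, vj in zip(weights, v))
            for c in range(len(v[0]))
        ])
    return out


def sdpa_fp8_v(q, k, v):
    """Hypothetical helper: SDPA with only the V input fake-converted to e4m3."""
    v_fc = [[fake_convert_e4m3(x) for x in row] for row in v]
    return sdpa(q, k, v_fc)
```

In this sketch only V loses precision, while Q, K and the softmax weights stay in full precision, which mirrors the scope described above.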

Reason for changes

Related tickets

CVS-166427

Tests

In progress.

@github-actions github-actions bot added NNCF PT Pull requests that updates NNCF PyTorch NNCF Common Pull request that updates NNCF Common NNCF OpenVINO Pull requests that updates NNCF OpenVINO NNCF PTQ Pull requests that updates NNCF PTQ labels May 8, 2025
@andreyanufr andreyanufr marked this pull request as ready for review May 8, 2025 14:01
@andreyanufr andreyanufr requested a review from a team as a code owner May 8, 2025 14:01
@@ -333,6 +333,7 @@ def __init__(
post_processing_marker_metatypes: Optional[list[type[OperatorMetatype]]] = None,
metatypes_to_ignore: Optional[list[type[OperatorMetatype]]] = None,
scales_unification_map: Optional[dict[type[OperatorMetatype], list[type[OperatorMetatype]]]] = None,
is_fp8: bool = False,
Contributor

@andreyanufr, @AlexanderDokuchaev, please suggest how to avoid passing the is_fp8 parameter to the solver.

Contributor

@alexsu52 alexsu52 left a comment

As we discussed offline, please add support for attention subgraphs without SDPA by disabling the ignored patterns. cc @xiao1228

Labels
NNCF Common Pull request that updates NNCF Common NNCF OpenVINO Pull requests that updates NNCF OpenVINO NNCF PT Pull requests that updates NNCF PyTorch NNCF PTQ Pull requests that updates NNCF PTQ
Projects
None yet
2 participants