[FORK][FEATURE][x64] 3D int4 reorders for FC layout#292

Merged
maxnick merged 2 commits into v3.8_for_ie_master from mkutakov/int4_3d_fc_reorder
Nov 5, 2025

Conversation

Collaborator

maxnick commented Oct 22, 2025

Description

This PR adds the missing 3D FC-related reorders for 4-bit data types.

OpenVINO PR: openvinotoolkit/openvino#32450

github-merge-queue bot pushed a commit to openvinotoolkit/openvino that referenced this pull request Nov 5, 2025
### Details:
In this PR we introduce a new operation, "GatherMatmul", which
essentially performs gemv operations over the current tokens and the
active experts.
As a first step, we perform the gemv operation using
dnnl::inner_product. This solution is suboptimal, however: it does not
give fine-grained control over parallelization, and when many tokens are
processed by a specific expert (prefill), a gemm operation may be more
efficient, because the tokens can be batched and SIMD-level
parallelization can be applied across tokens as well.
This PR also contains the essential transformations needed to enable a
few common MoE patterns.
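
To illustrate the trade-off described above, here is a minimal pure-Python sketch, not the actual OpenVINO or oneDNN implementation. It contrasts one gemv per (token, active expert) pair with grouping tokens by expert so that each expert runs a single gemm over its batch. All names (`gemv`, `gemm`, `gather_matmul_gemv`, `gather_matmul_batched`, `routing`) are hypothetical, and the sketch ignores int4 weights, decompression, and memory layouts.

```python
from collections import defaultdict


def gemv(weight, x):
    """Matrix-vector product: weight is [out][in], x is [in]."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def gemm(batch, weight):
    """Batched product: batch is [n][in], weight is [out][in] -> [n][out]."""
    return [gemv(weight, x) for x in batch]


def gather_matmul_gemv(tokens, expert_weights, routing):
    """One gemv per (token, active expert) pair.

    routing[t] lists the expert indices active for token t (an assumption
    about how routing is represented, made for this sketch only).
    """
    return [
        [gemv(expert_weights[e], tokens[t]) for e in routing[t]]
        for t in range(len(tokens))
    ]


def gather_matmul_batched(tokens, expert_weights, routing):
    """Group tokens by expert, then run one gemm per expert."""
    groups = defaultdict(list)  # expert index -> [(token index, slot index)]
    for t, experts in enumerate(routing):
        for k, e in enumerate(experts):
            groups[e].append((t, k))

    out = [[None] * len(experts) for experts in routing]
    for e, slots in groups.items():
        batch = [tokens[t] for t, _ in slots]
        results = gemm(batch, expert_weights[e])
        # Scatter each batched result back to its (token, slot) position.
        for (t, k), y in zip(slots, results):
            out[t][k] = y
    return out


tokens = [[1, 2], [3, 4]]
expert_weights = [
    [[1, 0], [0, 1]],   # expert 0
    [[2, 0], [0, 2]],   # expert 1
    [[1, 1], [1, -1]],  # expert 2
]
routing = [[0, 2], [1, 2]]  # token 0 -> experts 0 and 2; token 1 -> experts 1 and 2

per_pair = gather_matmul_gemv(tokens, expert_weights, routing)
batched = gather_matmul_batched(tokens, expert_weights, routing)
```

Both functions produce the same values; the batched form only changes how the work is grouped, which is what would let a real kernel parallelize across tokens for a given expert.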

The MoE pattern matcher is based on
#32183

Related oneDNN fork PR:
openvinotoolkit/oneDNN#292

### Tickets:
 - CVS-171910

---------

Co-authored-by: Vladislav Golubev <vladislav.golubev@intel.com>
@maxnick maxnick merged commit a4ed4a7 into v3.8_for_ie_master Nov 5, 2025
6 of 12 checks passed