[NPUW] gemma-2 patterns added to preserve tail constants matcher#32465
dmatveev merged 23 commits into openvinotoolkit:master
src/plugins/intel_npu/src/plugin/npuw/partitioning/patterns/opt.cpp
This PR will be closed in a week because of 2 weeks of no activity.
This PR was closed because it has been stalled for 2 weeks with no activity.
src/plugins/intel_npu/src/plugin/npuw/partitioning/patterns/opt.cpp
    if (!m_use_chunk_prefill) {
        // TODO: sometimes it is ok if we cannot find any empty inputs or not?
        NPUW_ASSERT(remove_empty_kv_inputs(prefill_model));
        remove_empty_kv_inputs(prefill_model);
It is especially needed, and is a leftover from the fp8-cb4 work.
@AsyaPronina indeed, I found the source of the problem: the fp8 patterns we use for the redirect also need to be checked when empty KV inputs are removed, together with their concats etc., in prefill. Please have a look at the actual implementation.
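The two variants of the call in the snippet above differ only in whether a "nothing removed" result is treated as fatal. A minimal sketch of that trade-off follows. It assumes the pass reports, through a bool, whether it changed anything (the original call being wrapped in NPUW_ASSERT suggests this, but the real signature is not shown here). `Input` and `remove_empty_inputs` are hypothetical stand-ins, not NPUW code.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a model input; not an OpenVINO type.
struct Input {
    std::string name;
    std::size_t kv_len;  // 0 marks an "empty" KV input in this sketch
};

// Erases every input with kv_len == 0.
// Returns true if at least one input was removed, false otherwise.
bool remove_empty_inputs(std::vector<Input>& inputs) {
    const std::size_t old_size = inputs.size();
    inputs.erase(std::remove_if(inputs.begin(), inputs.end(),
                                [](const Input& in) { return in.kv_len == 0; }),
                 inputs.end());
    return inputs.size() != old_size;
}
```

A caller that asserts on the return value fails on models that have no empty inputs at all, while a caller that ignores it treats that case as a no-op. Which behaviour is right depends on whether an absence of empty inputs is ever legitimate, which is the question the TODO raises.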
        users.insert(users.end(), param_users.begin(), param_users.end());
    }

    // Remove duplicates (concat itself will appear in users)
Is users a set, so that duplicates are removed because of that?
@AsyaPronina I think we are not removing duplicates anyway. We are just searching for a ShapeOf user, and the concat of course appears among the users. The comment does not look very relevant.
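To illustrate the point of this exchange: appending with `insert` into a `std::vector` keeps duplicates, and a search for one kind of user works the same whether or not duplicates are present. The sketch below uses a hypothetical `Node` type and helper names that are not part of OpenVINO or NPUW. It only mirrors the shape of the reviewed code.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph node; not the OpenVINO node type.
struct Node {
    std::string type;  // e.g. "Concat", "ShapeOf"
};

// Appends the users of every producer into one flat list.
// A vector is not a set: a node used by two producers appears twice.
std::vector<const Node*> collect_users(const std::vector<std::vector<const Node*>>& per_producer) {
    std::vector<const Node*> users;
    for (const auto& group : per_producer) {
        users.insert(users.end(), group.begin(), group.end());
    }
    return users;
}

// Searching for a user of a given type is unaffected by duplicates.
bool has_user_of_type(const std::vector<const Node*>& users, const std::string& type) {
    return std::any_of(users.begin(), users.end(),
                       [&type](const Node* n) { return n->type == type; });
}

// Only needed if unique users are actually required: sort, then erase repeats.
void remove_duplicates(std::vector<const Node*>& users) {
    std::sort(users.begin(), users.end(), std::less<const Node*>());
    users.erase(std::unique(users.begin(), users.end()), users.end());
}
```

If the code only ever calls something like `has_user_of_type`, no deduplication step is needed, and a comment announcing one is misleading.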
    auto param = opp::wrap_type<ov::op::v0::Parameter>();
    auto concat = opp::wrap_type<ov::op::v0::Concat>({param, opp::any_input()});
    auto param_or =
        std::make_shared<opp::op::Or>(ov::OutputVector{param, match_down_up_convert_subgraph_after_lpt(param)});
726fdaf
Details:
Tickets: