Minor improvements to token_type_ids extension for PA #34661
p-wysocki wants to merge 33 commits into openvinotoolkit:master
Signed-off-by: p-wysocki <przemyslaw.wysocki@intel.com>
    // Shared flag to track whether the model is Gemma3, set when any layer matches
    // the gptoss_gemma3 sliding window pattern. Combined with the token_type_ids check,
    // this uniquely identifies Gemma3 (gpt-oss shares the pattern but lacks token_type_ids).
    auto is_gptoss_gemma3 = std::make_shared<bool>(false);
Can we define this variable inside the callback?
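To make the trade-off behind this question concrete, here is a minimal sketch in plain C++ with no OpenVINO types. The function names and the `layer_matches` input are hypothetical and only stand in for "the callback ran on a layer that did or did not match the pattern". It shows that a flag shared across invocations stays set once any layer matches, while a flag defined inside the callback reflects only the current layer.

```cpp
#include <functional>
#include <memory>
#include <vector>

// Hypothetical illustration: the flag lives outside the callback and is shared
// by every invocation, so it stays true after the first matching layer.
std::vector<bool> run_with_shared_flag(const std::vector<bool>& layer_matches) {
    auto flag = std::make_shared<bool>(false);
    std::vector<bool> seen;  // value of the flag as observed by each layer
    std::function<void(bool)> callback = [flag, &seen](bool matched) {
        if (matched) {
            *flag = true;
        }
        seen.push_back(*flag);
    };
    for (bool matched : layer_matches) {
        callback(matched);
    }
    return seen;
}

// Hypothetical illustration: the flag is defined inside the callback, so each
// invocation starts fresh and depends only on the current layer.
std::vector<bool> run_with_local_flag(const std::vector<bool>& layer_matches) {
    std::vector<bool> seen;
    std::function<void(bool)> callback = [&seen](bool matched) {
        bool flag = matched;
        seen.push_back(flag);
    };
    for (bool matched : layer_matches) {
        callback(matched);
    }
    return seen;
}
```

With the shared variant, what a layer observes depends on which layers were visited before it. Whether that is desirable here depends on how the transformation orders its matches, which this sketch does not model.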
    static std::shared_ptr<ov::Node> handle_gemma3_token_type_ids(
        const std::map<std::string, std::shared_ptr<v0::Parameter>>& optional_model_wide_params) {
        if (optional_model_wide_params.find("token_type_ids") != optional_model_wide_params.end()) {
I guess you don't need this check anymore here.
        sliding_window = std::make_shared<v1::Subtract>(v0::Constant::create(element::i32, Shape{}, {2}), offset);
    } else if (pattern_map.count(gptoss_gemma3_offset)) {
        *is_gptoss_gemma3 = true;
        is_gemma3 = optional_model_wide_params.count("token_type_ids");
In fact, any model with token_type_ids and a matching sliding window pattern will set this is_gemma3 flag to true, so why not simply name this variable has_token_type_ids?
Alternatively, set has_sliding_window here instead and use it below.
Also, is_gemma3 will currently be false for the causal mask case (no sliding window) within the same model.
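A minimal sketch of the naming suggested here, in plain C++ without OpenVINO types. `ParamMap`, `LayerFlags`, `compute_flags`, and `uses_token_type_ids` are hypothetical names introduced only for illustration; the real map holds `std::shared_ptr<v0::Parameter>` values. The point is that the two conditions are tracked separately instead of being folded into one model-specific flag.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for optional_model_wide_params; the value type is
// irrelevant here because only key presence is checked.
using ParamMap = std::map<std::string, int>;

// Two independent facts, named for what they check rather than for a model.
struct LayerFlags {
    bool has_token_type_ids;  // model-wide: the token_type_ids input exists
    bool has_sliding_window;  // per-layer: the sliding window pattern matched
};

LayerFlags compute_flags(const ParamMap& optional_model_wide_params, bool sliding_window_matched) {
    return LayerFlags{optional_model_wide_params.count("token_type_ids") > 0, sliding_window_matched};
}

// Mirrors the current behavior described in the comment above: token_type_ids
// is only consumed when both conditions hold.
bool uses_token_type_ids(const LayerFlags& flags) {
    return flags.has_token_type_ids && flags.has_sliding_window;
}
```

Keeping the flags separate also makes the causal mask case explicit: `has_token_type_ids` stays true for the model even on layers where `has_sliding_window` is false.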
    if (is_gemma3) {
        pa_arguments.insert(pa_arguments.begin() + 25, handle_gemma3_token_type_ids(optional_model_wide_params));
    } else {
        pa_arguments.insert(pa_arguments.begin() + 25, v0::Constant::create(element::i32, Shape{0}, {}));
The variable naming is tied to Gemma3, but it could be generic for any model where both has_token_type_ids and has_sliding_window are true.
It is currently applied to the sliding_window case only, but as a next step it could be extended to the causal case as well; this if/else would then reduce to a single case:
    pa_arguments.insert(pa_arguments.begin() + 25, handle_token_type_ids(optional_model_wide_params));
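One way the single-case call site could look, sketched in plain C++. `Node`, `NodePtr`, `ParamMap`, and `insert_token_type_ids` are hypothetical stand-ins for the OpenVINO types and are not part of the PR. The sketch assumes the empty-placeholder fallback moves into `handle_token_type_ids`; the real helper may do more than return the parameter.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for ov::Node; only a name is kept for illustration.
struct Node {
    std::string name;
};
using NodePtr = std::shared_ptr<Node>;
using ParamMap = std::map<std::string, NodePtr>;

// Assumption: the helper returns the token_type_ids input when present and an
// empty placeholder otherwise, so the caller needs no branch.
NodePtr handle_token_type_ids(const ParamMap& optional_model_wide_params) {
    auto it = optional_model_wide_params.find("token_type_ids");
    if (it != optional_model_wide_params.end()) {
        return it->second;
    }
    return std::make_shared<Node>(Node{"empty_placeholder"});
}

// Single insertion point; index must not exceed pa_arguments.size().
void insert_token_type_ids(std::vector<NodePtr>& pa_arguments,
                           const ParamMap& optional_model_wide_params,
                           std::size_t index) {
    pa_arguments.insert(pa_arguments.begin() + static_cast<std::ptrdiff_t>(index),
                        handle_token_type_ids(optional_model_wide_params));
}
```

Pushing the fallback into the helper keeps the argument position (25 in the PR) in one place, at the cost of hiding the "no token_type_ids" case from the call site.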
Suggested change:
-    if (is_gemma3) {
-        pa_arguments.insert(pa_arguments.begin() + 25, handle_gemma3_token_type_ids(optional_model_wide_params));
-    } else {
-        pa_arguments.insert(pa_arguments.begin() + 25, v0::Constant::create(element::i32, Shape{0}, {}));
+    if (has_sliding_window) {
+        pa_arguments.insert(pa_arguments.begin() + 25, handle_token_type_ids(optional_model_wide_params));
+    } else {
+        pa_arguments.insert(pa_arguments.begin() + 25, v0::Constant::create(element::i32, Shape{0}, {}));