Hetero support continuous batching #30371

Open: wants to merge 13 commits into base: master

WeldonWangwang (Contributor) commented Apr 29, 2025

Details:

  • item1
  • ...

Tickets:

@github-actions bot added the category: Core, category: GPU, and category: HETERO labels Apr 29, 2025
@WeldonWangwang changed the title from "Wangwang/hetero support cb" to "hetero support continuous batching" Apr 29, 2025
@WeldonWangwang changed the title from "hetero support continuous batching" to "Hetero support continuous batching" Apr 29, 2025
@github-actions bot removed the category: Core label Apr 30, 2025

  jit.AddConstant(MakeJitConstant("REMAINDER_K", !k_full));
- jit.AddConstant(MakeJitConstant("KV_GROUP_SIZE", Q_num_heads_dim.v / K_num_heads_dim.v));
+ jit.AddConstant(MakeJitConstant("KV_GROUP_SIZE", 1));
Contributor: Does this mean HETERO does not support grouped attention?

Contributor Author: This is still being improved; it will be addressed through changes to the graph structure.

Contributor Author: Fixed.
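
For context on the question above, here is a minimal sketch of what a KV group size means in grouped attention. When a model has more query heads than key/value heads, each K/V head is shared by `q_heads / kv_heads` consecutive query heads, so a hard-coded value of 1 only matches the case where both counts are equal. The helper names `kv_group_size` and `kv_head_for` are hypothetical and are not taken from the OpenVINO sources; only the division mirrors the expression in the diff.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of query heads that share one K/V head.
// Mirrors the expression Q_num_heads_dim.v / K_num_heads_dim.v from the diff.
constexpr std::size_t kv_group_size(std::size_t q_heads, std::size_t kv_heads) {
    return q_heads / kv_heads;
}

// Hypothetical helper: index of the K/V head that a given query head reads.
constexpr std::size_t kv_head_for(std::size_t q_head, std::size_t group_size) {
    return q_head / group_size;
}

// 32 query heads over 8 K/V heads: groups of 4, so query head 5 uses K/V head 1.
static_assert(kv_group_size(32, 8) == 4, "grouped attention: 4 query heads per K/V head");
static_assert(kv_head_for(5, kv_group_size(32, 8)) == 1, "query head 5 maps to K/V head 1");

// Equal head counts: the group size degenerates to 1 (plain multi-head attention).
static_assert(kv_group_size(8, 8) == 1, "no sharing when head counts match");
```

With a group size fixed at 1, `kv_head_for(5, 1)` would return 5, which is out of range for a model that has only 8 K/V heads shared across 32 query heads in groups of 4.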

tensor->copy_to(dst, src_offset, dst_offset + i * get_strides()[0], roi_shape);
i++;
}
}
Contributor: Have you tried beam search? Does it work with HETERO? It had problems with tensor parallelism; I am not sure about HETERO.

Contributor Author: Hi bell, I haven't tried beam search yet; I will try it.
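
The loop fragment under review copies each tensor into the destination at an offset of `dst_offset + i * get_strides()[0]`. A minimal sketch of that offset arithmetic on plain byte buffers follows. It assumes that `get_strides()[0]` is the byte stride of the outermost dimension; the function `gather_chunks` and its parameters are hypothetical and do not reproduce the actual `copy_to` signature used in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the loop in the fragment: chunk i is copied into
// the destination at byte offset dst_offset + i * stride0, where stride0 is
// assumed to be the byte stride of the outermost dimension.
inline void gather_chunks(const std::vector<std::vector<unsigned char>>& chunks,
                          std::vector<unsigned char>& dst,
                          std::size_t dst_offset,
                          std::size_t stride0) {
    std::size_t i = 0;
    for (const auto& chunk : chunks) {
        const std::size_t offset = dst_offset + i * stride0;
        assert(chunk.size() <= stride0);              // a chunk must fit in one slot
        assert(offset + chunk.size() <= dst.size());  // stay inside the destination
        if (!chunk.empty()) {
            std::memcpy(dst.data() + offset, chunk.data(), chunk.size());
        }
        ++i;
    }
}
```

For example, two 2-byte chunks written with `stride0 = 4` into an 8-byte zeroed buffer land at offsets 0 and 4, leaving bytes 2, 3, 6, and 7 untouched.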

@WeldonWangwang marked this pull request as ready for review May 11, 2025 14:25
@WeldonWangwang requested review from a team as code owners May 11, 2025 14:25
@WeldonWangwang marked this pull request as draft May 12, 2025 01:18
@github-actions bot removed the category: GPU label May 15, 2025
@WeldonWangwang marked this pull request as ready for review May 15, 2025 14:33
@zhaixuejun1993 force-pushed the wangwang/hetero_support_cb branch from 5bb2915 to 1b6f2ee May 16, 2025 01:25
@@ -433,6 +433,9 @@ ov::Any Plugin::get_property(const std::string& name, const ov::AnyMap& options)
if (options.find(ov::device::id.name()) != options.end()) {
device_id = options.find(ov::device::id.name())->second.as<std::string>();
}
if (!(m_configs_map.find(device_id) != m_configs_map.end())) {
Contributor: Remove the debug log.

Labels: category: GPU, category: HETERO, do_not_merge
4 participants