Hetero support continuous batching #30371
base: master
jit.AddConstant(MakeJitConstant("REMAINDER_K", !k_full));
jit.AddConstant(MakeJitConstant("KV_GROUP_SIZE", Q_num_heads_dim.v / K_num_heads_dim.v));
jit.AddConstant(MakeJitConstant("KV_GROUP_SIZE", 1));
Does this mean hetero does not support grouped attention?
This is still being improved; it will be handled by modifying the graph structure.
Fixed.
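For context on the question above: in grouped-query attention several query heads share one key/value head, and the ratio of query heads to KV heads is what the original `KV_GROUP_SIZE` expression computes. The following is a minimal sketch of that relationship, not code from this PR; the helper names `kv_group_size` and `kv_head_for` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the PR): how many query heads share one KV head.
// Assumes the query head count is an exact multiple of the KV head count.
inline std::size_t kv_group_size(std::size_t q_heads, std::size_t kv_heads) {
    assert(kv_heads > 0 && q_heads % kv_heads == 0);
    return q_heads / kv_heads;
}

// Hypothetical helper (not part of the PR): which KV head serves a given query head.
inline std::size_t kv_head_for(std::size_t q_head, std::size_t group_size) {
    assert(group_size > 0);
    return q_head / group_size;
}
```

Under these assumptions, a group size of 1 maps every query head to a KV head with the same index, which is only valid when the query and KV head counts are equal. That is why a hard-coded `1` prompts the question about grouped attention.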
tensor->copy_to(dst, src_offset, dst_offset + i * get_strides()[0], roi_shape);
i++;
}
}
Have you tried beam search? Does it work with hetero? It had problems with tensor parallel; I'm not sure about hetero.
Hi bell, I haven't tried beam search yet; I will try it.
5bb2915 to 1b6f2ee
@@ -433,6 +433,9 @@ ov::Any Plugin::get_property(const std::string& name, const ov::AnyMap& options)
if (options.find(ov::device::id.name()) != options.end()) {
device_id = options.find(ov::device::id.name())->second.as<std::string>();
}
if (!(m_configs_map.find(device_id) != m_configs_map.end())) {
Remove the debug log.
Details:
Tickets: