Commit 7ea260c

Updated logic whether PA backend is explicitly required (openvinotoolkit#1976)
If the user passes `LLMPipeline(model_path, device, ATTENTION_BACKEND=PA)`, the pipeline should throw when the PA backend cannot be used (either the PA op is not available or the model cannot be converted to the PA representation).
1 parent 1086ac3 commit 7ea260c

1 file changed, +4 -1 lines changed

src/cpp/src/llm_pipeline.cpp

Lines changed: 4 additions & 1 deletion
@@ -45,7 +45,10 @@ SchedulerConfig get_latency_oriented_scheduler_config() {
 }
 
 bool explicitly_requires_paged_attention(const ov::AnyMap& properties) {
-    if (properties.find(ov::genai::scheduler_config.name()) != properties.end()) {
+    auto attention_backend_it = properties.find("ATTENTION_BACKEND");
+
+    if (properties.find(ov::genai::scheduler_config.name()) != properties.end() ||
+        (attention_backend_it != properties.end() && attention_backend_it->second.as<std::string>() == PA_BACKEND)) {
         if (is_paged_attention_available()) {
             return true;
         } else {
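
The updated condition can be illustrated with a standalone sketch that does not depend on OpenVINO. Everything here is a simplified stand-in rather than the project's code: `PropertyMap` replaces `ov::AnyMap` with plain strings, `SCHEDULER_CONFIG_KEY` is a hypothetical key in place of `ov::genai::scheduler_config.name()`, the value `"PA"` for `PA_BACKEND` is assumed from the commit message, and the `pa_available` parameter replaces the call to `is_paged_attention_available()`. The `else` branch is cut off in the hunk above, so the exception thrown below is an assumption based on the commit description.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Stand-in for ov::AnyMap; the real map stores ov::Any values, not strings.
using PropertyMap = std::map<std::string, std::string>;

// Assumed value of the PA_BACKEND constant referenced in the diff.
const std::string PA_BACKEND = "PA";
// Hypothetical key standing in for ov::genai::scheduler_config.name().
const std::string SCHEDULER_CONFIG_KEY = "scheduler_config";

// Returns true when the properties explicitly ask for paged attention and it
// is available, false when nothing asks for it. `pa_available` replaces the
// real is_paged_attention_available() check so the sketch stays self-contained.
bool explicitly_requires_paged_attention(const PropertyMap& properties,
                                         bool pa_available) {
    auto attention_backend_it = properties.find("ATTENTION_BACKEND");

    const bool has_scheduler_config =
        properties.find(SCHEDULER_CONFIG_KEY) != properties.end();
    const bool requests_pa_backend =
        attention_backend_it != properties.end() &&
        attention_backend_it->second == PA_BACKEND;

    if (has_scheduler_config || requests_pa_backend) {
        if (pa_available) {
            return true;
        }
        // Assumption: the commit message says the pipeline should throw here;
        // the actual error type and message are not shown in the hunk.
        throw std::runtime_error(
            "PagedAttention backend was requested but is not available");
    }
    return false;
}
```

In this sketch, a map containing `ATTENTION_BACKEND` set to `"PA"` now takes the same path as one containing a scheduler config: it returns `true` when paged attention is available and throws otherwise. A map with neither key, or with any other backend value, returns `false` without throwing.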
