[GPU] Fix dynamic reorder impl selection in shape-of flow #35622

Open
wilson-seok wants to merge 1 commit into openvinotoolkit:master from wilson-seok:fix/conditional-decoder-reorder
Conversation

@wilson-seok
Contributor

Details:

  • Fix dynamic reorder implementation selection in shape-of flow by relaxing the OCL dynamic predicate in [reorder_impls.cpp] so that it blocks only when both the input and output formats are simple.
  • Improve implementation fallback robustness in [primitive_inst.cpp]:
    • Continue scanning impl managers when [create()] returns nullptr.
    • Add null guards in async static-compile cache path before kernel compilation/cache insert.
  • Improve failure diagnostics for static impl miss with detailed assertion context (available impl types/shape types and input/output layouts).
  • Add regression coverage:
    • [impls_test.impl_create_null_fallback]
    • [impls_test.reorder_ocl_dynamic_available_in_shape_flow_with_nonsimple_input]
    • [reorder_gpu_f32.dynamic_fsv16_to_bfyx_executes_without_error]
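
The relaxed predicate described in the first bullet can be sketched as follows. This is a minimal illustration only: `ReorderFormats` and `blocks_ocl_dynamic_in_shape_of_flow` are hypothetical names, not identifiers from reorder_impls.cpp, and the sketch assumes "simple" has already been reduced to a boolean per format.

```cpp
#include <cassert>

// Hypothetical stand-in for the two format properties the predicate inspects.
struct ReorderFormats {
    bool input_is_simple;
    bool output_is_simple;
};

// Returns true when the OCL dynamic reorder impl should be rejected for a
// reorder inside a shape-of subgraph. Per the PR description, it is rejected
// only when BOTH formats are simple; any non-simple side keeps it available.
inline bool blocks_ocl_dynamic_in_shape_of_flow(const ReorderFormats& f) {
    return f.input_is_simple && f.output_is_simple;
}
```

Under this sketch, an fsv16 input (non-simple) reordered to bfyx (simple) is not blocked, which matches the case the new regression tests exercise.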

Tickets:

  • 185208

AI Assistance:

  • AI assistance used: yes
  • AI was used to draft and implement the code/test changes, iterate on test design, and prepare PR text.

@wilson-seok wilson-seok requested a review from Copilot April 30, 2026 09:05
@wilson-seok wilson-seok requested review from a team as code owners April 30, 2026 09:05
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes incorrect GPU reorder implementation selection in shape-of flows and hardens implementation fallback/diagnostics, with added regression tests.

Changes:

  • Relaxed the OCL dynamic reorder predicate in shape-of subgraphs to only block when both input/output formats are simple.
  • Improved impl selection robustness: continue scanning impl managers when create() returns nullptr, and add null-guards in the static compile cache path.
  • Added regression tests for null-impl fallback and dynamic reorder availability/execution.
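
The "continue scanning when create() returns nullptr" behavior can be illustrated with a small standalone sketch. `Impl`, `ImplFactory`, and `first_non_null_impl` are hypothetical stand-ins introduced here; the real code in primitive_inst.cpp works with ImplementationManager objects and is not reproduced.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical implementation object; stands in for primitive_impl.
struct Impl {
    std::string name;
};

// Hypothetical factory type; stands in for ImplementationManager::create().
using ImplFactory = std::function<std::unique_ptr<Impl>()>;

// Walk the candidates in priority order and return the first non-null result.
// A factory that returns nullptr is skipped instead of ending the search.
inline std::unique_ptr<Impl> first_non_null_impl(const std::vector<ImplFactory>& factories) {
    for (const auto& make : factories) {
        if (auto impl = make()) {
            return impl;
        }
    }
    return nullptr;  // no candidate produced an implementation
}
```

Callers still have to handle a null return, which is where the null guards in the compile-cache path come in.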

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| src/plugins/intel_gpu/tests/unit/test_cases/reorder_gpu_test.cpp | Adds a regression test ensuring dynamic fsv16→bfyx reorder executes correctly in shape-of flow conditions. |
| src/plugins/intel_gpu/tests/unit/module_tests/impls_test.cpp | Adds tests for (a) impl creation returning nullptr fallback and (b) OCL dynamic reorder availability in shape-of subgraphs. |
| src/plugins/intel_gpu/src/graph/registry/reorder_impls.cpp | Adjusts shape-of predicate to avoid incorrectly blocking OCL dynamic reorder when input is non-simple. |
| src/plugins/intel_gpu/src/graph/primitive_inst.cpp | Makes impl selection resilient to create() returning nullptr; adds null guards and richer assertion diagnostics on static impl miss. |

Comment on lines +390 to +426
// Test: verify that find_impl skips impl managers whose create() returns nullptr
// and continues to the next available impl (fallback behavior)
TEST(impls_test, impl_create_null_fallback) {
    auto& engine = get_test_engine();

    // Create a list of impl managers where the first one returns nullptr,
    // and the second one returns a valid impl
    std::vector<std::shared_ptr<ImplementationManager>> test_impls = {
        std::make_shared<NullReturningImplementationManager>(shape_types::static_shape),
        std::make_shared<SomeImplementationManager>(shape_types::static_shape, nullptr),
    };

    // Simulate find_impl behavior: iterate through impls, skip nullptr results
    kernel_impl_params params;
    params.output_layouts = { layout{{1}, data_types::f32, format::bfyx} };
    params.input_layouts = { layout{{1}, data_types::f32, format::bfyx} };

    program p(engine, get_test_default_config(engine));
    auto prim = std::make_shared<some_primitive>("test_fallback", std::vector<input_info>{}, some_primitive::SomeParameter::SUPPORTED_VALUE_ALL);
    auto& node = p.get_or_create(prim);
    node.recalc_output_layout();

    std::unique_ptr<primitive_impl> result = nullptr;
    for (auto& impl_manager : test_impls) {
        if ((impl_manager->get_shape_type() & shape_types::static_shape) != shape_types::static_shape)
            continue;
        if (!impl_manager->support_shapes(params))
            continue;

        auto impl = impl_manager->create(node, params);
        if (impl) {
            result = std::move(impl);
            break;
        }
    }

    // The first impl returns nullptr, but fallback to the second should succeed
Comment on lines +392 to +426
TEST(impls_test, impl_create_null_fallback) {
    auto& engine = get_test_engine();

    // Create a list of impl managers where the first one returns nullptr,
    // and the second one returns a valid impl
    std::vector<std::shared_ptr<ImplementationManager>> test_impls = {
        std::make_shared<NullReturningImplementationManager>(shape_types::static_shape),
        std::make_shared<SomeImplementationManager>(shape_types::static_shape, nullptr),
    };

    // Simulate find_impl behavior: iterate through impls, skip nullptr results
    kernel_impl_params params;
    params.output_layouts = { layout{{1}, data_types::f32, format::bfyx} };
    params.input_layouts = { layout{{1}, data_types::f32, format::bfyx} };

    program p(engine, get_test_default_config(engine));
    auto prim = std::make_shared<some_primitive>("test_fallback", std::vector<input_info>{}, some_primitive::SomeParameter::SUPPORTED_VALUE_ALL);
    auto& node = p.get_or_create(prim);
    node.recalc_output_layout();

    std::unique_ptr<primitive_impl> result = nullptr;
    for (auto& impl_manager : test_impls) {
        if ((impl_manager->get_shape_type() & shape_types::static_shape) != shape_types::static_shape)
            continue;
        if (!impl_manager->support_shapes(params))
            continue;

        auto impl = impl_manager->create(node, params);
        if (impl) {
            result = std::move(impl);
            break;
        }
    }

    // The first impl returns nullptr, but fallback to the second should succeed
}

in_out_fmts_t query_formats(const program_node& node) const override {
    OPENVINO_NOT_IMPLEMENTED;
Comment on lines +3195 to +3197
OPENVINO_ASSERT(false, "No static impl " + node->id() + ". " + available_impls_info +
                       ". Input: " + updated_params.input_layouts[0].to_short_string() +
                       ". Output: " + updated_params.output_layouts[0].to_short_string());
Comment on lines +3190 to +3192
for (auto& m : m_available_impls) {
    available_impls_info += " {type=" + std::to_string(static_cast<int>(m->get_impl_type())) +
                            ", shape=" + std::to_string(static_cast<int>(m->get_shape_type())) + "}";

// Prepare properly formatted fsv16 memory via helper network
auto input_bfyx = engine.allocate_memory({ov::PartialShape(in_shape), data_types::f32, format::bfyx});
std::vector<float> input_data(256);
Labels

category: GPU OpenVINO GPU plugin
