[GPU] Make singleton OCL Context #32738

Merged: Wovchena merged 6 commits into openvinotoolkit:master from isanghao:singleton_ocl_context on Feb 5, 2026

isanghao (Contributor) commented Nov 7, 2025

Details:

  • Make the OCL context a singleton.
  • GenAI is moving to multiple ov::Core instances, and buffers still need to be shareable across those ov::Core instances.
  • We had two choices:
    • explicit context sharing among multiple ov::Core instances
    • a singleton OCL context
  • A single OCL context is expected per process, so using a singleton OCL context should be fine.
  • The singleton OCL context is implemented with a weak_ptr. This was a design choice so that the cl_context is released before the plugin is unloaded.
  • Because the Plugin class owns the cl_context through a shared_ptr, the singleton cl_context is freed once all Core instances are destroyed.
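
The weak_ptr-based singleton described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the plugin's actual code: `FakeContext`, `FakePlugin`, `get_shared_context`, and `g_live_contexts` are hypothetical names standing in for the cl_context wrapper and the Plugin class.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical counter of contexts that are currently alive.
int g_live_contexts = 0;

// Hypothetical stand-in for the wrapper around cl_context.
struct FakeContext {
    FakeContext() { ++g_live_contexts; }
    ~FakeContext() { --g_live_contexts; }
};

// Returns the process-wide context, creating it on demand.
// The static holds only a weak_ptr, so it never keeps the context alive.
std::shared_ptr<FakeContext> get_shared_context() {
    static std::mutex mutex;
    static std::weak_ptr<FakeContext> cached;

    std::lock_guard<std::mutex> lock(mutex);
    std::shared_ptr<FakeContext> ctx = cached.lock();
    if (!ctx) {
        ctx = std::make_shared<FakeContext>();
        cached = ctx;
    }
    return ctx;
}

// Hypothetical stand-in for a plugin instance owned by one ov::Core.
struct FakePlugin {
    std::shared_ptr<FakeContext> context = get_shared_context();
};

// Two plugins share one context; it is destroyed with the last owner.
void demo_two_plugins() {
    {
        FakePlugin first;
        FakePlugin second;
        assert(first.context == second.context);
        assert(g_live_contexts == 1);
    }
    assert(g_live_contexts == 0);  // released once both owners are gone
}
```

Because only the plugin instances hold strong references, the context goes away as soon as the last one is destroyed, and a later call to `get_shared_context()` simply creates a fresh one.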

Tickets:

  • CVS-178537

@github-actions github-actions bot added the category: GPU OpenVINO GPU plugin label Nov 7, 2025
@isanghao isanghao force-pushed the singleton_ocl_context branch from e56e26f to 5a3c56e Compare November 24, 2025 11:33
@github-actions github-actions bot added category: inference OpenVINO Runtime library - Inference category: CPP API OpenVINO CPP API bindings labels Nov 24, 2025
@isanghao isanghao changed the title WIP: [GPU] Make singleton OCL Context [GPU] Make singleton OCL Context Nov 24, 2025
@isanghao isanghao marked this pull request as ready for review November 24, 2025 11:34
@isanghao isanghao requested review from a team as code owners November 24, 2025 11:34
@isanghao isanghao requested review from a team as code owners November 25, 2025 03:47
@isanghao isanghao requested review from ValentinaKats and removed request for a team November 25, 2025 03:47
@github-actions github-actions bot added category: Python API OpenVINO Python bindings category: docs OpenVINO documentation category: JS API OpenVino JS API Bindings labels Nov 25, 2025
@Wovchena Wovchena requested a review from Copilot November 25, 2025 05:16
Copilot AI (Contributor) left a comment

Pull request overview

This PR implements a singleton OCL context for the Intel GPU plugin to enable buffer sharing across multiple ov::Core instances. The change ensures that a single OCL context is created per process rather than per Core instance, allowing GenAI applications to use multiple Core instances while sharing GPU resources efficiently.

Key changes:

  • Moved OCL context initialization from instance-level to static singleton in Plugin::get_default_contexts()
  • Updated documentation across C++, Python, JavaScript, and Node.js bindings to reflect the new multi-Core behavior
  • Added test to verify singleton OCL context behavior across multiple Core instances

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/plugins/intel_gpu/tests/functional/behavior/infer_request.cpp | Added test case verifying that OCL contexts from different Core instances are identical |
| src/plugins/intel_gpu/src/plugin/plugin.cpp | Moved m_default_contexts and m_default_contexts_once from member variables to static variables for singleton behavior |
| src/plugins/intel_gpu/include/intel_gpu/plugin/plugin.hpp | Removed member variable declarations that are now static in the implementation |
| src/inference/include/openvino/runtime/core.hpp | Updated documentation to clarify that multiple Core instances now share underlying device resources |
| src/bindings/python/src/pyopenvino/core/core.cpp | Updated Python binding documentation to reflect resource sharing behavior |
| src/bindings/python/src/openvino/_ov_api.pyi | Updated Python type stub documentation for consistency |
| src/bindings/js/node/lib/addon.ts | Updated JavaScript/TypeScript documentation to reflect resource sharing |
| docs/sphinx_setup/api/nodejs_api/openvino-node/interfaces/Core.rst | Updated Node.js API documentation for consistency |
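
The member-to-static move summarized for plugin.cpp can be illustrated with a small sketch. It assumes a simplified shape for the code: `FakeContext`, `FakePlugin`, `ContextMap`, and the device ids "0" and "1" are hypothetical, and a function-local static replaces whatever once-guard the real implementation uses.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a per-device OCL context.
struct FakeContext {
    std::string device_id;
};

using ContextMap = std::map<std::string, std::shared_ptr<FakeContext>>;

// Hypothetical stand-in for the GPU plugin; one instance per ov::Core.
class FakePlugin {
public:
    // The map is a function-local static, so every FakePlugin sees the same
    // contexts. C++11 guarantees that the initializer runs exactly once.
    const ContextMap& get_default_contexts() const {
        static const ContextMap contexts = [] {
            ContextMap m;
            m["0"] = std::make_shared<FakeContext>(FakeContext{"0"});
            m["1"] = std::make_shared<FakeContext>(FakeContext{"1"});
            return m;
        }();
        return contexts;
    }
};

// Two separate plugin instances hand out the very same context object.
void check_contexts_are_shared() {
    FakePlugin plugin_a;
    FakePlugin plugin_b;
    assert(plugin_a.get_default_contexts().at("0") ==
           plugin_b.get_default_contexts().at("0"));
}
```

Note that a static holding shared_ptr values keeps the contexts alive until static destruction, which is the lifetime concern that the weak_ptr design in the PR description is meant to avoid.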

@isanghao isanghao enabled auto-merge December 15, 2025 04:29
@isanghao isanghao force-pushed the singleton_ocl_context branch from ceee6c1 to e1b7306 Compare December 17, 2025 03:30
StefaniaHergane (Contributor) commented Jan 15, 2026

Please check the behavior of the sample app hello_query_device: it is executed in NPU CI and hangs on some Windows machines.

@isanghao isanghao force-pushed the singleton_ocl_context branch from ad172d0 to ff30c9f Compare January 20, 2026 05:08
@isanghao isanghao added this pull request to the merge queue Jan 21, 2026
@geunhwan geunhwan added this to the 2026.0 milestone Jan 21, 2026
@isanghao isanghao removed this pull request from the merge queue due to a manual request Jan 21, 2026
@isanghao isanghao modified the milestones: 2026.0, 2026.1 Jan 21, 2026
isanghao (Contributor, Author) commented

This is not required for 2026.0 because GenAI is still using a singleton Core (see openvinotoolkit/openvino.genai#2952).

This can be merged in the 2026.1 timeframe.

ov::DeviceIDParser parser(device_name);
std::string devName = parser.get_device_name();

_impl->get_plugin(devName).cleanup();
isanghao (Contributor, Author) commented

@praasz could you review this part? I added a new plugin API, cleanup, to ensure the context is cleaned up before the plugin is unloaded. Without this, it caused a hang on Windows.

Contributor reply:

@isanghao I agree with @olpipi's idea to clean up in the GPU plugin dtor if required.

isanghao (Contributor, Author) commented

Thanks for the feedback @praasz @olpipi. I realized that this new strategy does not need explicit cleanup, so I reverted the changes.

@@ -265,17 +271,17 @@ void Core::unload_plugin(const std::string& device_name) {
OV_CORE_CALL_STATEMENT({
ov::DeviceIDParser parser(device_name);
std::string devName = parser.get_device_name();

_impl->get_plugin(devName).cleanup();
Contributor comment:

What is the reason for adding this new method to the plugin API? Can you perform the same steps inside the ov::intel_gpu::Plugin dtor?

isanghao (Contributor, Author) commented

Thanks for the feedback @praasz @olpipi. I realized that this new strategy does not need explicit cleanup, so I reverted the changes.

ahnyoung-paul (Contributor) left a comment

LGTM

jade-cho (Contributor) left a comment

LGTM

@isanghao isanghao added this pull request to the merge queue Feb 5, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 5, 2026
@Wovchena Wovchena added this pull request to the merge queue Feb 5, 2026
Merged via the queue into openvinotoolkit:master with commit 8322000 Feb 5, 2026
212 checks passed
Naseer-010 pushed a commit to Naseer-010/openvino that referenced this pull request Feb 18, 2026
### Details:
 - Make OCL context as singleton
 - GenAI tries to change its behavior to multi ov::Core, while buffers can be shared across ov::Core
 - We had two choices
   - explicit context sharing among multiple ov::Core
   - singleton OCL context
 - The expectation is that single OCL context is created for one process. So it should be fine to use singleton ocl context.
 - Singleton OCL context was implemented with weak ptr. It was design choice to release cl_context before plugin unload.
 - As Plugin class owns cl_context through shared_ptr, singleton cl_context will be freed when all Core instances are destructed.

### Tickets:
 - CVS-178537
Labels

  • category: CPP API (OpenVINO CPP API bindings)
  • category: docs (OpenVINO documentation)
  • category: GPU (OpenVINO GPU plugin)
  • category: inference (OpenVINO Runtime library - Inference)
  • category: JS API (OpenVino JS API Bindings)
  • category: Python API (OpenVINO Python bindings)
  • do_not_merge
