Configuring a recommended accelerator for serving runtimes

To help your data scientists choose the most suitable accelerator, you can configure a recommended accelerator tag for your serving runtimes.

Prerequisites
  • You have logged in to {productname-short} as a user with {productname-short} administrator privileges.

  • You have enabled GPU support in {productname-short}. This includes installing the Node Feature Discovery Operator and the NVIDIA GPU Operator. For more information, see Installing the Node Feature Discovery operator and Enabling NVIDIA GPUs.

Procedure
  1. From the {productname-short} dashboard, click Settings → Serving runtimes.

    The Serving runtimes page opens and shows the model-serving runtimes that are already installed and enabled in your {productname-short} deployment. By default, the OpenVINO Model Server runtime is pre-installed and enabled in {productname-short}.

  2. Beside the custom runtime that you want to add the recommended accelerator tag to, click the action menu (⋮) and select Edit.

    A page with an embedded YAML editor opens.

    Note
    You cannot directly edit the OpenVINO Model Server runtime that is included in {productname-short} by default. However, you can clone this runtime and edit the cloned version. You can then add the edited clone as a new, custom runtime. To do this, click the action menu beside the OpenVINO Model Server and select Duplicate.
  3. In the editor, enter the YAML code to apply the annotation opendatahub.io/recommended-accelerators. The excerpt in this example shows the annotation to set a recommended tag for an NVIDIA GPU accelerator:

    metadata:
      annotations:
        opendatahub.io/recommended-accelerators: '["nvidia.com/gpu"]'
  4. Click Update.
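
To show where the excerpt from step 3 sits in a full definition, the following is a minimal sketch of the top of a custom runtime. It assumes that the runtime is a KServe ServingRuntime resource; the name my-custom-runtime is hypothetical, and the rest of the runtime definition is left out.

```yaml
# Sketch only: the resource name is a placeholder, and the spec section
# of your existing runtime is omitted and stays unchanged.
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: my-custom-runtime   # hypothetical name of the cloned runtime
  annotations:
    # The value is a JSON array written as a quoted string.
    # YAML indentation must use spaces, not tabs.
    opendatahub.io/recommended-accelerators: '["nvidia.com/gpu"]'
```

If the runtime already has an annotations block, add the new key to that block instead of creating a second one.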

Verification
  • When your data scientists select an accelerator for use with a serving runtime that has this annotation, a tag appears next to the corresponding accelerator to indicate its compatibility.