
NV TensorRT RTX EP - initial commit #24456


Draft
wants to merge 14 commits into
base: main
from

Conversation

ankan-ban

@ankan-ban ankan-ban commented Apr 17, 2025

New EP, currently based on the existing TensorRT EP but meant to be used on RTX GPUs with a lean version of TensorRT.

Description

This PR adds a new EP based on the existing TensorRT EP. It uses a special version of TensorRT optimized for RTX GPUs. In the future, we plan to streamline the EP further (e.g., by removing its dependency on the CUDA EP completely).

Motivation and Context

The new TensorRT for RTX is going to have:

  1. A much smaller footprint.
  2. Much faster model compile/load times.
  3. Better usability when reusing cached models across multiple RTX GPUs.

This effort is also targeting WCR ML workflows.

ankan-ban and others added 2 commits April 17, 2025 17:40
New EP - currently based on existing TensorRT EP but meant to be used on RTX GPUs with a lean version of TensorRT.
@ankan-ban ankan-ban marked this pull request as draft April 17, 2025 13:54
@@ -0,0 +1,48 @@
#pragma once

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
@@ -0,0 +1,597 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

@@ -0,0 +1,261 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

@@ -0,0 +1,171 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

@@ -0,0 +1,191 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

@@ -0,0 +1,20 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

@@ -0,0 +1,45 @@
#ifndef ONNXRUNTIME_NV_PROVIDER_OPTIONS_INTERNAL_H

@@ -0,0 +1,418 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

NvProviderFactory(const NvExecutionProviderInfo& info) : info_{info} {}
~NvProviderFactory() override {}

std::unique_ptr<IExecutionProvider> CreateProvider() override;
Contributor

@adrianlizarraga adrianlizarraga Apr 17, 2025

To be able to use the new Compile API, could you please also add an implementation of the new CreateProvider() overload that takes an OrtSessionOptions and an OrtLogger?

Note that the EP provider options are added to the session options configs with a new prefix: "ep.<lowercase_ep_name>.<NV_PROVIDER_OPTION_KEY>".

Here's an example implementation:

  std::unique_ptr<IExecutionProvider> CreateProvider(const OrtSessionOptions& session_options,
                                                     const OrtLogger& session_logger) override {
    const ConfigOptions& config_options = session_options.GetConfigOptions();
    const std::unordered_map<std::string, std::string>& config_options_map = config_options.GetConfigOptionsMap();

    // The implementation of the SessionOptionsAppendExecutionProvider C API function automatically adds EP options to
    // the session option configurations with the key prefix "ep.<lowercase_ep_name>.".
    // We extract those EP options to create a new "provider options" key/value map.
    std::string lowercase_ep_name = kNvTensorRTRTXExecutionProvider;
    std::transform(lowercase_ep_name.begin(), lowercase_ep_name.end(), lowercase_ep_name.begin(), [](unsigned char c) {
      return static_cast<char>(std::tolower(c));
    });

    std::unordered_map<std::string, std::string> provider_options;
    std::string key_prefix = "ep.";
    key_prefix += lowercase_ep_name;
    key_prefix += ".";

    for (const auto& [key, value] : config_options_map) {
      if (key.rfind(key_prefix, 0) == 0) {
        provider_options[key.substr(key_prefix.size())] = value;
      }
    }

    // TODO: Create a NvExecutionProviderInfo struct from config_options and provider_options:
    NvExecutionProviderInfo nv_info = /*...*/;

    auto ep = std::make_unique<NvExecutionProvider>(nv_info);
    ep->SetLogger(reinterpret_cast<const logging::Logger*>(&session_logger));
    return ep;
  }
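
The prefix-filtering step in the example above can also be tried in isolation. The following is a minimal standalone sketch, not ONNX Runtime code: plain std::unordered_map objects stand in for the session config map, ExtractProviderOptions and ToLowerAscii are hypothetical helper names, and the key "device_id" is only an illustrative option. It assumes C++17 for structured bindings.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>

using OptionsMap = std::unordered_map<std::string, std::string>;

// Hypothetical helper: lowercase an ASCII string (EP names are ASCII identifiers).
std::string ToLowerAscii(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return s;
}

// Hypothetical helper: return the entries of `session_configs` whose key starts
// with "ep.<lowercase ep_name>.", with that prefix stripped from the key.
OptionsMap ExtractProviderOptions(const OptionsMap& session_configs,
                                  const std::string& ep_name) {
  const std::string key_prefix = "ep." + ToLowerAscii(ep_name) + ".";
  OptionsMap provider_options;
  for (const auto& [key, value] : session_configs) {
    // Require at least one character after the prefix so that no empty
    // option name is produced (a choice made for this sketch).
    if (key.size() > key_prefix.size() &&
        key.compare(0, key_prefix.size(), key_prefix) == 0) {
      provider_options[key.substr(key_prefix.size())] = value;
    }
  }
  return provider_options;
}
```

Called with an EP name of "NvTensorRTRTXExecutionProvider" (an assumed value for kNvTensorRTRTXExecutionProvider), an entry keyed "ep.nvtensorrtrtxexecutionprovider.device_id" would come back under the key "device_id", while unrelated session configs and options for other EPs are skipped.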

Contributor

I would also recommend use of the generic SessionOptionsAppendExecutionProvider C API function, which automatically adds provider options to session options configs map.
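
As a rough illustration of what "automatically adds provider options to session options configs map" means for key names, here is a small standalone sketch of the prefixing direction. It only mimics the "ep.<lowercase_ep_name>.<key>" convention described in this thread and is not the actual ONNX Runtime implementation; MakeEpConfigKey and AddProviderOptionsToConfigs are hypothetical names, and "device_id" is an illustrative option key. It assumes C++17.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>

using OptionsMap = std::unordered_map<std::string, std::string>;

// Hypothetical helper: build the session-config key for one provider option,
// in the form "ep.<lowercase_ep_name>.<option_key>".
std::string MakeEpConfigKey(std::string ep_name, const std::string& option_key) {
  std::transform(ep_name.begin(), ep_name.end(), ep_name.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return "ep." + ep_name + "." + option_key;
}

// Hypothetical helper: copy every provider option into the session configs
// under its prefixed key, overwriting any existing entry with the same key.
void AddProviderOptionsToConfigs(const std::string& ep_name,
                                 const OptionsMap& provider_options,
                                 OptionsMap& session_configs) {
  for (const auto& [key, value] : provider_options) {
    session_configs[MakeEpConfigKey(ep_name, key)] = value;
  }
}
```

With this convention, an EP factory that receives only the session options can recover its own options by filtering on the same prefix, which is what the CreateProvider() example earlier in the thread does.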

Could you please describe the use case for the
CreateProvider(const OrtSessionOptions& session_options, const OrtLogger& session_logger) override?

Unload the model once it is no longer needed.

Bug: 5225623
@@ -0,0 +1,141 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

* \snippet{doc} snippets.dox OrtStatus Return Value
* \since Version 1.21
*/
ORT_API2_STATUS(SessionOptionsAppendExecutionProvider_Nv_TensorRT_RTX,
Member

Why does this require new provider specific APIs vs using SessionOptionsAppendExecutionProvider?
@adrianlizarraga

NvTensorRT_RTX is built as a standalone shared DLL, so we require the EP-specific API.

@jywu-msft
Member

Please address the lintrunner failure
(clangformat) https://github.com/microsoft/onnxruntime/actions/runs/14519273916/job/40745967030?pr=24456


8 participants