
Commit 64e56ab

Merge pull request #4342 from yuanyao-nv/dev-10.8-staging
TensorRT 10.8-GA OSS Release
2 parents 97ff244 + 9443fc4

File tree: 266 files changed, +1,147,683 −1,381 lines


CHANGELOG.md (+24)
````diff
@@ -1,5 +1,29 @@
 # TensorRT OSS Release Changelog
 
+## 10.8.0 GA - 2025-1-31
+Key Features and Updates:
+
+- Demo changes
+  - demoDiffusion
+    - Added [Image-to-Image](demo/Diffusion#generate-an-image-guided-by-an-initial-image-and-a-text-prompt-using-flux) support for Flux.1-dev and Flux.1-schnell pipelines.
+    - Added [ControlNet](demo/Diffusion#generate-an-image-guided-by-a-text-prompt-and-a-control-image-using-flux-controlnet) support for [FLUX.1-Canny-dev](https://huggingface.co/black-forest-labs/FLUX.1-Canny-dev) and [FLUX.1-Depth-dev](https://huggingface.co/black-forest-labs/FLUX.1-Depth-dev) pipelines. Native FP8 quantization is also supported for these pipelines.
+    - Added support for ONNX model export-only mode. See [--onnx-export-only](demo/Diffusion#use-separate-directories-for-individual-onnx-models).
+    - Added FP16, BF16, FP8, and FP4 support for all Flux pipelines.
+- Plugin changes
+  - Added SM 100 and SM 120 support to `bertQKVToContextPlugin`. This enables demo/BERT on Blackwell GPUs.
+- Sample changes
+  - Added a new `sampleEditableTimingCache` to demonstrate how to build an engine with the desired tactics by modifying the timing cache.
+  - Deleted the `sampleAlgorithmSelector` sample.
+  - Fixed `sampleOnnxMNIST` by setting the correct INT8 dynamic range.
+- Parser changes
+  - Added support for `FLOAT4E2M1` types for quantized networks.
+  - Added support for dynamic axes and improved performance of `CumSum` operations.
+  - Fixed the import of local functions when their input tensor names aliased one from an outside scope.
+  - Added support for `Pow` ops with integer-typed exponent values.
+- Fixed issues
+  - Fixed segmentation of boolean constant nodes - [4224](https://github.com/NVIDIA/TensorRT/issues/4224).
+  - Fixed an accuracy issue when multiple optimization profiles were defined - [4250](https://github.com/NVIDIA/TensorRT/issues/4250).
+
 ## 10.7.0 GA - 2024-12-4
 Key Feature and Updates:
 
````
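The `sampleEditableTimingCache` entry above describes editing a timing cache to pin specific tactics; the sample itself is C++. For orientation only, here is a minimal sketch of the ordinary save/reuse timing-cache flow driven from `trtexec` (the model path is hypothetical; the editing step lives in the sample, not in these flags):

```bash
# First build: tactic timings are recorded into timing.cache
# (the file is created if it does not exist).
trtexec --onnx=model.onnx --saveEngine=model.plan \
        --timingCacheFile=timing.cache

# Rebuild: the cache is reused, skipping tactic re-timing.
# sampleEditableTimingCache additionally edits cache entries
# between these two steps to force the desired tactics.
trtexec --onnx=model.onnx --saveEngine=model2.plan \
        --timingCacheFile=timing.cache
```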

README.md (+28 −28)

````diff
@@ -26,13 +26,13 @@ You can skip the **Build** section to enjoy TensorRT with Python.
 To build the TensorRT-OSS components, you will first need the following software packages.
 
 **TensorRT GA build**
-* TensorRT v10.7.0.23
+* TensorRT v10.8.0.43
 * Available from direct download links listed below
 
 **System Packages**
 * [CUDA](https://developer.nvidia.com/cuda-toolkit)
 * Recommended versions:
-  * cuda-12.6.0 + cuDNN-8.9
+  * cuda-12.8.0 + cuDNN-8.9
   * cuda-11.8.0 + cuDNN-8.9
 * [GNU make](https://ftp.gnu.org/gnu/make/) >= v4.1
 * [cmake](https://github.com/Kitware/CMake/releases) >= v3.13
@@ -73,25 +73,25 @@ To build the TensorRT-OSS components, you will first need the following software
 If using the TensorRT OSS build container, TensorRT libraries are preinstalled under `/usr/lib/x86_64-linux-gnu` and you may skip this step.
 
 Else download and extract the TensorRT GA build from [NVIDIA Developer Zone](https://developer.nvidia.com) with the direct links below:
-  - [TensorRT 10.7.0.23 for CUDA 11.8, Linux x86_64](https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/10.7.0/tars/TensorRT-10.7.0.23.Linux.x86_64-gnu.cuda-11.8.tar.gz)
-  - [TensorRT 10.7.0.23 for CUDA 12.6, Linux x86_64](https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/10.7.0/tars/TensorRT-10.7.0.23.Linux.x86_64-gnu.cuda-12.6.tar.gz)
-  - [TensorRT 10.7.0.23 for CUDA 11.8, Windows x86_64](https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/10.7.0/zip/TensorRT-10.7.0.23.Windows.win10.cuda-11.8.zip)
-  - [TensorRT 10.7.0.23 for CUDA 12.6, Windows x86_64](https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/10.7.0/zip/TensorRT-10.7.0.23.Windows.win10.cuda-12.6.zip)
+  - [TensorRT 10.8.0.43 for CUDA 11.8, Linux x86_64](https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/10.8.0/tars/TensorRT-10.8.0.43.Linux.x86_64-gnu.cuda-11.8.tar.gz)
+  - [TensorRT 10.8.0.43 for CUDA 12.8, Linux x86_64](https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/10.8.0/tars/TensorRT-10.8.0.43.Linux.x86_64-gnu.cuda-12.8.tar.gz)
+  - [TensorRT 10.8.0.43 for CUDA 11.8, Windows x86_64](https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/10.8.0/zip/TensorRT-10.8.0.43.Windows.win10.cuda-11.8.zip)
+  - [TensorRT 10.8.0.43 for CUDA 12.8, Windows x86_64](https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/10.8.0/zip/TensorRT-10.8.0.43.Windows.win10.cuda-12.8.zip)
 
 
-**Example: Ubuntu 20.04 on x86-64 with cuda-12.6**
+**Example: Ubuntu 20.04 on x86-64 with cuda-12.8**
 
 ```bash
 cd ~/Downloads
-tar -xvzf TensorRT-10.7.0.23.Linux.x86_64-gnu.cuda-12.6.tar.gz
-export TRT_LIBPATH=`pwd`/TensorRT-10.7.0.23
+tar -xvzf TensorRT-10.8.0.43.Linux.x86_64-gnu.cuda-12.8.tar.gz
+export TRT_LIBPATH=`pwd`/TensorRT-10.8.0.43
 ```
 
-**Example: Windows on x86-64 with cuda-12.6**
+**Example: Windows on x86-64 with cuda-12.8**
 
 ```powershell
-Expand-Archive -Path TensorRT-10.7.0.23.Windows.win10.cuda-12.6.zip
-$env:TRT_LIBPATH="$pwd\TensorRT-10.7.0.23\lib"
+Expand-Archive -Path TensorRT-10.8.0.43.Windows.win10.cuda-12.8.zip
+$env:TRT_LIBPATH="$pwd\TensorRT-10.8.0.43\lib"
 ```
 
 ## Setting Up The Build Environment
@@ -101,27 +101,27 @@ For Linux platforms, we recommend that you generate a docker container for build
 1. #### Generate the TensorRT-OSS build container.
 The TensorRT-OSS build container can be generated using the supplied Dockerfiles and build scripts. The build containers are configured for building TensorRT OSS out-of-the-box.
 
-**Example: Ubuntu 20.04 on x86-64 with cuda-12.6 (default)**
+**Example: Ubuntu 20.04 on x86-64 with cuda-12.8 (default)**
 ```bash
-./docker/build.sh --file docker/ubuntu-20.04.Dockerfile --tag tensorrt-ubuntu20.04-cuda12.6
+./docker/build.sh --file docker/ubuntu-20.04.Dockerfile --tag tensorrt-ubuntu20.04-cuda12.8
 ```
-**Example: Rockylinux8 on x86-64 with cuda-12.6**
+**Example: Rockylinux8 on x86-64 with cuda-12.8**
 ```bash
-./docker/build.sh --file docker/rockylinux8.Dockerfile --tag tensorrt-rockylinux8-cuda12.6
+./docker/build.sh --file docker/rockylinux8.Dockerfile --tag tensorrt-rockylinux8-cuda12.8
 ```
-**Example: Ubuntu 22.04 cross-compile for Jetson (aarch64) with cuda-12.6 (JetPack SDK)**
+**Example: Ubuntu 22.04 cross-compile for Jetson (aarch64) with cuda-12.8 (JetPack SDK)**
 ```bash
-./docker/build.sh --file docker/ubuntu-cross-aarch64.Dockerfile --tag tensorrt-jetpack-cuda12.6
+./docker/build.sh --file docker/ubuntu-cross-aarch64.Dockerfile --tag tensorrt-jetpack-cuda12.8
 ```
-**Example: Ubuntu 22.04 on aarch64 with cuda-12.6**
+**Example: Ubuntu 22.04 on aarch64 with cuda-12.8**
 ```bash
-./docker/build.sh --file docker/ubuntu-22.04-aarch64.Dockerfile --tag tensorrt-aarch64-ubuntu22.04-cuda12.6
+./docker/build.sh --file docker/ubuntu-22.04-aarch64.Dockerfile --tag tensorrt-aarch64-ubuntu22.04-cuda12.8
 ```
 
 2. #### Launch the TensorRT-OSS build container.
 **Example: Ubuntu 20.04 build container**
 ```bash
-./docker/launch.sh --tag tensorrt-ubuntu20.04-cuda12.6 --gpus all
+./docker/launch.sh --tag tensorrt-ubuntu20.04-cuda12.8 --gpus all
 ```
 > NOTE:
 <br> 1. Use the `--tag` corresponding to build container generated in Step 1.
@@ -132,38 +132,38 @@ For Linux platforms, we recommend that you generate a docker container for build
 ## Building TensorRT-OSS
 * Generate Makefiles and build.
 
-**Example: Linux (x86-64) build with default cuda-12.6**
+**Example: Linux (x86-64) build with default cuda-12.8**
 ```bash
 cd $TRT_OSSPATH
 mkdir -p build && cd build
 cmake .. -DTRT_LIB_DIR=$TRT_LIBPATH -DTRT_OUT_DIR=`pwd`/out
 make -j$(nproc)
 ```
-**Example: Linux (aarch64) build with default cuda-12.6**
+**Example: Linux (aarch64) build with default cuda-12.8**
 ```bash
 cd $TRT_OSSPATH
 mkdir -p build && cd build
 cmake .. -DTRT_LIB_DIR=$TRT_LIBPATH -DTRT_OUT_DIR=`pwd`/out -DCMAKE_TOOLCHAIN_FILE=$TRT_OSSPATH/cmake/toolchains/cmake_aarch64-native.toolchain
 make -j$(nproc)
 ```
-**Example: Native build on Jetson (aarch64) with cuda-12.6**
+**Example: Native build on Jetson (aarch64) with cuda-12.8**
 ```bash
 cd $TRT_OSSPATH
 mkdir -p build && cd build
-cmake .. -DTRT_LIB_DIR=$TRT_LIBPATH -DTRT_OUT_DIR=`pwd`/out -DTRT_PLATFORM_ID=aarch64 -DCUDA_VERSION=12.6
+cmake .. -DTRT_LIB_DIR=$TRT_LIBPATH -DTRT_OUT_DIR=`pwd`/out -DTRT_PLATFORM_ID=aarch64 -DCUDA_VERSION=12.8
 CC=/usr/bin/gcc make -j$(nproc)
 ```
 > NOTE: C compiler must be explicitly specified via CC= for native aarch64 builds of protobuf.
 
-**Example: Ubuntu 22.04 Cross-Compile for Jetson (aarch64) with cuda-12.6 (JetPack)**
+**Example: Ubuntu 22.04 Cross-Compile for Jetson (aarch64) with cuda-12.8 (JetPack)**
 ```bash
 cd $TRT_OSSPATH
 mkdir -p build && cd build
-cmake .. -DCMAKE_TOOLCHAIN_FILE=$TRT_OSSPATH/cmake/toolchains/cmake_aarch64.toolchain -DCUDA_VERSION=12.6 -DCUDNN_LIB=/pdk_files/cudnn/usr/lib/aarch64-linux-gnu/libcudnn.so -DCUBLAS_LIB=/usr/local/cuda-12.6/targets/aarch64-linux/lib/stubs/libcublas.so -DCUBLASLT_LIB=/usr/local/cuda-12.6/targets/aarch64-linux/lib/stubs/libcublasLt.so -DTRT_LIB_DIR=/pdk_files/tensorrt/lib
+cmake .. -DCMAKE_TOOLCHAIN_FILE=$TRT_OSSPATH/cmake/toolchains/cmake_aarch64.toolchain -DCUDA_VERSION=12.8 -DCUDNN_LIB=/pdk_files/cudnn/usr/lib/aarch64-linux-gnu/libcudnn.so -DCUBLAS_LIB=/usr/local/cuda-12.8/targets/aarch64-linux/lib/stubs/libcublas.so -DCUBLASLT_LIB=/usr/local/cuda-12.8/targets/aarch64-linux/lib/stubs/libcublasLt.so -DTRT_LIB_DIR=/pdk_files/tensorrt/lib
 make -j$(nproc)
 ```
 
-**Example: Native builds on Windows (x86) with cuda-12.6**
+**Example: Native builds on Windows (x86) with cuda-12.8**
 ```powershell
 cd $TRT_OSSPATH
 mkdir -p build
````

VERSION (+1 −1)

````diff
@@ -1 +1 @@
-10.7.0.23
+10.8.0.43
````

cmake/modules/find_library_create_target.cmake (+9 −3)

````diff
@@ -18,10 +18,16 @@
 macro(find_library_create_target target_name lib libtype hints)
   message(STATUS "========================= Importing and creating target ${target_name} ==========================")
   message(STATUS "Looking for library ${lib}")
-  if (CMAKE_BUILD_TYPE STREQUAL "Debug")
-    find_library(${lib}_LIB_PATH ${lib}${TRT_DEBUG_POSTFIX} HINTS ${hints} NO_DEFAULT_PATH)
+  if(CMAKE_BUILD_TYPE STREQUAL "Debug")
+    find_library(
+      ${lib}_LIB_PATH ${lib}${TRT_DEBUG_POSTFIX}
+      HINTS ${hints}
+      NO_DEFAULT_PATH)
   endif()
-  find_library(${lib}_LIB_PATH ${lib} HINTS ${hints} NO_DEFAULT_PATH)
+  find_library(
+    ${lib}_LIB_PATH ${lib}
+    HINTS ${hints}
+    NO_DEFAULT_PATH)
   find_library(${lib}_LIB_PATH ${lib})
   message(STATUS "Library that was found ${${lib}_LIB_PATH}")
   add_library(${target_name} ${libtype} IMPORTED)
````
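This hunk only reflows the `find_library` calls; behavior is unchanged. For context, a typical invocation of the macro looks like the sketch below (the target name and hint directory are illustrative, not taken from this diff):

```cmake
# Create an IMPORTED shared-library target "nvinfer", searching
# ${TRT_LIB_DIR} first (NO_DEFAULT_PATH) and then default paths.
find_library_create_target(nvinfer nvinfer SHARED ${TRT_LIB_DIR})
```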

cmake/modules/set_ifndef.cmake (+6 −4)

````diff
@@ -14,8 +14,10 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-function (set_ifndef variable value)
-  if(NOT DEFINED ${variable})
-    set(${variable} ${value} PARENT_SCOPE)
-  endif()
+function(set_ifndef variable value)
+  if(NOT DEFINED ${variable})
+    set(${variable}
+        ${value}
+        PARENT_SCOPE)
+  endif()
 endfunction()
````
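Likewise a pure reformat: `set_ifndef` assigns a value only when the variable is not already defined, so settings passed on the command line with `-D` take precedence. A minimal usage sketch (the variable and default are illustrative):

```cmake
# Respects an existing -DTRT_PLATFORM_ID=... from the cmake
# command line; otherwise defaults the platform to x86_64.
set_ifndef(TRT_PLATFORM_ID x86_64)
```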
