
Commit e0bec87 (parent 09809c2): Update README for 22.06 release

1 file changed: RELEASE.md (+96, -0)
<!--
# Copyright 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#  * Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
#  * Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
#  * Neither the name of NVIDIA CORPORATION nor the names of its
#    contributors may be used to endorse or promote products derived
#    from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-->
# Release Notes for 2.23.0

## New Features and Improvements

* Auto-generated model configuration enables
  [dynamic batching](https://github.com/triton-inference-server/server/blob/r22.06/docs/model_configuration.md#default-max-batch-size-and-dynamic-batcher)
  in supported models by default.
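
  As a sketch of what this enables, a minimal `config.pbtxt` with the dynamic
  batcher turned on explicitly might look like the following (the model name,
  tensor names, shapes, and batch size here are illustrative, not taken from
  the release notes):

  ```protobuf
  name: "example_model"     # hypothetical model name
  max_batch_size: 8
  input [
    {
      name: "INPUT0"
      data_type: TYPE_FP32
      dims: [ 16 ]
    }
  ]
  output [
    {
      name: "OUTPUT0"
      data_type: TYPE_FP32
      dims: [ 16 ]
    }
  ]
  # With auto-generated configuration in 22.06, an empty dynamic_batching
  # stanza like this is added by default for backends that support batching.
  dynamic_batching { }
  ```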

* Python backend models now support
  [auto-generated model configuration](https://github.com/triton-inference-server/python_backend/tree/r22.06#auto_complete_config).
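
  A minimal sketch of a Python backend model's `auto_complete_config` hook
  (the tensor names, dtypes, and dims are illustrative; the
  `auto_complete_model_config` object with `add_input`, `add_output`, and
  `set_max_batch_size` methods is supplied by Triton at model load time):

  ```python
  class TritonPythonModel:
      @staticmethod
      def auto_complete_config(auto_complete_model_config):
          # Triton calls this hook so the model can fill in a configuration
          # when no complete config.pbtxt is provided. The tensors below are
          # illustrative placeholders.
          auto_complete_model_config.add_input(
              {"name": "INPUT0", "data_type": "TYPE_FP32", "dims": [4]})
          auto_complete_model_config.add_output(
              {"name": "OUTPUT0", "data_type": "TYPE_FP32", "dims": [4]})
          auto_complete_model_config.set_max_batch_size(8)
          return auto_complete_model_config

      def execute(self, requests):
          # Normal per-request inference handling would go here; omitted
          # in this sketch.
          raise NotImplementedError
  ```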

* [Decoupled API](https://github.com/triton-inference-server/server/blob/r22.06/docs/decoupled_models.md#python-model-using-python-backend)
  support in Python backend models is out of beta.

* Updated the I/O tensor
  [naming convention](https://github.com/triton-inference-server/server/blob/main/docs/model_configuration.md#special-conventions-for-pytorch-backend)
  for serving TorchScript models via the PyTorch backend.
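
  Under this convention, TorchScript model tensors served through the PyTorch
  backend are named `<name>__<index>`, so a `config.pbtxt` fragment might look
  like the following (shapes and dtypes are illustrative):

  ```protobuf
  input [
    {
      name: "INPUT__0"     # maps to the first positional input of forward()
      data_type: TYPE_FP32
      dims: [ 3, 224, 224 ]
    }
  ]
  output [
    {
      name: "OUTPUT__0"    # first tensor returned by forward()
      data_type: TYPE_FP32
      dims: [ 1000 ]
    }
  ]
  ```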

* Improvements to Perf Analyzer stability and profiling logic.

* Refer to the 22.06 column of the
  [Frameworks Support Matrix](https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html)
  for the container image versions on which the 22.06 inference server container is based.

## Known Issues

* The Perf Analyzer stability criteria have changed, which may result in
  reporting instability for scenarios that were previously considered stable.
  This change was made to improve the accuracy of Perf Analyzer results. If
  Perf Analyzer reports instability, it can be resolved by increasing
  `--measurement-interval` in time-windows mode or
  `--measurement-request-count` in count-windows mode.
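
  For example, assuming a model named `example_model` (hypothetical), the
  measurement window can be widened with either flag depending on the
  selected `--measurement-mode`:

  ```shell
  # Time-windows mode: widen each measurement window (milliseconds)
  perf_analyzer -m example_model --measurement-mode time_windows \
      --measurement-interval 10000

  # Count-windows mode: require more requests per measurement window
  perf_analyzer -m example_model --measurement-mode count_windows \
      --measurement-request-count 100
  ```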

* 22.06 is the last release that defaults to
  [TensorFlow version 1](https://github.com/triton-inference-server/tensorflow_backend/tree/r22.06#--backend-configtensorflowversionint).
  From 22.07 onwards, Triton will change the default TensorFlow version to 2.x.

* Triton pip wheels for Arm SBSA are not available from PyPI, so pip will
  install an incorrect Jetson version of Triton for Arm SBSA.

  The correct wheel file can be pulled directly from the Arm SBSA SDK image and
  installed manually.

* Traced models in PyTorch seem to create overflows when int8 tensor values are
  transformed to int32 on the GPU.

  Refer to issue [pytorch#66930](https://github.com/pytorch/pytorch/issues/66930)
  for more information.

* Triton cannot retrieve GPU metrics with MIG-enabled GPU devices (A100 and A30).

* Triton metrics might not work if the host machine is running a separate DCGM
  agent on bare metal or in a container.

* Running a PyTorch TorchScript model using the PyTorch backend, where multiple
  instances of the model are configured, can lead to a slowdown in model
  execution due to the following PyTorch issue:
  [pytorch#27902](https://github.com/pytorch/pytorch/issues/27902).

* Starting with 22.02, the Triton container, which uses the 22.02 or later
  PyTorch container, will report an error during model loading in the PyTorch
  backend when using scripted models that were exported in the legacy format
  (using our 19.09 or previous PyTorch NGC containers, corresponding to
  PyTorch 1.2.0 or earlier releases).

  To load the model successfully in Triton, you need to export the model again
  with a recent version of PyTorch.
