For more details, see the 10.15 GA release notes
- Added support for `RotaryEmbedding`, `RMSNormalization` and `TensorScatter` for improved LLM model support
- Added more specialized quantization ops for models quantized through TensorRT ModelOptimizer.
- Added `kREPORT_CAPABILITY_DLA` flag to enable per-node validation when building DLA engines through TensorRT.
- Added `kENABLE_PLUGIN_OVERRIDE` flag to enable TensorRT plugin override for nodes that share names with user plugins.
- Improved error reporting for models with multiple subgraphs, such as `Loop` or `Scan` nodes.
For more details, see the 10.14 GA release notes
- Added support for the `Attention` operator
- Improved refit for `ConstantOfShape` nodes
For more details, see the 10.13 GA release notes
- Decreased memory usage when importing models with external weights
- Added `loadModelProto`, `loadInitializer` and `parseModelProto` APIs for IParser. These APIs are meant to be used to load user initializers when parsing ONNX models.
- Added `loadModelProto`, `loadInitializer` and `refitModelProto` APIs for IParserRefitter. These APIs are meant to be used to load user initializers when refitting ONNX models.
- Deprecated `IParser::parseWithWeightDescriptors`.
- Unmarked `Protobuf` as a required dependency for building. If it is not found, the ONNX submodule will install it.
For more details, see the 10.12 GA release notes
- Added support for integer-typed base tensors for `Pow` operations
- Added support for custom `MXFP8` quantization operations
- Added support for ellipses, diagonal, and broadcasting in `Einsum` operations
For more details, see the 10.11 GA release notes
- Added `kENABLE_UINT8_AND_ASYMMETRIC_QUANTIZATION_DLA` parser flag to enable UINT8 asymmetric quantization on engines targeting DLA
- Removed restriction that inputs to `RandomNormalLike` and `RandomUniformLike` must be tensors
- Clarified limitations of scan outputs for `Loop` nodes
- Updated ONNX version to `1.18`
For more details, see the 10.10 GA release notes
- Cleaned up log spam when the ONNX network contained a mixture of Plugins and LocalFunctions
- UINT8 constants are now properly imported for QuantizeLinear & DequantizeLinear nodes
- Plugin fallback importer now also reads its namespace from a Node's domain field
For more details, see the 10.9 GA release notes
- Added support for Python AOT plugins
- Added support for opset 21 GroupNorm
- Fixed support for opset 18+ ScatterND
For more details, see the 10.8 GA release notes
- Added support for `FLOAT4E2M1` types for quantized networks
- Added support for dynamic axes and improved performance of `CumSum` operations
- Fixed the import of local functions when their input tensor names aliased one from an outside scope
- Added support for `Pow` ops with integer-typed exponent values
For more details, see the 10.7 GA release notes
- Now prioritizes using plugins over local functions when a corresponding plugin is available in the registry
- Added dynamic axes support for `Squeeze` and `Unsqueeze` operations
- Added support for parsing mixed-precision `BatchNormalization` nodes in strongly-typed mode
For more details, see the 10.6 GA release notes
- Updated ONNX submodule version to 1.17.0
- Fixed an issue where conditional layers were incorrectly being added
- Updated local function metadata to contain more information
- Added support for parsing nodes with Quickly Deployable Plugins
- Fixed handling of optional outputs
For more details, see the 10.5 GA release notes.
- Added support for real-valued `STFT` operations
- Improved error handling in `IParser`
For more details, see the 10.4 GA release notes.
- Added support for tensor `axes` for `Pad` operations
- Added support for `BlackmanWindow`, `HammingWindow`, and `HannWindow` operations
- Improved error handling in `IParserRefitter`
- Fixed kernel shape inference in multi-input convolutions
For more details, see the 10.3 GA release notes.
- Added support for tensor `axes` inputs for `Slice` nodes
- Updated `ScatterElements` importer to use an updated plugin
For more details, see the 10.2 GA release notes.
- Improved error handling with new macros and classes
- Minor changes to op importers for `GRU` and `Squeeze`
For more details, see the 10.1 GA release notes.
- Added `supportsModelV2` API
- Added support for `DeformConv` operation
- Added support for `PluginV3` TensorRT Plugins
- Marked all IParser and IParserRefitter APIs as `noexcept`
- Shape inputs can be passed to custom ops supported by `IPluginV3`-based plugins by indicating the input indices to be interpreted as shape inputs by a node attribute named `tensorrt_plugin_shape_input_indices`.
For more details, see the 10.0 GA release notes.
- Added support for building with `protobuf-lite`
- Fixed issue when parsing and refitting models with nested `BatchNormalization` nodes
- Added support for empty inputs in custom plugin nodes
For more details, see the 10.0 EA release notes.
- Added new class `IParserRefitter` that can be used to refit a TensorRT engine with the weights of an ONNX model
- `kNATIVE_INSTANCENORM` is now set to ON by default
- Added support for `IPluginV3` interfaces from TensorRT
- Added support for `INT4` quantization
- Added support for the `reduction` attribute in `ScatterElements`
- Added support for `wrap` padding mode in `Pad`
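To make the `wrap` padding mode concrete, here is a minimal pure-Python sketch of how the non-constant `Pad` modes pick source elements along one axis. The helper name `pad_1d` and its signature are hypothetical and are not part of the parser or of TensorRT; it only illustrates the index arithmetic, under the assumption that `wrap` treats the data as periodic and that `edge` and `reflect` behave as in the ONNX `Pad` specification.

```python
def pad_1d(values, before, after, mode):
    """Pad a 1-D list along its only axis (illustrative helper, not a TensorRT API)."""
    n = len(values)
    if n == 0:
        raise ValueError("cannot pad an empty sequence")

    def source_index(i):
        # Map a (possibly out-of-range) output position back to an input index.
        if mode == "wrap":
            # Treat the data as periodic: position n maps to 0, position -1 to n - 1.
            return i % n
        if mode == "edge":
            # Repeat the first or last element.
            return min(max(i, 0), n - 1)
        if mode == "reflect":
            # Mirror around the border elements without repeating them.
            if n == 1:
                return 0
            period = 2 * (n - 1)
            j = i % period
            return j if j < n else period - j
        raise ValueError(f"unsupported mode: {mode}")

    return [values[source_index(i)] for i in range(-before, n + after)]


print(pad_1d([1, 2, 3], 2, 2, "wrap"))     # [2, 3, 1, 2, 3, 1, 2]
print(pad_1d([1, 2, 3], 2, 2, "edge"))     # [1, 1, 1, 2, 3, 3, 3]
print(pad_1d([1, 2, 3], 2, 2, "reflect"))  # [3, 2, 1, 2, 3, 2, 1]
```

The same mapping applies independently on each padded axis of an N-D tensor.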
For more details, see the 9.3 GA release notes for the fixes since 9.2 GA.
- Added native support for `INT32` and `INT64` types for `ArgMin` and `ArgMax` nodes
- Fixed check for valid `zero_point` values in `QuantizeLinear` and `DequantizeLinear` nodes
For more details, see the 9.2 GA release notes for the fixes since 9.1 GA.
- Added support for `Hardmax`
- Fixed type inference for a few operators to use native ONNX types
For more details, see the 9.1 GA release notes for the fixes since 9.0 GA.
- Added new `ErrorCode` enums to improve error logging
- Added new members to `IParserError` to improve error logging
- Added static checkers when parsing nodes, resulting in better reporting of errors
For more details, see the 9.0 GA release notes for the fixes since 9.0 EA.
- Added support for FP8 and BF16 datatypes.
- Fixed a bug that previously caused `If` nodes to fail import due to branch output size mismatch
- Improved support for importing ONNX Local Functions
For more details, see the 9.0 EA release notes for the fixes since 8.6 GA.
- Added support for INT64 data type. The ONNX parser no longer automatically casts INT64 to INT32.
- Added support for ONNX local functions when parsing ONNX models with the ONNX parser.
- Breaking API Change: In TensorRT 9.0, due to the introduction of INT64 as a supported data type, ONNX models with INT64 I/O require INT64 bindings. Note that prior to this release, such models required INT32 bindings.
- Updated ONNX submodule to v1.14.0.
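As a rough illustration of what the INT64 binding change means for host-side buffers, the sketch below packs the same values as 32-bit and 64-bit integers using only the Python standard library. The helper `pack_host_buffer` is hypothetical and does not call TensorRT; it assumes a little-endian, densely packed buffer and only shows that an INT64 binding needs twice the bytes per element and can hold values that do not fit in INT32.

```python
import struct


def pack_host_buffer(values, dtype):
    """Pack integers into a little-endian byte buffer (illustrative helper only)."""
    fmt = {"int32": "i", "int64": "q"}[dtype]
    return struct.pack(f"<{len(values)}{fmt}", *values)


ids = [1, 2, 3]
print(len(pack_host_buffer(ids, "int32")))  # 12 bytes: 4 per element
print(len(pack_host_buffer(ids, "int64")))  # 24 bytes: 8 per element

# A value outside the INT32 range fits only in the INT64 buffer.
big = [2**40]
pack_host_buffer(big, "int64")
try:
    pack_host_buffer(big, "int32")
except struct.error as exc:
    print("does not fit in INT32:", exc)
```

Code that allocated buffers for INT64 model inputs or outputs with an INT32 element size before TensorRT 9.0 would need to size them for 8-byte elements instead.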
For more details, see the 8.6 GA release notes for the fixes since 8.6 EA.
- Renamed `kVERSION_COMPATIBLE` flag to `kNATIVE_INSTANCENORM`
- Added support for N-D `Trilu`
- Removed old LSTM importer
- Updated ONNX submodule to v1.13.1.
For more details, see the 8.6 EA release notes for new features added in TensorRT 8.6.
- Added support for `GroupNormalization`, `LayerNormalization`, `IsInf` operations
- Added support for INT32 input types for `ArgMin`, `ArgMax`, and `TopK`
- Added support for `ReverseSequence` operators with dynamic shapes
- Added support for `TopK` operators with dynamic `K` values
- Added `OnnxParserFlag` enum and `setFlag` interfaces to the ONNX parser to modify the default parsing behavior
- Added metadata tracking; ONNX node metadata is now embedded into TensorRT layers
- All cast operations will now use the new `CastLayer` over the previous `IdentityLayer`.
For more details, see the 8.5 GA release notes for new features added in TensorRT 8.5
- Added the `RandomNormal`, `RandomUniform`, `MeanVarianceNormalization`, `RoiAlign`, `Mod`, `Trilu`, `GridSample` and `NonZero` operations
- Added native support for the `NonMaxSuppression` operator
- Added support for importing ONNX networks with `UINT8` I/O types
- Fixed an issue with output padding with 1D deconv
- Fixed an issue when flattening 1D tensors
- Fixed an issue when parsing String attributes from TRT plugins
- Fixed an issue when importing `If` subgraphs with shared initializer names
- Fixed an issue when importing `Loop` subgraphs with `INT_MAX` trip counts
- Removed `onnx2trt` binary. See the README.md for alternative binaries to run ONNX models with TensorRT.
- Updated TensorRT version to 8.4.2
- Updated ONNX submodule version to 1.12
- Updated operators support documentation
- Fixed handling of no-op `Flatten` operations
- Fixed `allowZero` logic in Reshape operation
- Deprecated `onnx2trt` binary. This will be removed in the next release of TensorRT.
For more details, see the 8.4 GA release notes for new features added in TensorRT 8.4
- Added native FP16 support for importing and manipulating FP16 initializers
- Added support for `Shrink`
- Added support for `Xor`
- Added dynamic shape support for `ArgMax` and `ArgMin`
- Added dynamic shape support for `Range` for floating point types
- Fixed an issue in tensor name scoping in ONNX models with nested subgraphs
- Fixed misc issues when dealing with empty tensors
- Fixed the operations in the `Celu` importer function
- Removed unnecessary reshapes in the `GEMM` importer function
See the 8.2 EA release notes for new features added in TensorRT 8.2.
- Removed duplicate constant layer checks that caused some performance regressions
- Fixed expand dynamic shape calculations
- Added parser-side checks for Scatter layer support
- Added support for the following ONNX operators:
  - Einsum
  - IsNan
  - GatherND
  - Scatter
  - ScatterElements
  - ScatterND
  - Sign
  - Round
- Updated `Gather` and `GatherElements` implementations to natively support negative indices
- Updated `Pad` layer to support ND padding, along with `edge` and `reflect` padding mode support
- Updated `If` layer with general performance improvements.
- Overhauled resize operator, now fully supporting the following modes:
  - Coordinate Transformation modes: `half_pixel`, `pytorch_half_pixel`, `tf_half_pixel_for_nn`, `asymmetric`, and `align_corners`
  - Modes: `nearest`, `linear`
  - Nearest Modes: `floor`, `ceil`, `round_prefer_floor`, `round_prefer_ceil`
- QuantizeLinear/DequantizeLinear updates:
  - Added support for tensor scales
  - Added support for per-axis quantization
- Added support for multi-input ConvTranspose
- Added support for generic 2D padding
- Added experimental support for `NonMaxSuppression`
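The resize coordinate transformation and nearest modes listed above differ only in how an output position is mapped back to an input position and then rounded. The sketch below follows the formulas from the ONNX `Resize` operator specification for a single axis. The function names are hypothetical, the scale is assumed to be derived from the input and output lengths, and returning 0 for `align_corners` with a single output element is a choice made here to avoid a division by zero. It does not show how TensorRT implements these modes.

```python
import math


def to_original_coord(x_resized, length_original, length_resized, mode):
    """Map an output index to a (fractional) input coordinate along one axis."""
    scale = length_resized / length_original  # assumption: scale derived from sizes
    if mode == "half_pixel":
        return (x_resized + 0.5) / scale - 0.5
    if mode == "pytorch_half_pixel":
        return (x_resized + 0.5) / scale - 0.5 if length_resized > 1 else 0.0
    if mode == "tf_half_pixel_for_nn":
        return (x_resized + 0.5) / scale
    if mode == "asymmetric":
        return x_resized / scale
    if mode == "align_corners":
        if length_resized == 1:
            return 0.0  # guard chosen for this sketch
        return x_resized * (length_original - 1) / (length_resized - 1)
    raise ValueError(f"unsupported coordinate transformation mode: {mode}")


def nearest_index(x_original, nearest_mode):
    """Round a fractional input coordinate to an integer index (no range clamping)."""
    if nearest_mode == "floor":
        return math.floor(x_original)
    if nearest_mode == "ceil":
        return math.ceil(x_original)
    if nearest_mode == "round_prefer_floor":
        return math.ceil(x_original - 0.5)   # ties go down: 1.5 -> 1
    if nearest_mode == "round_prefer_ceil":
        return math.floor(x_original + 0.5)  # ties go up: 1.5 -> 2
    raise ValueError(f"unsupported nearest mode: {nearest_mode}")


# Upscaling an axis of length 2 to length 4 (scale factor 2).
print(to_original_coord(0, 2, 4, "half_pixel"))     # -0.25
print(to_original_coord(3, 2, 4, "asymmetric"))     # 1.5
print(to_original_coord(3, 2, 4, "align_corners"))  # 1.0
print(nearest_index(1.5, "round_prefer_floor"))     # 1
print(nearest_index(1.5, "round_prefer_ceil"))      # 2
```

A full resize would additionally clamp the resulting index to the valid input range before reading the source element.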
- Moved `RefitMap` API to core TensorRT.
- Added Datatype column to operators.md
- Added library only build target #659
- Added support for negative gather indices #681
- Added support for `DOUBLE`-typed inputs and weights through downcast to float #674
- Added support for optional plugin fields in FallbackPlugin path #676
- Updated license #657
- Fixed index offset calculation in GatherElements #675
- Clarified dynamic shape support for ReverseSequence
- Added opset13 support for `Softmax`, `LogSoftmax`, `Squeeze`, and `Unsqueeze`
- Added support for the `EyeLike` operator
- Added support for the `GatherElements` operator
- Added support for the `ReverseSequence` operator #590
- Updated `parse()` and `supportsModel()` API calls with an optional `model_path` parameter to support models with external weights #621
- Added support for the `Celu` operator
- Added support for the `CumSum` operator
- Added support for the `LessOrEqual` operator
- Added support for the `LpNormalization` operator
- Added support for the `LpPool` operator
- Added support for the `GreaterOrEqual` operator
- Added support for dynamic inputs in `onnx_tensorrt` python backend
- Added FAQ section for commonly asked questions
- Fixed relative path imports for models with external weights #619
- Fixed importing loop operators with no loop-carried dependencies #619
- Worked around unsupported BOOL concats through casting #620
- Fixed compilation error with GCC9 #568
- Removed `onnx_tensorrt/config.py` as it is no longer needed
- Added `setup.py` to properly install `onnx_tensorrt` python backend
- Added 4D transpose for ONNX weights #557
- Fixed slice computations for large slices #558
- Added support for parsing large models with external data
- Added API for interfacing with TensorRT's refit feature
- Updated `onnx_tensorrt` backend to support dynamic shapes
- Added support for 3D instance normalizations #515
- Improved clarity on the resize modes TRT supports #512
- Added Changelog
- Unified docker usage between ONNX-TensorRT and TensorRT.
- Removed deprecated docker files.
- Removed deprecated `setup.py`.