Uses changelog conventions. Uses semantic versioning.
- Changelogs are for humans, not machines.
- There should be an entry for every single version.
- The same types of changes should be grouped.
- Versions and sections should be linkable.
- The latest version comes first.
- The release date of each version is displayed.
- Mention whether you follow Semantic Versioning.

Types of changes:
- Added for new features.
- Changed for changes in existing functionality.
- Deprecated for soon-to-be removed features.
- Removed for now removed features.
- Fixed for any bug fixes.
- Security in case of vulnerabilities.

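
The grouping convention above can be illustrated with a short script. This is only a sketch: `CHANGE_TYPES` and `group_entries` are hypothetical names introduced here, and it assumes each entry starts with its change type as the first word (entries that do not are collected under "Other").

```python
from collections import defaultdict

# The six change types named by the changelog conventions above.
CHANGE_TYPES = ("Added", "Changed", "Deprecated", "Removed", "Fixed", "Security")


def group_entries(entries):
    """Group changelog bullet lines by their leading change-type word.

    Hypothetical helper for illustration; assumes the first word of each
    entry is its change type. Anything else goes under "Other".
    """
    grouped = defaultdict(list)
    for entry in entries:
        # Drop the leading "- " bullet marker and surrounding whitespace.
        text = entry.lstrip("- ").strip()
        first_word = text.split(" ", 1)[0] if text else ""
        key = first_word if first_word in CHANGE_TYPES else "Other"
        grouped[key].append(text)
    return dict(grouped)


entries = [
    "- Added t5-11b support",
    "- Fixed external onnx weight file name overwrite issue",
    "- Added beam search support for GPT2 demo",
    "- Enabled flexible control on percentile latency reports",
]
grouped = group_entries(entries)
for change_type, items in grouped.items():
    print(change_type, len(items))
```
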
- Deprecated the max workspace size flag in favor of memory pool limits for TensorRT
- Added t5-11b support
- Changed T5 demo KV cache TRT memory organization to avoid D2D copy
- Added beam search support for GPT2 demo
- Added KV cache support for GPT2 demo
- Fixed perplexity calculation array size exceeding max_length
- Fixed TRT KV cache engine profile to only accept input_length = 1
- Fixed external ONNX weight file name overwrite issue
- Added beam search support for T5 demo
- Added KV cache support for T5 demo
- Added perplexity calculation for all samples
- Added precision override to checkpoints.
- Fixed TensorRT BART checkpoint not working.
- Added beam search support for BART
- Added notebooks for BART demo
- Enabled flexible control over (a) percentile latency reports and (b) engine building profiles other than the standard maximum input/output length config
- Added KV cache support for BART demo
- Added BART demo
- Added `benchmark` action to T5 frameworks/onnxrt and GPT2 frameworks/trt for performance benchmarking. It uses random inputs with fixed lengths and disables early stopping such that we can compare the performance with other frameworks.
- Added `batch_size > 1` support to GPT2 trt sample.
- Added `benchmark` action to T5 trt for performance benchmarking. It uses random inputs with fixed lengths and disables early stopping such that we can compare the performance with other frameworks.
- Added `-o` or `--save-output-fpath` which saves a pickled version of the `NetworkResult` object. Useful for testing.
- Added initial working example of HF samples and notebooks.
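
The benchmark entries above describe timing with fixed-length random inputs and reporting percentile latencies. A minimal sketch of that idea follows. It is not the demo's actual implementation: `percentile`, `benchmark`, and all their parameters are hypothetical names, the nearest-rank percentile method is an assumption, and a plain Python callable stands in for a model.

```python
import math
import random
import time


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list, with pct in (0, 100]."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def benchmark(fn, input_len, iterations=20, vocab_size=1000,
              percentiles=(50, 99), seed=0):
    """Time fn on random token ids of a fixed length.

    Hypothetical harness: every iteration uses the same input length, so
    latencies are comparable across runs and across frameworks.
    """
    rng = random.Random(seed)
    latencies = []
    for _ in range(iterations):
        token_ids = [rng.randrange(vocab_size) for _ in range(input_len)]
        start = time.perf_counter()
        fn(token_ids)
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    return {p: percentile(latencies, p) for p in percentiles}


# Stand-in workload; a real run would call the model's generate step here.
report = benchmark(lambda ids: sum(ids), input_len=8, iterations=5)
for pct, seconds in report.items():
    print(f"p{pct}: {seconds * 1e6:.1f} us")
```

Fixing the input length and the iteration count is what makes the percentiles meaningful: with early stopping enabled, each run could generate a different number of tokens, so latencies would not be comparable.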