Add trt decoder #307
Conversation
- Add `trt_decoder` class implementing TensorRT-accelerated inference
- Support both ONNX model loading and pre-built engine loading
- Include precision configuration (fp16, bf16, int8, fp8, tf32, best)
- Add hardware platform detection for capability-based precision selection
- Implement CUDA memory management and stream-based execution
- Add Python utility script for ONNX to TensorRT engine conversion
- Update CMakeLists.txt to build TensorRT decoder plugin
- Add comprehensive parameter validation and error handling
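To make the "capability-based precision selection" bullet concrete, here is a minimal sketch of what such a selection policy might look like. This is an illustration only, not the plugin's actual code: the function name, the compute-capability thresholds, and the preference order are assumptions.

```python
def select_precision(requested, compute_capability):
    """Pick a supported precision for a given GPU compute capability.

    Hypothetical sketch: thresholds below are illustrative assumptions,
    not the trt_decoder's actual detection logic.
    """
    major, minor = compute_capability
    supported = {"fp32", "tf32"}
    if major >= 7:
        supported.add("fp16")
    if major >= 8:
        supported.update({"bf16", "int8"})
    if (major, minor) >= (8, 9):
        supported.add("fp8")

    if requested == "best":
        # Prefer the narrowest supported type for speed.
        for p in ("fp8", "bf16", "fp16", "tf32", "fp32"):
            if p in supported:
                return p
    if requested in supported:
        return requested
    return "fp32"  # conservative fallback when the request is unsupported
```

For example, requesting `"best"` on an Ada-class GPU (compute capability 8.9) would select `fp8` under these assumptions, while requesting `fp16` on a pre-Volta GPU would fall back to `fp32`.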
Signed-off-by: Scott Thornton <[email protected]>
…trix) Signed-off-by: Scott Thornton <[email protected]>
…ecoder model, added to unittest Signed-off-by: Scott Thornton <[email protected]>
…some of the test cases (more to come) Signed-off-by: Scott Thornton <[email protected]>
Thanks for all the hard work on this @wsttiger! If anyone has any additional review comments, we can address them post-merge.
```cmake
message(STATUS "TensorRT ONNX parser: ${TENSORRT_ONNX_LIBRARY}")
target_compile_definitions(${MODULE_NAME} PRIVATE TENSORRT_AVAILABLE)
else()
message(WARNING "TensorRT not found. Building decoder without TensorRT support.")
```
I don't think the build succeeds if TensorRT is not installed. At least it doesn't on my machine. Is the whole build supposed to fail if TRT is not found? I would advocate for making this a top-level CMake flag.
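The suggested top-level switch could look something like the sketch below. This is only an illustration of the idea: the option name `CUDAQX_QEC_ENABLE_TRT` and the `find_package(TensorRT)` module are assumptions, not what the PR actually defines.

```cmake
# Hypothetical sketch of a top-level opt-in/opt-out flag for the plugin.
option(CUDAQX_QEC_ENABLE_TRT "Build the TensorRT decoder plugin" ON)

if(CUDAQX_QEC_ENABLE_TRT)
  find_package(TensorRT)  # assumed find module name
  if(TensorRT_FOUND)
    target_compile_definitions(${MODULE_NAME} PRIVATE TENSORRT_AVAILABLE)
  else()
    # Skip the plugin entirely instead of warning and then failing later
    # on missing TensorRT headers.
    message(STATUS "TensorRT not found; skipping trt_decoder plugin.")
  endif()
endif()
```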
Is this resolved by #331?
> Is this resolved by #331?
Sort of. #331 allows the user to explicitly disable the TRT decoder at CMake time, but if it is left enabled (the default), there is still a build failure about missing include files, even though this message makes it sound like the decoder would simply be built without TensorRT support.
Got it. Makes sense now.
```
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
```
Is this .onnx file for CI testing only, or are users supposed to download it as well?
```yaml
- name: Install TensorRT (arm64)
  if: matrix.platform == 'arm64'
  run: |
    apt-cache search tensorrt | awk '{print "Package: "$1"\nPin: version *+cuda13.0\nPin-Priority: 1001\n"}' | tee /etc/apt/preferences.d/tensorrt-cuda13.0.pref > /dev/null
```
This line is installing CUDA 13 regardless of what ${{matrix.cuda_version}} is, so it is installing it in our 12.6 images, too. I believe (?) this should not be installed for CUDA 12.6 because we are not supporting CUDA 12 + ARM for this, right?
Point of reference: https://github.com/NVIDIA/cudaqx/actions/runs/18989982159/job/54240883357#step:12:41 shows the CUDA 13 version being installed in AR CUDA 12.6. (I found this because our GitLab pipeline is broken for ARM right now, and I am still investigating.)
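One way to address this would be to gate the pin file on the matrix CUDA version. The sketch below is only an illustration of that idea, using a plain shell variable `CUDA_VERSION` in place of `${{ matrix.cuda_version }}` and a canned package list in place of the real `apt-cache search` output.

```shell
# Hypothetical sketch: only write the CUDA 13 TensorRT pin file when the
# workflow's CUDA version is actually 13.x.
CUDA_VERSION="12.6"   # stand-in for ${{ matrix.cuda_version }}

gen_pins() {
  # Mimic `apt-cache search tensorrt` output with fixed package names,
  # then emit one apt preference stanza per package.
  printf '%s\n' "libnvinfer10 - TensorRT runtime" "tensorrt - meta package" |
    awk -v ver="$1" '{print "Package: "$1"\nPin: version *+cuda"ver"\nPin-Priority: 1001\n"}'
}

if [ "${CUDA_VERSION%%.*}" -ge 13 ]; then
  gen_pins "13.0"   # in CI this would be piped to /etc/apt/preferences.d/
else
  echo "Skipping TensorRT CUDA 13 pin for CUDA ${CUDA_VERSION}"
fi
```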
Add TensorRT Decoder Plugin for Quantum Error Correction

Overview

This PR introduces a new TensorRT-based decoder plugin for quantum error correction, leveraging NVIDIA TensorRT for accelerated neural network inference in QEC applications.

Key Features

Technical Implementation

- `trt_decoder` class implementing the `decoder` interface with a TensorRT backend

Files Added/Modified

- `libs/qec/include/cudaq/qec/trt_decoder_internal.h` - Internal API declarations
- `libs/qec/lib/decoders/plugins/trt_decoder/trt_decoder.cpp` - Main decoder implementation
- `libs/qec/lib/decoders/plugins/trt_decoder/CMakeLists.txt` - Plugin build configuration
- `libs/qec/python/cudaq_qec/plugins/tensorrt_utils/build_engine_from_onnx.py` - Python utility
- `libs/qec/unittests/test_trt_decoder.cpp` - Comprehensive unit tests

Testing

Usage Example

Dependencies

Performance Benefits

This implementation provides a production-ready TensorRT decoder plugin that can significantly accelerate quantum error correction workflows while maintaining compatibility with the existing CUDA-Q QEC framework.