perf(autoware_tensorrt_plugins): keep plugin outputs stream-ordered #12556

Draft
mojomex wants to merge 4 commits into autowarefoundation:main from mojomex:perf/trt-plugins-no-unnecessary-syncs

mojomex (Contributor) commented May 7, 2026

Summary

Builds on the allocation-free variant by removing unnecessary synchronization and keeping plugin outputs stream-ordered.

This draft PR corresponds to the benchmarked variant ptv3-t18-no-thrust-no-alloc-no-sync-13f3672a0-20260506.
All PRs in this cohort target main; each later PR contains the changes benchmarked in the earlier ones.

Cohort

Benchmarks

Source report: reports/2026-05-07_22-20-12/report.md

Total Latency Summary

| Variant | Measurement | CPU mean (ms) | CPU p95 (ms) | CPU faster vs baseline | GPU mean (ms) | GPU p95 (ms) | GPU faster vs baseline | Mean voxels |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ptv3-t18 | series (7 x 50) | 29.117 | 31.755 | +0.0% | 28.020 | 30.598 | +0.0% | 120572 |
| ptv3-t18-no-thrust-c8f76ed-20260506 | series (7 x 50) | 27.559 | 28.502 | +5.7% | 26.416 | 27.306 | +6.1% | 120572 |
| ptv3-t18-no-thrust-no-alloc-e9515b790-20260506 | series (7 x 50) | 26.874 | 28.863 | +8.3% | 25.741 | 27.665 | +8.9% | 120572 |
| ptv3-t18-no-thrust-no-alloc-no-sync-13f3672a0-20260506 | series (7 x 50) | 26.278 | 27.471 | +10.8% | 25.140 | 26.288 | +11.5% | 120572 |
| ptv3-t18-no-thrust-no-alloc-no-sync-maxnumel-47bf5656f-20260506 | series (7 x 50) | 26.214 | 26.642 | +11.1% | 25.084 | 25.488 | +11.7% | 120572 |
| ptv3-t18-no-thrust-no-alloc-no-sync-maxnumel-maxauxstreams1 | series (7 x 50) | 25.378 | 25.984 | +14.7% | 24.230 | 24.792 | +15.6% | 120572 |
| ptv3-t18-no-thrust-no-alloc-no-sync-maxnumel-maxauxstreams3 | series (7 x 50) | 26.809 | 28.083 | +8.6% | 25.614 | 26.849 | +9.4% | 120572 |
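
The report excerpt does not define the "faster vs baseline" columns. The values above are consistent with a speedup ratio: baseline mean divided by variant mean, minus one. For example, 29.117 / 26.278 - 1 ≈ 10.8%, whereas a plain latency reduction (1 - 26.278 / 29.117) would give about 9.8%. A minimal C++ sketch under that assumption; the function names are hypothetical and not taken from the benchmark tooling:

```cpp
#include <cassert>
#include <cmath>

// Speedup of a variant relative to the baseline, in percent.
// ASSUMPTION: "faster vs baseline" = (baseline_mean / variant_mean - 1) * 100.
// This definition is inferred from the table values, not stated in the report.
double faster_vs_baseline_percent(double baseline_ms, double variant_ms)
{
  // Latencies must be positive for the ratio to be meaningful.
  assert(baseline_ms > 0.0 && variant_ms > 0.0);
  return (baseline_ms / variant_ms - 1.0) * 100.0;
}

// Round to one decimal place, matching the precision shown in the table.
double round1(double x)
{
  return std::round(x * 10.0) / 10.0;
}
```

For the variant in this PR, `round1(faster_vs_baseline_percent(29.117, 26.278))` yields 10.8 for the CPU mean and `round1(faster_vs_baseline_percent(28.020, 25.140))` yields 11.5 for the GPU mean, matching the table row.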

Relative Performance

[Image: relative performance graph]

github-actions Bot commented May 7, 2026

Thank you for contributing to the Autoware project!

🚧 If your pull request is in progress, switch it to draft mode.

Please ensure:

@github-actions github-actions Bot added type:documentation Creating or refining documentation. (auto-assigned) component:perception Advanced sensor data processing and environment understanding. (auto-assigned) component:sensing Data acquisition from sensors, drivers, preprocessing. (auto-assigned) component:planning Route planning, decision-making, and navigation. (auto-assigned) labels May 7, 2026
@github-actions github-actions Bot added component:control Vehicle control algorithms and mechanisms. (auto-assigned) component:system System design and integration. (auto-assigned) component:vehicle Vehicle-specific implementations, drivers, packages. (auto-assigned) type:ci Continuous Integration (CI) processes and testing. (auto-assigned) component:common Common packages from the autoware-common repository. (auto-assigned) component:simulation Virtual environment setups and simulations. (auto-assigned) component:evaluator Evaluation tools for planning, localization etc. (auto-assigned) labels May 7, 2026
@mojomex mojomex force-pushed the perf/trt-plugins-no-unnecessary-syncs branch from 6fc4984 to e20caa4 Compare May 7, 2026 14:24
@github-actions github-actions Bot removed type:documentation Creating or refining documentation. (auto-assigned) component:sensing Data acquisition from sensors, drivers, preprocessing. (auto-assigned) component:planning Route planning, decision-making, and navigation. (auto-assigned) component:control Vehicle control algorithms and mechanisms. (auto-assigned) component:system System design and integration. (auto-assigned) component:vehicle Vehicle-specific implementations, drivers, packages. (auto-assigned) type:ci Continuous Integration (CI) processes and testing. (auto-assigned) component:common Common packages from the autoware-common repository. (auto-assigned) labels May 7, 2026
@github-actions github-actions Bot removed component:simulation Virtual environment setups and simulations. (auto-assigned) component:evaluator Evaluation tools for planning, localization etc. (auto-assigned) labels May 7, 2026

Labels

component:perception Advanced sensor data processing and environment understanding. (auto-assigned)

Projects

Status: To Triage
