Skip to content

feat(autoware_cuda_pointcloud_preprocessor): pointcloud concatenation #10300

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

knzo25
Copy link
Contributor

@knzo25 knzo25 commented Mar 19, 2025

Description

NOTE: #10298 needs to be merged before this PR
The beforementioned PR templated the pointcloud concatenation, which this PR exploits to implement the cuda version is as little code duplication as possible.

This is part of the series of PR related to #9722

List of PRs:

Depending on your machine and how many nodes are in a container, the following branch may also be required:
https://github.com/knzo25/launch_ros/tree/fix/load_composable_node
There seems to be a but in ROS where if you send too many services at once some will be lost and ros_launch can not handle that.

Related links

Parent Issue:

How was this PR tested?

The sensing/perception pipeline was tested until centerpoint for TIER IV's taxi using the logging simulator.
The following tests were executed in a laptop equipped with a RTX 4060 (laptop) GPU and a Intel(R) Core(TM) Ultra 7 165H (22 cores)

Node / processing time [ms] Current PR
/sensing/lidar/top/crop_box_filter_self/debug/processing_time_ms 5.81 N/A
/sensing/lidar/top/crop_box_filter_mirror/debug/processing_time_ms 4.59 N/A
/sensing/lidar/top/distortion_corrector/debug/processing_time_ms 10.96 N/A
/sensing/lidar/top/ring_outlier_filter/debug/processing_time_ms 10.69 N/A
/sensing/lidar/top/cuda_pointcloud_preprocessor/debug/processing_time_ms N/A 3.08
(2.03 are H->D copies)
/sensing/lidar/concatenate_data_synchronizer/debug/processing_time_ms 7.83 0.70
Total 38.8 3.78

10.26 speedup!

Notes for reviewers

The main branch that I used for development is feat/cuda_acceleration_and_transport_layer2.
However, the changes were too big so I split the PRs. That being said, development, if any will still be on that branch (and then cherrypicked to the respective PRs), and the review changes will be cherrypicked into the development branch.

Notes for reviewers

None.

Interface changes

None.

Effects on system behavior

None.

@knzo25 knzo25 requested a review from amadeuszsz March 19, 2025 01:17
@knzo25 knzo25 self-assigned this Mar 19, 2025
@github-actions github-actions bot added type:documentation Creating or refining documentation. (auto-assigned) component:sensing Data acquisition from sensors, drivers, preprocessing. (auto-assigned) labels Mar 19, 2025
Copy link

github-actions bot commented Mar 19, 2025

Thank you for contributing to the Autoware project!

🚧 If your pull request is in progress, switch it to draft mode.

Please ensure:

Copy link
Contributor

@amadeuszsz amadeuszsz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your continued work on CUDA integration into pointcloud preprocessor!

  • Apart from comments, there is an error while passing host msgs to concatenated pointcloud:
[cuda_concatenate_and_time_sync_node-1] [WARN 1742948909.196968462] [cuda_concatenate_and_time_sync_node]: The compatible callback was called. This results in a performance loss. This behavior is probably not intended or a temporal measure (compatibleCallback() at /workspace/src/universe/external/cuda_blackboard/src/cuda_blackboard_subscriber.cpp:121)
[cuda_concatenate_and_time_sync_node-1] terminate called after throwing an instance of 'std::runtime_error'
[cuda_concatenate_and_time_sync_node-1]   what():  cudaErrorIllegalAddress (700)@/workspace/src/universe/external/cuda_blackboard/include/cuda_blackboard/cuda_unique_ptr.hpp#L50: an illegal memory access was encountered

If this node is not supposed to work with host messages, please address this use case.

  • Passing cuda messages results with:
[cuda_pointcloud_preprocessor_node-3] [INFO 1742955625.198786715] [right.cuda_pointcloud_preprocessor]: Negotiating (negotiate() at /workspace/src/universe/external/negotiated/negotiated/src/negotiated_publisher.cpp:610)                                                       
[cuda_pointcloud_preprocessor_node-1] [INFO 1742955625.200537233] [top.cuda_pointcloud_preprocessor]: Negotiating (negotiate() at /workspace/src/universe/external/negotiated/negotiated/src/negotiated_publisher.cpp:610)                                                         
[cuda_pointcloud_preprocessor_node-2] [INFO 1742955625.202174008] [left.cuda_pointcloud_preprocessor]: Negotiating (negotiate() at /workspace/src/universe/external/negotiated/negotiated/src/negotiated_publisher.cpp:610)                                                        
[cuda_pointcloud_preprocessor_node-3] [INFO 1742955625.275429451] [right.cuda_pointcloud_preprocessor]: Negotiating (negotiate() at /workspace/src/universe/external/negotiated/negotiated/src/negotiated_publisher.cpp:610)                                                       
[cuda_pointcloud_preprocessor_node-3] [INFO 1742955625.275482867] [right.cuda_pointcloud_preprocessor]: Could not negotiate (negotiate() at /workspace/src/universe/external/negotiated/negotiated/src/negotiated_publisher.cpp:633)                                               
[cuda_pointcloud_preprocessor_node-1] [INFO 1742955625.280224024] [top.cuda_pointcloud_preprocessor]: Negotiating (negotiate() at /workspace/src/universe/external/negotiated/negotiated/src/negotiated_publisher.cpp:610)                                                         
[cuda_pointcloud_preprocessor_node-1] [INFO 1742955625.280280914] [top.cuda_pointcloud_preprocessor]: Could not negotiate (negotiate() at /workspace/src/universe/external/negotiated/negotiated/src/negotiated_publisher.cpp:633)                                                 
[cuda_pointcloud_preprocessor_node-2] [INFO 1742955625.280701652] [left.cuda_pointcloud_preprocessor]: Negotiating (negotiate() at /workspace/src/universe/external/negotiated/negotiated/src/negotiated_publisher.cpp:610)                                                        
[cuda_pointcloud_preprocessor_node-2] [INFO 1742955625.280758089] [left.cuda_pointcloud_preprocessor]: Could not negotiate (negotiate() at /workspace/src/universe/external/negotiated/negotiated/src/negotiated_publisher.cpp:633)
  • I tried to check how it works, but due to the problems mentioned above it is impossible. I cherry-picked commits from parent PR, but I'm not sure if my custom launch file is correct. Could you please update "Notes for reviewers" section and prepare launch file for reviewing purposes which:
    • Run nebula.
    • Run cuda pointcloud preprocessor for top, right and left sensor.
    • Run cuda concatenate.

Of course I did it, but seems it doesn't work.

  • Slightly out of scope, but I see some high delays for cuda pointcloud preprocessor, is this because of how cuda allocates resources? If so, are we OK to just ignore it?
    Screenshot from 2025-03-26 09-18-25

knzo25 added 7 commits March 28, 2025 19:32
Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
…e as the non-gpu node

Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
@knzo25 knzo25 requested a review from amadeuszsz March 28, 2025 10:37
@knzo25
Copy link
Contributor Author

knzo25 commented Mar 28, 2025

@amadeuszsz
As discussed in person, the data you showed me was simply not covered by the algorithm (the problem also appeared in the base implementation). Can you update your comments based on this new evidence?

Copy link
Contributor

@amadeuszsz amadeuszsz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@knzo25 knzo25 added the run:build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) label Apr 22, 2025
knzo25 added 5 commits April 22, 2025 14:04
Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
Copy link

codecov bot commented Apr 22, 2025

Codecov Report

Attention: Patch coverage is 0% with 140 lines in your changes missing coverage. Please review.

Project coverage is 24.81%. Comparing base (1a1d41b) to head (a17d858).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
...da_concatenate_data/cuda_combine_cloud_handler.cpp 0.00% 114 Missing ⚠️
...enate_data/cuda_concatenate_and_time_sync_node.cpp 0.00% 15 Missing ⚠️
...catenate_data/cuda_combine_cloud_handler_kernel.cu 0.00% 7 Missing ⚠️
...enate_data/cuda_concatenate_and_time_sync_node.hpp 0.00% 3 Missing ⚠️
...da_concatenate_data/cuda_combine_cloud_handler.hpp 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10300      +/-   ##
==========================================
- Coverage   24.93%   24.81%   -0.13%     
==========================================
  Files        1340     1336       -4     
  Lines      104860   103769    -1091     
  Branches    39681    39322     -359     
==========================================
- Hits        26151    25747     -404     
+ Misses      76226    75584     -642     
+ Partials     2483     2438      -45     
Flag Coverage Δ *Carryforward flag
differential-cuda 0.00% <0.00%> (?)
total 24.95% <ø> (+0.01%) ⬆️ Carriedforward from b7b7fc5

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@knzo25 knzo25 merged commit a6d5003 into autowarefoundation:main Apr 22, 2025
34 of 36 checks passed
@github-project-automation github-project-automation bot moved this from To Triage to Done in Software Working Group Apr 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:sensing Data acquisition from sensors, drivers, preprocessing. (auto-assigned) run:build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) type:documentation Creating or refining documentation. (auto-assigned)
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants