Make PostUpdate run serially #3512
Open
luca-della-vedova wants to merge 1 commit into gazebosim:main from
Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>
🎉 New feature
Remove parallel PostUpdate runs!
Summary
While profiling Gazebo worlds I noticed a major bottleneck in synchronization primitives. Digging a bit deeper, I found that it was caused by the parallel PostUpdates and the synchronization between threads in the Barrier implementation.
I investigated and benchmarked three different approaches:
gz::common::WorkerPool in this branch.
Test it
To be honest, I struggled to come up with a scenario where parallel PostUpdates would help. The ideal case would be a world with many PostUpdates that each do heavy work; however, looking at the codebase, most systems seem to do fairly trivial work on PostUpdate.
The benchmark I came up with instantiates a large number (200) of TouchPlugin systems on static entities. The plugin does a decent amount of work in its PostUpdate function (which calls this), but if anyone can come up with a better benchmark I'm happy to change it.
The script I used to create the world is here.
I ran both worlds with Bullet Featherstone and didn't cap the RTF, so they would run as fast as possible:
Results
Tests were run on a cloud machine with many CPU cores but no GPU. Results are in % of RTF.
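For readers unfamiliar with the metric, RTF (real-time factor) is the ratio of simulated time to wall-clock time, so 100% means the simulation keeps pace with real time. A minimal sketch of the calculation follows; the function name RtfPercent is hypothetical and is not part of Gazebo:

```cpp
#include <chrono>

// Hypothetical helper (not a Gazebo API): real-time factor as a percentage.
// RTF = simulated time elapsed / wall-clock time elapsed.
double RtfPercent(std::chrono::duration<double> simTime,
                  std::chrono::duration<double> wallTime)
{
  return 100.0 * simTime.count() / wallTime.count();
}
```

With this definition, 10 s of simulated time completed in 5 s of wall-clock time is 200% RTF.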
Discussion
Apart from the clear result that the current barrier-based implementation is too inefficient, the performance difference on a real world between the WorkerPool-based approach and a naive serial PostUpdate is fairly small and almost within noise.
For a stress test, however, where we have a lot of PostUpdate systems, the overhead of thread scheduling is just too high, and a naive serial implementation (this PR) is a lot faster, as well as being a lot simpler to reason about.
Of course this result is due to the design of the benchmark, which is why I'm happy to hear about any case where there might be numerous expensive PostUpdates that can be parallelized that I should test.
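To illustrate the trade-off discussed above, here is a minimal sketch of the two dispatch styles. It is not the code in this PR or in Gazebo: PostUpdateFn, RunSerial and RunParallel are hypothetical names, and the parallel variant spawns a thread per callback on every call, which is a cruder scheme than persistent threads synchronized with a barrier. It only shows where the per-iteration synchronization cost comes from.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for a system's PostUpdate callback.
using PostUpdateFn = std::function<void()>;

// Serial dispatch: call every callback in order on the calling thread.
// There is no synchronization at all.
void RunSerial(const std::vector<PostUpdateFn> &callbacks)
{
  for (const auto &cb : callbacks)
    cb();
}

// Parallel dispatch (assumption: one thread per callback, for illustration).
// The joins act as the per-iteration synchronization point: the caller
// cannot continue until every thread has been scheduled and has finished.
void RunParallel(const std::vector<PostUpdateFn> &callbacks)
{
  std::vector<std::thread> threads;
  threads.reserve(callbacks.size());
  for (const auto &cb : callbacks)
    threads.emplace_back(cb);
  for (auto &t : threads)
    t.join();
}
```

When each callback does only trivial work, the time spent creating, scheduling and joining threads in RunParallel can exceed the time spent in the callbacks themselves, which is consistent with the stress-test observation above. Callbacks passed to RunParallel must not touch shared state without their own locking.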
Checklist
codecheck passed (See contributing)
Generated-by: This PR wasn't, but the benchmark and the alternative implementations were generated by Gemini 3.0
Note to maintainers: Remember to use Squash-Merge and edit the commit message to match the pull request summary while retaining Signed-off-by and Generated-by messages.