fix: Reduce idle CPU consumption of websocket server#1040
fix: Reduce idle CPU consumption of websocket server#1040
Conversation
sea-bass
left a comment
There was a problem hiding this comment.
Nice!
Re: using EventsExecutor, maybe we could create some kind of feature flag (based on environment variables or something), just so it can be tested by users? Separate PR if you want to try this though.
| executor = rclpy.executors.SingleThreadedExecutor() | ||
| executor.add_node(node) | ||
| while rclpy.ok() and not shutdown_event.is_set(): | ||
| executor.spin_once(timeout_sec=1.0) |
There was a problem hiding this comment.
Worried about this timeout being a hard-coded value, although it is just one spin...
There was a problem hiding this comment.
We need some timeout to make sure the thread wakes up once in a while to check for shutdown event. Alternatively, I can try running executor.spin() and calling executor.shutdown() from the main thread.
There was a problem hiding this comment.
I tried that and now one of the tests is timing out. I'm changing this PR to draft until I'll figure this out
There was a problem hiding this comment.
The tests are timing out due to bugs in rclpy. I submitted a PR with the fix in ros2/rclpy#1469.
| binary = False | ||
|
|
||
| _io_loop.add_callback(partial(self.prewrite_message, message, binary)) | ||
| asyncio.run_coroutine_threadsafe(self.prewrite_message(message, binary), cls.event_loop) |
There was a problem hiding this comment.
If event_loop is None (its default value), will this error? Should we check?
There was a problem hiding this comment.
It probably will, the same with node_handle. I'm not happy with passing arguments through class variables but I just tried to adhere to how it is currently done.
We could just add a ROS parameter that enables it. Should be fairly easy, I might create a PR after this. |
|
Hey @bjsowa, are you still interested in addressing this? I'd be happy to contribute some of my or my team's time to testing or otherwise helping finish this up - we've got a customer who is looking to reduce their overall system load from our application, and rosbridge is a big culprit. |
I am, but this requires my fixes in rclpy (ros2/rclpy#1469) which I am actively trying to get merged. |
f0ac970 to
4875834
Compare
0810c3c to
3916b0f
Compare
|
@EzraBrooks My fixes in rclpy have been merged. Once (or if) they are backported to Jazzy and Kilted I can revisit this PR |
|
@Carter12s ☝🏻 since you and I were just talking last night about this problem |
|
Bumping so I get notifications (I'm now at PickNik - the company that originally submitted this via @EzraBrooks). Thanks for your work on this, @bjsowa! |
|
Looks like ros2/rclpy#1469 was merged and backported! |
|
It's merged for |
|
Also, |
|
|
Public API Changes
None
Description
This PR removes the use of tornado.ioloop and instead uses the
asyncioevent loop interface, as recommended in tornado documentation since version 6.0.The ROS spinning is now done in a separate thread instead of a periodic async callback.
SIGINT and SIGTERM signal handling is now done manually to ensure graceful shutdown.
In my case, the idle CPU consumption went from 9% to 0%.
With a client connected, the consumption went from 38% to 34%.
This is still quite high but a good step forward. I tried improving it further by using EventsExecutor and the consumption with the client connected went from 34% to 18% which is huge but I'm reluctant to use it as it is experimental.