Description
Describe the bug
When several controllers are started simultaneously using the spawner, one or more of them fail to start.
To Reproduce
Steps to reproduce the behavior:
- Start our launch file with several controllers (in our case, 16 controllers)
- See the following errors:
[ros2_control_node-2] [ERROR] [1733859175.169085071] [controller_manager]: A controller named 'my_controller_1' was already loaded inside the controller manager
[ros2_control_node-2] [ERROR] [1733859175.491164152] [controller_manager]: A controller named 'my_controller_2' was already loaded inside the controller manager
[ros2_control_node-2] [ERROR] [1733859175.491519034] [controller_manager]: A controller named 'my_controller_3' was already loaded inside the controller manager
[ros2_control_node-2] [ERROR] [1733859175.492405566] [controller_manager]: A controller named 'my_controller_1' was already loaded inside the controller manager
[my_controller_1_spawner-13] [FATAL] [1733859175.539516796] [spawner_my_controller_1]: Failed loading controller my_controller_1
[my_controller_2_spawner-9] [FATAL] [1733859175.577969123] [spawner_my_controller_2]: Failed loading controller my_controller_2
[my_controller_3_spawner-11] [FATAL] [1733859175.578179280] [spawner_my_controller_3]: Failed loading controller my_controller_3
[ERROR] [my_controller_3_spawner-11]: process has died [pid 3598950, exit code 1, cmd 'ros2 run controller_manager spawner my_controller_3 --param-file /tmp/my_controller_3.yaml --controller-manager-timeout 40 --ros-args --log-level ERROR --log-file-name my_controller_3_spawner'].
[ERROR] [my_controller_2_spawner-9]: process has died [pid 3598944, exit code 1, cmd 'ros2 run controller_manager spawner my_controller_2 --param-file /tmp/my_controller_2.yaml --controller-manager-timeout 40 --ros-args --log-level ERROR --log-file-name my_controller_2_spawner'].
[ERROR] [my_controller_1_spawner-13]: process has died [pid 3598954, exit code 1, cmd 'ros2 run controller_manager spawner my_controller_1 --param-file /tmp/my_controller_1.yaml --controller-manager-timeout 40 --ros-args --log-level ERROR --log-file-name my_controller_1_spawner'].
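To make the pattern in the log above easier to see, a small script along these lines could tally the errors per controller. This is only a sketch: the two regular expressions and the names `summarize` and `SAMPLE` are assumptions derived from the log lines quoted in this report, not part of ros2_control.

```python
import re
from collections import Counter

# Assumption: these patterns match the message formats seen in the log above.
ALREADY_LOADED = re.compile(r"A controller named '([^']+)' was already loaded")
SPAWNER_FAILED = re.compile(r"Failed loading controller (\S+)")


def summarize(log_text):
    """Return (already-loaded error counts, sorted names of failed spawners)."""
    already_loaded = Counter(ALREADY_LOADED.findall(log_text))
    failed = sorted(set(SPAWNER_FAILED.findall(log_text)))
    return already_loaded, failed


# Excerpt of the log from this report (controller manager errors and spawner failures).
SAMPLE = """\
[ros2_control_node-2] [ERROR] [1733859175.169085071] [controller_manager]: A controller named 'my_controller_1' was already loaded inside the controller manager
[ros2_control_node-2] [ERROR] [1733859175.491164152] [controller_manager]: A controller named 'my_controller_2' was already loaded inside the controller manager
[ros2_control_node-2] [ERROR] [1733859175.491519034] [controller_manager]: A controller named 'my_controller_3' was already loaded inside the controller manager
[ros2_control_node-2] [ERROR] [1733859175.492405566] [controller_manager]: A controller named 'my_controller_1' was already loaded inside the controller manager
[my_controller_1_spawner-13] [FATAL] [1733859175.539516796] [spawner_my_controller_1]: Failed loading controller my_controller_1
[my_controller_2_spawner-9] [FATAL] [1733859175.577969123] [spawner_my_controller_2]: Failed loading controller my_controller_2
[my_controller_3_spawner-11] [FATAL] [1733859175.578179280] [spawner_my_controller_3]: Failed loading controller my_controller_3
"""

if __name__ == "__main__":
    counts, failed = summarize(SAMPLE)
    for name, count in sorted(counts.items()):
        print(f"{name}: reported as already loaded {count} time(s)")
    print("Spawners that failed:", ", ".join(failed))
```

On the excerpt above, 'my_controller_1' is reported as already loaded twice, while 'my_controller_2' and 'my_controller_3' are each reported once.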
Expected behavior
All controllers come up without errors.
Screenshots
N/A
Environment (please complete the following information):
- OS: Ubuntu 24.04
- Version: rolling sync from 2024-11-26
- ros2_control version: 4.20.0
Additional context
The number of controllers that fail to load is random. Out of our 16 controllers, 7 are instances of one particular type of custom controller. Anecdotally, it is usually one or more of these controllers that fail to load.
I can't really produce a minimal example. In fact, we only see this on our target hardware and believe it is related to CPU load.
I can't repro this on my laptop, as it is much more powerful than the target hardware. I have also tried putting additional load on my laptop using 'stress', but I still couldn't repro the issue, so the cause may be something else, e.g. disk I/O.
I don't understand the error message: it indicates that the controller is already loaded, but it isn't.
Note: we religiously update our software to the latest rolling sync every month.
Anecdotally, there seems to have been some improvement when we went from ros2_control version 4.18 (~10% success rate) to 4.20 (~40-50% success rate).
This issue seems to have cropped up only in the last couple of months. A few months ago, failures were fairly rare. The error message above is also fairly new to us; previously, the occasional failures showed only the controller "process has died" message. I realize there has been a fair amount of work in the spawner area recently.
This is a very serious issue for us as it takes several tries for our SQA to start our software stack when they are testing.
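As a rough illustration of what the success rates quoted above mean in practice, the sketch below estimates the average number of launch attempts needed before the stack comes up cleanly. It assumes each attempt succeeds independently with the same probability (a geometric distribution), which may not hold on the real hardware; `expected_attempts` is a hypothetical helper name.

```python
def expected_attempts(success_rate):
    """Mean number of launches until the first success.

    Assumes independent attempts with a fixed success probability,
    so the mean of the geometric distribution is 1 / success_rate.
    """
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return 1.0 / success_rate


if __name__ == "__main__":
    # Success rates taken from the anecdotal figures in this report.
    for label, rate in [("4.18 (~10%)", 0.10), ("4.20 (~40%)", 0.40), ("4.20 (~50%)", 0.50)]:
        print(f"{label}: about {expected_attempts(rate):.1f} attempts on average")
```

Under that assumption, a ~10% success rate corresponds to about 10 attempts on average, and ~40-50% to about 2 to 2.5 attempts.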