Wait for monit monitor <service> operation to complete during config reload #4295
Open
tirupatihemanth wants to merge 1 commit into sonic-net:master
Collaborator
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Contributor
Pull request overview
This PR updates the config reload/load_minigraph service restart flow in sonic-utilities to avoid leaving monit checkers in a Not monitored state by waiting for monit monitor <service> to actually take effect before reloading monit.
Changes:
- Added a polling helper to wait until a monit service is no longer reported as “Not monitored”.
- Replaced the fixed sleep(1) after monit monitor ... with explicit waits for routeCheck and container_checker.
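The polling approach described in the overview could look roughly like the sketch below. It is not the code from this PR: the function names, the timeout and interval defaults, and the reliance on the text output of `monit status <service>` are all assumptions made for illustration.

```python
import subprocess
import time


def monit_status_text(service):
    """Return whatever `monit status <service>` prints (hypothetical helper)."""
    result = subprocess.run(
        ["monit", "status", service],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def wait_until_monitored(service, get_status=monit_status_text,
                         timeout=10.0, interval=0.5, sleep=time.sleep):
    """Poll until `service` is no longer reported as "Not monitored".

    Returns True once monitoring has taken effect, False on timeout.
    Empty output (for example, monit not reachable) counts as "not yet".
    """
    deadline = time.monotonic() + timeout
    while True:
        text = get_status(service)
        if text and "Not monitored" not in text:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

A caller would issue `monit monitor <service>`, call `wait_until_monitored(...)` for each checker, and only then run `monit reload`. The status source is injectable so the loop can be exercised without a running monit daemon.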
297b718 to a249af1
reload Signed-off-by: Hemanth Kumar Tirupati <htirupati@nvidia.com>
a249af1 to 472f9ff
dgsudharsan approved these changes Feb 20, 2026
Fixes sonic-net/sonic-buildimage#25599
What I did
Wait for the monit monitor operation to complete before running monit reload.
Why I did it
There is a race condition in the implementation of config reload in SONiC.
During config reload -y -f, we ask monit to unmonitor container_checker to avoid errors as containers go down. While restarting services as part of the reload, we ask monit to monitor container_checker again using the "monit monitor container_checker" command. This is an asynchronous operation. See the code below:
https://github.com/sonic-net/sonic-utilities/blob/cbb31f0d65c6768107f2089f6c75a617d8b519b4/config/main.py#L1058C1-L1068C55
During monit reload, monit saves and restores the monitoring state. If the reload runs before the asynchronous monitor request has taken effect, container_checker is saved as not monitored and remains unmonitored from that point onwards.
How I did it
Wait until the monitor action is complete before running monit reload.
How to verify it
After config reload, monit checkers are no longer left in the Not monitored state.
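One way to automate this check is to scan the status output for services still marked as not monitored. The sketch below assumes the output consists of a header line such as `Program 'name'` followed by indented fields including `monitoring status`. The function name and the sample text are illustrative and hand-written, not captured from a device.

```python
import re

# Header lines are assumed to look like: Program 'container_checker'
HEADER = re.compile(r"^[A-Za-z][A-Za-z ]* '([^']+)'\s*$")


def unmonitored_services(status_text):
    """Return names of services whose block reports 'Not monitored'."""
    names = []
    current = None
    for line in status_text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
        elif current and "monitoring status" in line and "Not monitored" in line:
            names.append(current)
    return names


# Hand-written sample in the assumed format.
SAMPLE = """Program 'routeCheck'
  status                       Status ok
  monitoring status            Monitored

Program 'container_checker'
  status                       Not monitored
  monitoring status            Not monitored
"""

print(unmonitored_services(SAMPLE))
```

With the fix in place, running this against the status text collected after config reload should yield an empty list.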