Fix issue: pmon service's restart count is not cleared during config reload #4314
Open
stephenxs wants to merge 1 commit into sonic-net:master
What I did
Currently, when "config reload" is executed, the restart counts of services are cleared so that they do not reach the restart limit. The services are found by listing them with the command systemctl list-dependencies --plain .target. However, this list does not include the pmon service, or any other service that does not have WantedBy=sonic.target, so pmon's start count is not cleared.

Since the multi-ASIC PRs were merged, there is a high probability that pmon fails to restart because it reaches the start limit (3 times in 1200 seconds). During config reload, the pmon service can be started by featured or syncd. Before multi-ASIC, pmon depended on syncd. That dependency was removed by the multi-ASIC changes, so pmon can now restart immediately when triggered by sonic.target, which is one more restart. As a result, the pmon service is more likely to reach the restart limit.

How I did it
Also clear the restart count for services that have a reverse dependency on sonic.target.

How to verify it
Previous command output (if the output of a command-line utility has changed)
New command output (if the output of a command-line utility has changed)

Signed-off-by: Stephen Sun <stephens@nvidia.com>
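The approach described above can be sketched roughly as follows. This is an illustration, not the code in this PR: it assumes the target is sonic.target, that reverse dependencies are obtained with systemctl's --reverse option, and that counters are cleared with systemctl reset-failed. The function names (parse_units, list_units, units_to_reset, reset_command, reset_restart_counts) are hypothetical.

```python
import subprocess

# Assumption: sonic.target is the target whose related services are reset.
TARGET = "sonic.target"


def parse_units(output):
    """Parse `systemctl list-dependencies --plain` output into unit names.

    The first line is the queried unit itself; the remaining lines are
    indented unit names, one per line.
    """
    lines = output.strip().splitlines()
    return [line.strip() for line in lines[1:] if line.strip()]


def list_units(target, reverse=False):
    """List units the target pulls in, or (reverse=True) units that depend on it."""
    cmd = ["systemctl", "list-dependencies", "--plain"]
    if reverse:
        cmd.append("--reverse")
    cmd.append(target)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_units(result.stdout)


def units_to_reset(forward, reverse):
    """Union of both lists, de-duplicated, keeping first-seen order."""
    return list(dict.fromkeys(forward + reverse))


def reset_command(unit):
    """Command that clears the failed state and start-limit counters of a unit."""
    return ["systemctl", "reset-failed", unit]


def reset_restart_counts(target=TARGET):
    """Reset counters for forward and reverse dependencies of the target."""
    units = units_to_reset(list_units(target), list_units(target, reverse=True))
    for unit in units:
        # check=False: a unit that cannot be reset should not abort the loop.
        subprocess.run(reset_command(unit), check=False)
    return units
```

On a system with systemd, calling reset_restart_counts() would cover a service such as pmon only if it actually appears in the reverse-dependency listing of the target; the parsing helpers can be exercised without systemd by passing them captured command output.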
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
/azpw run
/AzurePipelines run
Azure Pipelines successfully started running 1 pipeline(s).
keboliu approved these changes on Mar 3, 2026