reschedule the reconcile loop when DOCA PCC process dies #289
Merged
almaslennikov merged 1 commit into network-operator-26.1.x on Feb 13, 2026
We need to restart the PCC and reapply its params
Signed-off-by: Alexander Maslennikov <amaslennikov@nvidia.com>
(cherry picked from commit 9f5de2e)
Greptile Overview
Greptile Summary
This PR implements automatic recovery when the DOCA PCC (Precision Congestion Control) process terminates unexpectedly. The implementation adds a notification mechanism that triggers controller reconciliation to restart the PCC process and reapply its configuration parameters.
Key Changes:
Implementation Quality:
Confidence Score: 5/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant PCC as DOCA PCC Process
participant SM as SpectrumXManager
participant Chan as ccTerminationChan
participant Bridge as Bridge Goroutine
participant Controller as NicDeviceController
participant Reconcile as Reconcile Loop
Note over PCC,SM: PCC process starts successfully
PCC->>SM: Process running (3s startup check passed)
SM->>SM: Set startupCheckPassed = true
Note over PCC: Process dies unexpectedly
PCC->>SM: Process termination detected
SM->>SM: Check startupCheckPassed.Load()
SM->>Chan: Send RdmaInterface name
Chan->>Bridge: Receive RdmaInterface
Bridge->>Bridge: Log CC termination event
Bridge->>Controller: Send TypedGenericEvent
Controller->>Controller: Enqueue reconcile request
Controller->>Reconcile: Trigger reconciliation
Reconcile->>Reconcile: ApplyDeviceRuntimeSpec()
Reconcile->>SM: RunDocaSpcXCC()
SM->>PCC: Restart PCC process
Note over PCC,SM: PCC restarted with params reapplied
rollandf approved these changes on Feb 13, 2026
acbfe66 into network-operator-26.1.x
12 of 14 checks passed