Skip to content

Limit PFC WD Detection time to maximum value of 1000ms#4306

Open
tirupatihemanth wants to merge 3 commits intosonic-net:masterfrom
tirupatihemanth:master_limit_pfc_wd_detection_time
Open

Limit PFC WD Detection time to maximum value of 1000ms#4306
tirupatihemanth wants to merge 3 commits intosonic-net:masterfrom
tirupatihemanth:master_limit_pfc_wd_detection_time

Conversation

@tirupatihemanth
Copy link
Contributor

Fixing: sonic-net/sonic-buildimage#25033
Should be merged before: sonic-net/sonic-buildimage#25034

What I did
PFC WD Detection time is defined by the following calculation:

DEFAULT_POLL_INTERVAL * multiply

where:

multiply = max(1, (port_num-1)//DEFAULT_PORT_NUM+1)
port_num = len(list(self.config_db.get_table('PORT').keys()))
DEFAULT_POLL_INTERVAL = 200
DEFAULT_PORT_NUM = 32
There is an allowed range for this value which is between 100..3000 [ms].
For system with more than 448 ports, we will violate this range.

In addition, there is no meaning to have detection time of more than 1000[ms], hence, this change limit the maximum detection time for PFC WD to 1000[ms]

How I did it
Calculate PFC WD Detection time, if it become bigger than 1000[ms], set it to 1000[ms]

How to verify it
Run PFC WD Tests on several platforms

dprital and others added 3 commits February 19, 2026 02:23
Signed-off-by: dprital <drorp@nvidia.com>
Signed-off-by: dprital <drorp@nvidia.com>
Signed-off-by: Hemanth Kumar Tirupati <htirupati@nvidia.com>
Copilot AI review requested due to automatic review settings February 26, 2026 16:27
@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates SONiC’s PFC watchdog CLI behavior to cap the configured polling interval at 1000ms (including start_default computation and CLI validation), and extends unit tests to cover scaling behavior across different port counts.

Changes:

  • Add MAX_POLL_INTERVAL_TIME = 1000 and use it for pfcwd interval Click validation.
  • Cap the PFC_WD|GLOBAL|POLL_INTERVAL value produced by start_default to 1000ms.
  • Add/extend unit tests and expected outputs for 32-port and 512-port scenarios.
  • Update config pfcwd interval argument validation to a 1000ms max.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
pfcwd/main.py Introduces max poll-interval constant, caps start_default poll interval, updates Click range validation.
config/main.py Narrows config pfcwd interval argument range to max 1000ms and documents sync expectations.
tests/pfcwd_test.py Adds new tests to validate start_default behavior for 32-port and 512-port systems.
tests/pfcwd_input/pfcwd_test_vectors.py Adds expected outputs corresponding to the new tests (including poll interval cap).
Comments suppressed due to low confidence (1)

pfcwd/main.py:430

  • The PR title/description talk about capping detection time to 1000ms, but this change only caps the GLOBAL POLL_INTERVAL. detection_time/restoration_time still scale with multiply and can exceed 1000ms (and even 3000ms on large systems). If the intent is truly to cap detection/restoration times, apply the same cap there; otherwise update the PR title/description (and related comments/tests) to reflect that only the polling interval is capped.
        pfc_wd_poll_interval_time = DEFAULT_POLL_INTERVAL * multiply
        if pfc_wd_poll_interval_time > MAX_POLL_INTERVAL_TIME:
            pfc_wd_poll_interval_time = MAX_POLL_INTERVAL_TIME

        pfcwd_info = {
            'detection_time': DEFAULT_DETECTION_TIME * multiply,
            'restoration_time': DEFAULT_RESTORATION_TIME * multiply,
            'action': DEFAULT_ACTION,

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants