Limit PFC WD Detection time to maximum value of 1000[ms] #4188

Closed
dprital wants to merge 2 commits into sonic-net:master from dprital:master_limit_pfc_wd_detection_time

Conversation

@dprital (Collaborator) commented Jan 12, 2026

Fixing: sonic-net/sonic-buildimage#25033
Should be merged before: sonic-net/sonic-buildimage#25034

What I did

PFC WD Detection time is defined by the following calculation:

DEFAULT_POLL_INTERVAL * multiply

where:

  • multiply = max(1, (port_num-1)//DEFAULT_PORT_NUM+1)
  • port_num = len(list(self.config_db.get_table('PORT').keys()))
  • DEFAULT_POLL_INTERVAL = 200
  • DEFAULT_PORT_NUM = 32
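
The calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code from the PR: the function name `detection_time_ms` is hypothetical, and `port_num` is passed in directly instead of being read from `config_db`.

```python
DEFAULT_POLL_INTERVAL = 200  # ms, value quoted in the description
DEFAULT_PORT_NUM = 32        # value quoted in the description


def detection_time_ms(port_num):
    """Hypothetical helper: unclamped detection time for a port count."""
    # The multiplier grows by one for every started group of 32 ports
    # and never drops below 1.
    multiply = max(1, (port_num - 1) // DEFAULT_PORT_NUM + 1)
    return DEFAULT_POLL_INTERVAL * multiply
```

With these constants, 32 ports give 200 ms, 128 ports give 800 ms, and 512 ports give 3200 ms.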

The allowed range for this value is 100..3000 [ms].
With the formula above, a system with more than 480 ports gets multiply >= 16, i.e. at least 3200 [ms], which violates this range.

In addition, a detection time of more than 1000 [ms] serves no purpose. Hence, this change limits the maximum PFC WD detection time to 1000 [ms].

How I did it

Calculate the PFC WD detection time; if it becomes larger than 1000 [ms], set it to 1000 [ms].
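
A minimal sketch of the capping step described here, assuming the same constants as in the description. The constant name `MAX_DETECTION_TIME` and the function name `capped_detection_time_ms` are assumptions made for illustration; the actual change may be structured differently.

```python
DEFAULT_POLL_INTERVAL = 200  # ms, value quoted in the description
DEFAULT_PORT_NUM = 32        # value quoted in the description
MAX_DETECTION_TIME = 1000    # ms; the constant name is an assumption


def capped_detection_time_ms(port_num):
    """Hypothetical helper: detection time limited to MAX_DETECTION_TIME."""
    multiply = max(1, (port_num - 1) // DEFAULT_PORT_NUM + 1)
    # Scale with the port count, but never exceed the 1000 ms ceiling.
    return min(DEFAULT_POLL_INTERVAL * multiply, MAX_DETECTION_TIME)
```

Under these assumptions the cap changes the result only from 161 ports upward; up to 160 ports the formula already yields at most 1000 ms.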

How to verify it

Run the PFC WD tests on several platforms.

Previous command output (if the output of a command-line utility has changed)

New command output (if the output of a command-line utility has changed)

@mssonicbld (Collaborator)

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@bingwang-ms (Contributor)

@kperumalbfn Can you help review?

@bingwang-ms (Contributor)

@dprital CPU consumption is a concern. I think that's also why we calculate the interval based on the number of ports. How do we ensure the CPU has enough time to handle the Redis queries?

@dprital (Collaborator, Author) commented Jan 21, 2026

> @dprital CPU consumption is a concern. I think that's also why we calculate the interval based on the number of ports. How do we ensure CPU has enough time to handle the redis queries?

Tested on two systems (MSN2700 and MSN4600C) by running the pfc_wd sonic-mgmt tests. All tests passed.

@liat-grozovik (Collaborator) commented Feb 2, 2026

@dprital Can you please add an explanation of the logic, as well as the maximum that is set, to the code? Please also extend the code coverage so that the tests check both 32-port and 512-port systems.
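
The requested coverage could look roughly like the sketch below. It is not the sonic-utilities test code: it exercises a hypothetical stand-alone helper (`capped_detection_time_ms`) instead of mocking `config_db`, and the expected values follow only from the constants quoted in the PR description.

```python
import unittest

DEFAULT_POLL_INTERVAL = 200  # ms, value quoted in the description
DEFAULT_PORT_NUM = 32        # value quoted in the description
MAX_DETECTION_TIME = 1000    # ms; the constant name is an assumption


def capped_detection_time_ms(port_num):
    """Hypothetical helper standing in for the logic under test."""
    multiply = max(1, (port_num - 1) // DEFAULT_PORT_NUM + 1)
    return min(DEFAULT_POLL_INTERVAL * multiply, MAX_DETECTION_TIME)


class DetectionTimeTest(unittest.TestCase):
    def test_small_and_large_systems(self):
        # port count -> expected detection time in ms
        cases = {32: 200, 512: 1000}
        for port_num, expected in cases.items():
            with self.subTest(port_num=port_num):
                self.assertEqual(capped_detection_time_ms(port_num), expected)
```

The 32-port case checks that small systems keep the base value, and the 512-port case checks that the cap is applied.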

@stephenxs stephenxs self-requested a review February 3, 2026 05:40
@dprital dprital force-pushed the master_limit_pfc_wd_detection_time branch from 55fee42 to 64fda1a Compare February 3, 2026 10:24
@dprital dprital marked this pull request as draft February 3, 2026 10:30
@dprital dprital force-pushed the master_limit_pfc_wd_detection_time branch from 64fda1a to beeccb6 Compare February 5, 2026 00:20
@dprital dprital force-pushed the master_limit_pfc_wd_detection_time branch from beeccb6 to cda4aac Compare February 5, 2026 07:24
Signed-off-by: dprital <drorp@nvidia.com>
@dprital dprital force-pushed the master_limit_pfc_wd_detection_time branch from cda4aac to 18e7130 Compare February 19, 2026 00:24
@dprital dprital force-pushed the master_limit_pfc_wd_detection_time branch from 6ec96e2 to 449c045 Compare February 19, 2026 13:05
@dprital dprital force-pushed the master_limit_pfc_wd_detection_time branch from 449c045 to 6655b4f Compare February 19, 2026 15:26
@dprital dprital force-pushed the master_limit_pfc_wd_detection_time branch from 6655b4f to 932c595 Compare February 19, 2026 15:42
Signed-off-by: dprital <drorp@nvidia.com>
@dprital dprital force-pushed the master_limit_pfc_wd_detection_time branch from 932c595 to 0a07430 Compare February 19, 2026 15:49
@dprital dprital marked this pull request as ready for review February 20, 2026 14:24
@dgsudharsan (Collaborator)

Closing this in favor of #4306
