Fix testbed deadlock from fork in threads #22706

Open
suwinkumar-arista wants to merge 1 commit into sonic-net:master from suwinkumar-arista:fix-deadlock-fork

@suwinkumar-arista
Contributor

Description of PR

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • New Test case
    • Skipped for non-supported platforms
  • Test case improvement

Back port request

  • 202205
  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

Approach

What is the motivation for this PR?

Fixes a deadlock observed on a few servers where tests would hang indefinitely. The root cause was fork() being called from a worker thread via SafeThreadPoolExecutor. The child process inherits a copy of the import lock in its locked state, but only the forking thread exists in the child, so if another thread held that lock at fork time, nothing in the child can ever release it (an orphaned lock). This is more prevalent in DualToR scenarios, where shell commands run in parallel across two threads.
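The orphaned-lock behavior can be reproduced in isolation. The sketch below is not the sonic-mgmt code: it assumes a POSIX system, uses a plain threading.Lock as a stand-in for the import lock, and the function name child_sees_orphaned_lock is hypothetical. A helper thread holds the lock while the main thread forks, and the child then tries to take the lock with a timeout so the demo does not hang.

```python
import os
import threading


def child_sees_orphaned_lock() -> bool:
    """Return True if a forked child finds the lock held with no owner to release it."""
    lock = threading.Lock()        # stand-in for the import lock (assumption)
    held = threading.Event()
    release = threading.Event()

    def holder():
        # Hold the lock until the parent tells us to let go.
        with lock:
            held.set()
            release.wait()

    t = threading.Thread(target=holder)
    t.start()
    held.wait()                    # make sure the lock is held before forking

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread was copied, so the holder thread
        # does not exist here, yet the lock is still marked as held.
        got = lock.acquire(timeout=0.5)
        os._exit(0 if got else 1)  # exit immediately, skipping parent cleanup

    _, status = os.waitpid(pid, 0)
    release.set()                  # let the holder thread finish in the parent
    t.join()
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 1
```

Without the timeout, the child's acquire would block forever, which matches the indefinite hang described above.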

How did you do it?

The solution runs the collect_before_test, collect_after_test, and print_logs operations serially, so that fork() is no longer called from within a worker thread.
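As a rough illustration of the serialization idea, and not the actual diff in this PR, the per-host work can be run one host at a time in the calling thread. The names run_serially and collect_stub and the host labels are hypothetical. collect_stub only stands in for an operation such as collect_before_test by spawning a child process, which is where the fork would happen.

```python
import subprocess
import sys
import threading


def run_serially(func, hosts):
    """Call func(host) for each host in turn, in the calling thread."""
    return [func(host) for host in hosts]


def collect_stub(host):
    # Hypothetical stand-in for a collection step that shells out.
    # The child process is spawned from whichever thread calls this function.
    proc = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", host],
        capture_output=True,
        text=True,
        check=True,
    )
    # Return the command output and the id of the thread that spawned it.
    return proc.stdout.strip(), threading.get_ident()


if __name__ == "__main__":
    for output, thread_id in run_serially(collect_stub, ["upper_tor", "lower_tor"]):
        print(output, thread_id == threading.get_ident())
```

The trade-off is that the two hosts are no longer handled in parallel, in exchange for never forking while another worker thread might hold a lock.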

How did you verify/test it?

Tested on an Arista-7050CX3 running the dualtor-aa topology.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

Signed-off-by: suwinkumar-arista <suwinkumar@arista.com>
@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).
