@jaredmauch
Contributor

The stop_topology() method previously stopped routers sequentially, which could be slow for topologies with many routers. This change optimizes the shutdown process by:

  1. Sending SIGTERM to all daemons on all routers in parallel
  2. Waiting for all daemons to stop together in a single loop
  3. Force stopping any remaining daemons with SIGBUS
  4. Collecting errors from all routers after shutdown
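
The four steps above can be sketched as follows. This is a minimal illustration, not the code from this PR: the function name `stop_all_daemons`, the injected `send_signal` and `is_alive` callables, and the timeout values are all assumptions made for the example.

```python
import signal
import time
from typing import Callable, Dict, List, Tuple


def stop_all_daemons(
    daemons: Dict[str, List[int]],
    send_signal: Callable[[int, int], None],
    is_alive: Callable[[int], bool],
    timeout: float = 5.0,
    poll_interval: float = 0.1,
) -> Dict[str, List[int]]:
    """Hypothetical helper: stop every daemon on every router together.

    `daemons` maps a router name to the PIDs of its daemons. Returns a
    mapping of router name to the PIDs that ignored SIGTERM and had to be
    force stopped, so the caller can report them as errors.
    """
    # Step 1: signal everything up front instead of router by router.
    pending: List[Tuple[str, int]] = [
        (router, pid) for router, pids in daemons.items() for pid in pids
    ]
    for _, pid in pending:
        send_signal(pid, signal.SIGTERM)

    # Step 2: one shared wait loop covering all routers.
    deadline = time.monotonic() + timeout
    while True:
        pending = [(router, pid) for router, pid in pending if is_alive(pid)]
        if not pending or time.monotonic() >= deadline:
            break
        time.sleep(poll_interval)

    # Step 3: force stop whatever is left (SIGBUS is Unix-only in Python).
    forced: Dict[str, List[int]] = {}
    for router, pid in pending:
        send_signal(pid, signal.SIGBUS)
        forced.setdefault(router, []).append(pid)

    # Step 4: hand the stragglers back for per-router error collection.
    return forced
```

With this shape the total wait is bounded by a single timeout rather than one timeout per router. For plain local processes, `os.kill` could serve as `send_signal`; a real test harness would instead wrap whatever mechanism delivers signals inside each router's namespace.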

@frrbot frrbot bot added the tests Topotests, make check, etc label Nov 13, 2025
@jaredmauch jaredmauch force-pushed the test_runtime_improvements branch from 1e97b3d to 6305700 Compare November 13, 2025 11:45
@jaredmauch jaredmauch force-pushed the test_runtime_improvements branch 3 times, most recently from f3297de to 3285ba0 Compare November 13, 2025 13:47
@vayetze vayetze requested a review from choppsv1 November 18, 2025 16:25
@jaredmauch jaredmauch force-pushed the test_runtime_improvements branch from e9137b3 to 8fbfe54 Compare November 19, 2025 02:24
jaredmauch pushed a commit to jaredmauch/frr that referenced this pull request Nov 20, 2025
Add missing scope check before accessing interface pointer when handling
unknown LSAs with U-bit clear. AS-scope LSAs use process pointer, not
interface, causing incorrect flooding behavior.

Discovered during CI/CD testing of PR FRRouting#20030 in ospf6_point_to_multipoint test.

Signed-off-by: jared mauch <[email protected]>
@jaredmauch
Contributor Author

jaredmauch commented Nov 20, 2025

Test failures are due to #20081; perhaps @ton31337 will accept the improvements.

jaredmauch pushed a commit to jaredmauch/frr that referenced this pull request Nov 20, 2025
Add missing scope check before accessing interface pointer when handling
unknown LSAs with U-bit clear. AS-scope LSAs use process pointer, not
interface, causing incorrect flooding behavior.

Discovered during CI/CD testing of PR FRRouting#20030 in ospf6_point_to_multipoint test.

Signed-off-by: jared mauch <jared@debian>
jaredmauch added a commit to jaredmauch/frr that referenced this pull request Nov 20, 2025
Add missing scope check before accessing interface pointer when handling
unknown LSAs with U-bit clear. AS-scope LSAs use process pointer, not
interface, causing incorrect flooding behavior.

Discovered during CI/CD testing of PR FRRouting#20030 in ospf6_point_to_multipoint test.

Signed-off-by: jared mauch <[email protected]>
jaredmauch added a commit to jaredmauch/frr that referenced this pull request Nov 23, 2025
Add missing scope check before accessing interface pointer when handling
unknown LSAs with U-bit clear. AS-scope LSAs use process pointer, not
interface, causing incorrect flooding behavior.

Discovered during CI/CD testing of PR FRRouting#20030 in ospf6_point_to_multipoint test.

Signed-off-by: jared mauch <[email protected]>
@jaredmauch jaredmauch force-pushed the test_runtime_improvements branch from 8fbfe54 to 5d09cbb Compare November 23, 2025 20:55
@github-actions github-actions bot added size/XXL and removed size/L labels Nov 30, 2025
@jaredmauch jaredmauch force-pushed the test_runtime_improvements branch 2 times, most recently from 126f9d4 to fca0610 Compare November 30, 2025 03:24
@github-actions github-actions bot added size/L and removed size/XXL labels Nov 30, 2025
@choppsv1
Contributor

choppsv1 commented Dec 2, 2025

NOTE: working with @jaredmauch now to get this in.

@choppsv1 choppsv1 force-pushed the test_runtime_improvements branch 4 times, most recently from af24683 to 20c5b77 Compare December 3, 2025 06:22
jaredmauch and others added 2 commits December 4, 2025 15:04
The stop_topology() method previously stopped routers sequentially, which could
be slow for topologies with many routers. This change optimizes the shutdown
process by:

1) Sending SIGTERM to all daemons on all routers in parallel
2) Waiting for all daemons to stop together in a single loop
3) Force stopping any remaining daemons with SIGBUS
4) Collecting errors from all routers after shutdown

- Also make the TopoGear.stop() API consistent.

Signed-off-by: Jared Mauch <[email protected]>
Signed-off-by: Christian Hopps <[email protected]>
@choppsv1 choppsv1 force-pushed the test_runtime_improvements branch from 20c5b77 to a98610e Compare December 4, 2025 20:04