@paramasivamn1
Contributor

Please review the AFTS reboot-related changes.

@gemini-code-assist
Contributor

Summary of Changes

Hello @paramasivamn1, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces comprehensive changes to the AFTS reboot test suite, primarily focusing on enhancing the reliability and clarity of network protocol configurations and device reboot verification. The modifications aim to streamline the setup of BGP and ISIS, improve the robustness of post-reboot checks, and refine how test-related data is handled, ultimately leading to more stable and accurate test results.

Highlights

  • Refactored BGP and ISIS Configuration: The BGP and ISIS configuration logic within the configureDUT function has been significantly refactored. New helper functions like createBGPNeighbor and updateNeighborMaxPrefix are introduced for BGP, and the isissession library is now utilized for ISIS setup, replacing previous cfgplugins usage.
  • Enhanced Reboot Verification: The reboot verification process has been improved. The waitForReboot function now directly monitors the DUT's boot time, and the rebootDUT function includes additional steps such as a 10-minute sleep, re-dialing the GNOI client, and watching for system uptime to ensure the device is fully reachable post-reboot.
  • Updated Constants and Types: Several constants related to reboot timing and BGP AS numbers have been updated or introduced. New types BGPNeighbor and IPFamily were added to facilitate the new BGP configuration structure.
  • Metadata and AFT Cache Adjustments: The metadata.textproto file was updated to remove the explicit_port_speed deviation. Additionally, the aft_cache.go file now ensures that AFT notifications are always written if they exist, regardless of missing or failing prefixes.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the AFTS reboot test, notably improving BGP and ISIS configuration by using dedicated helpers. However, the reboot handling logic introduces a critical issue: a hardcoded time.Sleep that violates the style guide and significantly slows down the test. The logic for waiting for the reboot is also complex and contains an incorrect context cancellation. Additionally, a new function for ensuring gNMI configuration persistence has been added but is not used, and there's a minor readability issue with type definitions. My review provides suggestions to fix these issues.

Comment on lines 419 to +443
	if _, err = gnoiClient.System().Reboot(ctxWithTimeout, rebootRequest); err != nil {
		t.Fatalf("Failed to reboot chassis with unexpected err: %v", err)
	}
	startReboot := time.Now()
	t.Logf("Reboot request sent at %v", startReboot)
	// Wait for the device to become reachable again.
	// tc.waitForReboot(t, bootTimeBeforeReboot)
	time.Sleep(10 * time.Minute)
	gnoiClient, err = tc.dut.RawAPIs().BindingDUT().DialGNOI(t.Context())
	if err != nil {
		t.Fatalf("Error dialing gNOI: %v", err)
	}
	_, ok := gnmi.Watch(t, tc.dut, gnmi.OC().System().UpTime().State(), maxRebootTime, func(val *ygnmi.Value[uint64]) bool {
		_, ok := val.Val()
		return ok
	}).Await(t)
	if !ok {
		t.Fatalf("Timeout exceeded: DUT did not reboot within %v", maxRebootTime)
	}
	bootTimeAfterReboot := gnmi.Get(t, tc.dut, gnmi.OC().System().BootTime().State())
	t.Logf("DUT boot time after reboot: %v", bootTimeAfterReboot)
	t.Logf("Device boot time after reboot: %.2f seconds.", time.Since(startReboot).Seconds())
	if bootTimeAfterReboot < bootTimeBeforeReboot {
		cancel()
	}
Contributor

critical

The reboot handling logic in this function has several issues:

  1. time.Sleep Violation: The use of time.Sleep(10 * time.Minute) on line 426 violates the style guide (rule 53-54), which states to use gnmi.Watch for waiting on conditions. This hardcoded sleep makes the test inefficient and potentially flaky.
  2. Incorrect cancel(): The cancel() call on line 442 is ineffective. The context it cancels was for the unary Reboot RPC, which has already completed. To fail the test if the boot time is incorrect, t.Fatalf should be used. The waitForReboot function already contains this logic.
  3. Unnecessary Complexity: The function re-implements logic to wait for the DUT to become reachable, which is already handled correctly by the waitForReboot function. The rebootDUT function's responsibility should be to only send the reboot command.

Please refactor this to remove the time.Sleep and the manual reachability check, and instead call tc.waitForReboot(t, bootTimeBeforeReboot). Also, consider changing t.Fatalf on the Reboot RPC error to t.Logf, as a transport error is expected when the device reboots.

Suggested change
- if _, err = gnoiClient.System().Reboot(ctxWithTimeout, rebootRequest); err != nil {
- 	t.Fatalf("Failed to reboot chassis with unexpected err: %v", err)
- }
- startReboot := time.Now()
- t.Logf("Reboot request sent at %v", startReboot)
- // Wait for the device to become reachable again.
- // tc.waitForReboot(t, bootTimeBeforeReboot)
- time.Sleep(10 * time.Minute)
- gnoiClient, err = tc.dut.RawAPIs().BindingDUT().DialGNOI(t.Context())
- if err != nil {
- 	t.Fatalf("Error dialing gNOI: %v", err)
- }
- _, ok := gnmi.Watch(t, tc.dut, gnmi.OC().System().UpTime().State(), maxRebootTime, func(val *ygnmi.Value[uint64]) bool {
- 	_, ok := val.Val()
- 	return ok
- }).Await(t)
- if !ok {
- 	t.Fatalf("Timeout exceeded: DUT did not reboot within %v", maxRebootTime)
- }
- bootTimeAfterReboot := gnmi.Get(t, tc.dut, gnmi.OC().System().BootTime().State())
- t.Logf("DUT boot time after reboot: %v", bootTimeAfterReboot)
- t.Logf("Device boot time after reboot: %.2f seconds.", time.Since(startReboot).Seconds())
- if bootTimeAfterReboot < bootTimeBeforeReboot {
- 	cancel()
- }
+ if _, err = gnoiClient.System().Reboot(ctxWithTimeout, rebootRequest); err != nil {
+ 	// This error is expected because the connection is broken on reboot.
+ 	t.Logf("Reboot request failed as expected, or with an unexpected error: %v", err)
+ }
+ // Wait for the device to become reachable again and for the boot time to update.
+ tc.waitForReboot(t, bootTimeBeforeReboot)

Comment on lines +120 to 131
	func (tc *testCase) configureToStoreRunninggNMIConfig(t *testing.T) error {
		hwProfileConfig := map[enpb.VendorId]string{
			enpb.VendorId_V_ARISTA: "management api gnmi \n transport grpc default \n operation set persistence \n",
		}
		statement.GetOrCreateActions().PolicyResult = applyPolicyType
		gnmi.Update(t, dut, gnmi.OC().RoutingPolicy().Config(), routePolicy)
		tc.dut.Config().New().
			WithAristaText(hwProfileConfig[enpb.VendorId_V_ARISTA]).
			WithCiscoText(hwProfileConfig[enpb.VendorId_V_CISCOXR]).
			WithJuniperText(hwProfileConfig[enpb.VendorId_V_JUNIPER]).
			Append(t)
		t.Logf("hwProfileConfig: %v added \n", hwProfileConfig[enpb.VendorId_V_ARISTA])
		return nil
	}
Contributor

high

The function configureToStoreRunninggNMIConfig is defined but it is not called anywhere in the test. This seems to be an oversight. If this configuration is necessary for the test to work correctly after a reboot, it should be called, probably at the beginning of configureDUT.

Additionally, the hwProfileConfig map only contains a configuration for Arista. For other vendors like Cisco and Juniper, it will pass an empty string. If this is intentional because no configuration is needed for them, please add a comment to clarify. If configurations are pending, a TODO comment would be helpful.
Example:

	hwProfileConfig := map[enpb.VendorId]string{
		enpb.VendorId_V_ARISTA: "management api gnmi \n transport grpc default \n operation set persistence \n",
		// enpb.VendorId_V_CISCOXR: "", // No equivalent config needed.
		// enpb.VendorId_V_JUNIPER: "", // No equivalent config needed.
	}
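
One way to make the missing-vendor case explicit, rather than silently passing an empty string, is to check the map lookup's second return value. The sketch below is standalone and makes assumptions: the `Vendor` type, its constants, and `configFor` are hypothetical stand-ins for the real vendor enum and test helper, and whether Cisco and Juniper truly need no equivalent config is not established in this PR.

```go
package main

import "fmt"

// Vendor is a hypothetical stand-in for the vendor enum used by the test.
type Vendor int

const (
	Arista Vendor = iota
	Cisco
	Juniper
)

// persistenceConfig lists only the vendors that need extra CLI configuration.
var persistenceConfig = map[Vendor]string{
	Arista: "management api gnmi \n transport grpc default \n operation set persistence \n",
	// Cisco, Juniper: assumed (not confirmed) to need no equivalent config.
}

// configFor returns the CLI snippet for v and whether one is defined.
func configFor(v Vendor) (string, bool) {
	cfg, ok := persistenceConfig[v]
	return cfg, ok
}

func main() {
	for _, v := range []Vendor{Arista, Cisco, Juniper} {
		if _, ok := configFor(v); ok {
			fmt.Printf("vendor %d: applying persistence config\n", v)
		} else {
			fmt.Printf("vendor %d: no config defined, skipping\n", v)
		}
	}
}
```

With this shape, a vendor that has no entry is skipped on purpose and logged, instead of receiving an empty config string.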

@coveralls

coveralls commented Jan 5, 2026

Pull Request Test Coverage Report for Build 20711325825

Details

  • 0 of 1 (0.0%) changed or added relevant line in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 10.023%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| internal/telemetry/aftcache/aft_cache.go | 0 | 1 | 0.0% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 20704703562: | 0.0% |
| Covered Lines: | 2227 |
| Relevant Lines: | 22219 |

💛 - Coveralls

@OpenConfigBot

Pull Request Functional Test Report for #4969 / f3e03a6

Virtual Devices

| Device | Test | Test Documentation | Job | Raw Log |
|---|---|---|---|---|
| Arista cEOS | status | AFT-5.1: AFTs DUT Reboot | 7222f743 | Log |
| Cisco 8000E | status | AFT-5.1: AFTs DUT Reboot | 44a41c57 | Log |
| Cisco XRd | status | AFT-5.1: AFTs DUT Reboot | 9fba04b2 | Log |
| Juniper ncPTX | status | AFT-5.1: AFTs DUT Reboot | d9eaba77 | Log |
| Nokia SR Linux | status | AFT-5.1: AFTs DUT Reboot | a54671dd | Log |
| Openconfig Lemming | status | AFT-5.1: AFTs DUT Reboot | 1582b8e5 | Log |

Hardware Devices

| Device | Test | Test Documentation | Raw Log |
|---|---|---|---|
| Arista 7808 | status | AFT-5.1: AFTs DUT Reboot | |
| Cisco 8808 | status | AFT-5.1: AFTs DUT Reboot | |
| Juniper PTX10008 | status | AFT-5.1: AFTs DUT Reboot | |
| Nokia 7250 IXR-10e | status | AFT-5.1: AFTs DUT Reboot | |

Contributor

@ElodinLaarz ElodinLaarz left a comment

I think most of these changes should not happen. Can you try to make the changes in the cfgplugin library if the test has issues with ISIS / BGP?

Comment on lines +160 to +168
	if deviations.MultipathUnsupportedNeighborOrAfisafi(dut) {
		peerGroupV4.GetOrCreateUseMultiplePaths().SetEnabled(true)
		peerGroupV6.GetOrCreateUseMultiplePaths().SetEnabled(true)
	} else {
		afiSAFI.GetOrCreateUseMultiplePaths().GetOrCreateEbgp().SetMaximumPaths(2)
		asisafi6.GetOrCreateUseMultiplePaths().GetOrCreateEbgp().SetMaximumPaths(2)
		peerGroupV4AfiSafi.GetOrCreateUseMultiplePaths().SetEnabled(true)
		peerGroupV6AfiSafi.GetOrCreateUseMultiplePaths().SetEnabled(true)
	}
Contributor

Handle the deviation in its own function and handle all vendors, explicitly. (I can't tell what vendors should have this value set or not as-is.)
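
A rough sketch of what "its own function, all vendors explicit" could look like is below. Everything in it is hypothetical: the `Vendor` values are placeholders, the vendor-to-level mapping is invented for illustration and is not taken from this PR or from the deviations package, and `multipathLevelFor` is not an existing helper.

```go
package main

import "fmt"

// Vendor is a hypothetical stand-in for the vendor enum read from the DUT.
type Vendor string

const (
	VendorA Vendor = "vendor-a"
	VendorB Vendor = "vendor-b"
	VendorC Vendor = "vendor-c"
)

// MultipathLevel says where use-multiple-paths gets enabled.
type MultipathLevel int

const (
	AtAfiSafi   MultipathLevel = iota // set under the AFI-SAFI container
	AtPeerGroup                       // set directly on the peer group
)

// multipathLevelFor names every vendor explicitly, so a reader can see which
// ones take the deviation path. Unlisted vendors are an error rather than a
// silent default. The mapping below is illustrative only.
func multipathLevelFor(v Vendor) (MultipathLevel, error) {
	switch v {
	case VendorA:
		return AtPeerGroup, nil
	case VendorB, VendorC:
		return AtAfiSafi, nil
	default:
		return 0, fmt.Errorf("no multipath handling defined for vendor %q", v)
	}
}

func main() {
	for _, v := range []Vendor{VendorA, VendorB, VendorC, "vendor-x"} {
		level, err := multipathLevelFor(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> level %d\n", v, level)
	}
}
```

Returning an error for an unlisted vendor forces a decision whenever a new platform is added, which is the visibility the comment asks for.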

Contributor Author

Sure, will modify.

"github.com/openconfig/featureprofiles/internal/cfgplugins"
"github.com/openconfig/featureprofiles/internal/deviations"
"github.com/openconfig/featureprofiles/internal/fptest"
"github.com/openconfig/featureprofiles/internal/isissession"
Contributor

This package is deprecated. You should not be using it in new changes.

Contributor Author

sure

global := isis.GetOrCreateGlobal()
global.HelloPadding = oc.Isis_HelloPaddingType_DISABLE

if deviations.ISISSingleTopologyRequired(ts.DUT) {
Contributor

Same comment here about properly handling deviations. They should be handled explicitly in their own functions and we should be able to see which vendor(s) are expected to have the value set.

Contributor Author

Will modify accordingly.

}

// configureDUT configures all the interfaces, BGP, and ISIS on the DUT.
func createBGPNeighbor(peerGrpNameV4, peerGrpNameV6 string, nbrs []*BGPNeighbor, dut *ondatra.DUTDevice) *oc.NetworkInstance_Protocol {
Contributor

Don't inline the BGP configuration. Please use or update the cfgplugins library so the config can be reused across tests.

@ElodinLaarz
Contributor

I don't think you uploaded any of the changes you made. Could you commit and upload them so that I can review again?

5 participants