-
Notifications
You must be signed in to change notification settings - Fork 197
AFTS reboot changes #4969
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
AFTS reboot changes #4969
Conversation
Summary of ChangesHello @paramasivamn1, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces comprehensive changes to the AFTS reboot test suite, primarily focusing on enhancing the reliability and clarity of network protocol configurations and device reboot verification. The modifications aim to streamline the setup of BGP and ISIS, improve the robustness of post-reboot checks, and refine how test-related data is handled, ultimately leading to more stable and accurate test results. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors the AFTS reboot test, notably improving BGP and ISIS configuration by using dedicated helpers. However, the reboot handling logic introduces a critical issue: a hardcoded time.Sleep that violates the style guide and significantly slows down the test. The logic for waiting for the reboot is also complex and contains an incorrect context cancellation. Additionally, a new function for ensuring gNMI configuration persistence has been added but is not used, and there's a minor readability issue with type definitions. My review provides suggestions to fix these issues.
| if _, err = gnoiClient.System().Reboot(ctxWithTimeout, rebootRequest); err != nil { | ||
| t.Fatalf("Failed to reboot chassis with unexpected err: %v", err) | ||
| } | ||
| startReboot := time.Now() | ||
| t.Logf("Reboot request sent at %v", startReboot) | ||
| // Wait for the device to become reachable again. | ||
| // tc.waitForReboot(t, bootTimeBeforeReboot) | ||
| time.Sleep(10 * time.Minute) | ||
| gnoiClient, err = tc.dut.RawAPIs().BindingDUT().DialGNOI(t.Context()) | ||
| if err != nil { | ||
| t.Fatalf("Error dialing gNOI: %v", err) | ||
| } | ||
| _, ok := gnmi.Watch(t, tc.dut, gnmi.OC().System().UpTime().State(), maxRebootTime, func(val *ygnmi.Value[uint64]) bool { | ||
| _, ok := val.Val() | ||
| return ok | ||
| }).Await(t) | ||
| if !ok { | ||
| t.Fatalf("Timeout exceeded: DUT did not reboot within %v", maxRebootTime) | ||
| } | ||
| bootTimeAfterReboot := gnmi.Get(t, tc.dut, gnmi.OC().System().BootTime().State()) | ||
| t.Logf("DUT boot time after reboot: %v", bootTimeAfterReboot) | ||
| t.Logf("Device boot time after reboot: %.2f seconds.", time.Since(startReboot).Seconds()) | ||
| if bootTimeAfterReboot < bootTimeBeforeReboot { | ||
| cancel() | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reboot handling logic in this function has several issues:
time.SleepViolation: The use oftime.Sleep(10 * time.Minute)on line 426 violates the style guide (rule 53-54), which states to usegnmi.Watchfor waiting on conditions. This hardcoded sleep makes the test inefficient and potentially flaky.- Incorrect
cancel(): Thecancel()call on line 442 is ineffective. The context it cancels was for the unaryRebootRPC, which has already completed. To fail the test if the boot time is incorrect,t.Fatalfshould be used. ThewaitForRebootfunction already contains this logic. - Unnecessary Complexity: The function re-implements logic to wait for the DUT to become reachable, which is already handled correctly by the
waitForRebootfunction. TherebootDUTfunction's responsibility should be to only send the reboot command.
Please refactor this to remove the time.Sleep and the manual reachability check, and instead call tc.waitForReboot(t, bootTimeBeforeReboot). Also, consider changing t.Fatalf on the Reboot RPC error to t.Logf, as a transport error is expected when the device reboots.
| if _, err = gnoiClient.System().Reboot(ctxWithTimeout, rebootRequest); err != nil { | |
| t.Fatalf("Failed to reboot chassis with unexpected err: %v", err) | |
| } | |
| startReboot := time.Now() | |
| t.Logf("Reboot request sent at %v", startReboot) | |
| // Wait for the device to become reachable again. | |
| // tc.waitForReboot(t, bootTimeBeforeReboot) | |
| time.Sleep(10 * time.Minute) | |
| gnoiClient, err = tc.dut.RawAPIs().BindingDUT().DialGNOI(t.Context()) | |
| if err != nil { | |
| t.Fatalf("Error dialing gNOI: %v", err) | |
| } | |
| _, ok := gnmi.Watch(t, tc.dut, gnmi.OC().System().UpTime().State(), maxRebootTime, func(val *ygnmi.Value[uint64]) bool { | |
| _, ok := val.Val() | |
| return ok | |
| }).Await(t) | |
| if !ok { | |
| t.Fatalf("Timeout exceeded: DUT did not reboot within %v", maxRebootTime) | |
| } | |
| bootTimeAfterReboot := gnmi.Get(t, tc.dut, gnmi.OC().System().BootTime().State()) | |
| t.Logf("DUT boot time after reboot: %v", bootTimeAfterReboot) | |
| t.Logf("Device boot time after reboot: %.2f seconds.", time.Since(startReboot).Seconds()) | |
| if bootTimeAfterReboot < bootTimeBeforeReboot { | |
| cancel() | |
| } | |
| if _, err = gnoiClient.System().Reboot(ctxWithTimeout, rebootRequest); err != nil { | |
| // This error is expected because the connection is broken on reboot. | |
| t.Logf("Reboot request failed as expected, or with an unexpected error: %v", err) | |
| } | |
| // Wait for the device to become reachable again and for the boot time to update. | |
| tc.waitForReboot(t, bootTimeBeforeReboot) |
| func (tc *testCase) configureToStoreRunninggNMIConfig(t *testing.T) error { | ||
| hwProfileConfig := map[enpb.VendorId]string{ | ||
| enpb.VendorId_V_ARISTA: "management api gnmi \n transport grpc default \n operation set persistence \n", | ||
| } | ||
| statement.GetOrCreateActions().PolicyResult = applyPolicyType | ||
| gnmi.Update(t, dut, gnmi.OC().RoutingPolicy().Config(), routePolicy) | ||
| tc.dut.Config().New(). | ||
| WithAristaText(hwProfileConfig[enpb.VendorId_V_ARISTA]). | ||
| WithCiscoText(hwProfileConfig[enpb.VendorId_V_CISCOXR]). | ||
| WithJuniperText(hwProfileConfig[enpb.VendorId_V_JUNIPER]). | ||
| Append(t) | ||
| t.Logf("hwProfileConfig: %v added \n", hwProfileConfig[enpb.VendorId_V_ARISTA]) | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function configureToStoreRunninggNMIConfig is defined but it is not called anywhere in the test. This seems to be an oversight. If this configuration is necessary for the test to work correctly after a reboot, it should be called, probably at the beginning of configureDUT.
Additionally, the hwProfileConfig map only contains a configuration for Arista. For other vendors like Cisco and Juniper, it will pass an empty string. If this is intentional because no configuration is needed for them, please add a comment to clarify. If configurations are pending, a TODO comment would be helpful.
Example:
hwProfileConfig := map[enpb.VendorId]string{
enpb.VendorId_V_ARISTA: "management api gnmi \n transport grpc default \n operation set persistence \n",
// enpb.VendorId_V_CISCOXR: "", // No equivalent config needed.
// enpb.VendorId_V_JUNIPER: "", // No equivalent config needed.
}
Pull Request Test Coverage Report for Build 20711325825Details
💛 - Coveralls |
Pull Request Functional Test Report for #4969 / f3e03a6Virtual Devices
Hardware Devices
|
ElodinLaarz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think most of these changes should not happen. Can you try to make the changes in the cfgplugin library if the test has issues with ISIS / BGP?
| if deviations.MultipathUnsupportedNeighborOrAfisafi(dut) { | ||
| peerGroupV4.GetOrCreateUseMultiplePaths().SetEnabled(true) | ||
| peerGroupV6.GetOrCreateUseMultiplePaths().SetEnabled(true) | ||
| } else { | ||
| afiSAFI.GetOrCreateUseMultiplePaths().GetOrCreateEbgp().SetMaximumPaths(2) | ||
| asisafi6.GetOrCreateUseMultiplePaths().GetOrCreateEbgp().SetMaximumPaths(2) | ||
| peerGroupV4AfiSafi.GetOrCreateUseMultiplePaths().SetEnabled(true) | ||
| peerGroupV6AfiSafi.GetOrCreateUseMultiplePaths().SetEnabled(true) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle the deviation in its own function and handle all vendors, explicitly. (I can't tell what vendors should have this value set or not as-is.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure. will modify.
| "github.com/openconfig/featureprofiles/internal/cfgplugins" | ||
| "github.com/openconfig/featureprofiles/internal/deviations" | ||
| "github.com/openconfig/featureprofiles/internal/fptest" | ||
| "github.com/openconfig/featureprofiles/internal/isissession" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This package is deprecated. You should not be using it in new changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
| global := isis.GetOrCreateGlobal() | ||
| global.HelloPadding = oc.Isis_HelloPaddingType_DISABLE | ||
|
|
||
| if deviations.ISISSingleTopologyRequired(ts.DUT) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here about properly handling deviations. They should be handled explicitly in their own functions and we should be able to see which vendor(s) are expected to have the value set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modify accordingly.
| } | ||
|
|
||
| // configureDUT configures all the interfaces, BGP, and ISIS on the DUT. | ||
| func createBGPNeighbor(peerGrpNameV4, peerGrpNameV6 string, nbrs []*BGPNeighbor, dut *ondatra.DUTDevice) *oc.NetworkInstance_Protocol { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't in-line the BGP configuration-- please use / update the cfgplugins library so the config can be used across tests.
|
I don't think you uploaded any changes you made? Could you commit and upload before I can review it again? |
Pease review AFTS reboot related changes.