Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions test/e2e/handler/bonding_default_interface_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -127,6 +127,7 @@ var _ = Describe("NodeNetworkConfigurationPolicy bonding default interface", fun
nodeToReboot := nodes[0]
Byf("Reboot node %s and verify that bond still has ip of primary nic", nodeToReboot)
restartNodeWithoutWaiting(nodeToReboot)
waitForNodeToStart(nodeToReboot)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The waitForNodeToStart function has a logic issue that prevents it from waiting for the node to reboot. When the node is down and unreachable, runner.RunAtNode inside waitForNodeToStart will fail. The function will then return the string "not yet". The Eventually block checks ShouldNot(Equal("up")). Since "not yet" is not equal to "up", the check passes, and waitForNodeToStart returns immediately without actually waiting for the node to be ready. This defeats the purpose of adding this wait.

The current implementation of waitForNodeToStart in test/e2e/handler/utils.go is:

func waitForNodeToStart(node string) {
	Byf("Waiting till node %s is rebooted", node)
	// It will wait till uptime -p will return up that means that node was currently rebooted and is 0 min up
	Eventually(func() string {
		output, err := runner.RunAtNode(node, "uptime", "-p")
		if err != nil {
			return "not yet"
		}
		return output
	}, 300*time.Second, 5*time.Second).ShouldNot(Equal("up"), fmt.Sprintf("Node %s failed to start after reboot", node))
}

To correctly wait for the node to be ready, the function should wait for the runner.RunAtNode command to succeed. A corrected implementation would look like this:

func waitForNodeToStart(node string) {
	Byf("Waiting till node %s is rebooted", node)
	Eventually(func() error {
		_, err := runner.RunAtNode(node, "uptime")
		return err
	}, 5*time.Minute, 5*time.Second).Should(Succeed(), fmt.Sprintf("Node %s failed to start after reboot", node))
}

Since utils.go is not part of this PR's changes, I recommend also including the fix for waitForNodeToStart in this pull request for the intended fix to be effective.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qinqon is this a valid one or yet another hallucination?


By("Wait for policy re-reconciled after node reboot")
policy.WaitForPolicyTransitionUpdate(TestPolicy)
Expand Down
1 change: 1 addition & 0 deletions test/e2e/handler/default_bridged_network_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -142,6 +142,7 @@ var _ = Describe("NodeNetworkConfigurationPolicy default bridged network", func(
nodeToReboot := nodes[0]

restartNodeWithoutWaiting(nodeToReboot)
waitForNodeToStart(nodeToReboot)

By("Wait for policy re-reconciled after node reboot")
policy.WaitForPolicyTransitionUpdate(DefaultNetwork)
Expand Down
1 change: 1 addition & 0 deletions test/e2e/handler/default_ovs_bridged_network_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -154,6 +154,7 @@ var _ = Describe("NodeNetworkConfigurationPolicy default ovs-bridged network", f

It("should keep the default IP address after node reboot", func() {
restartNodeWithoutWaiting(node)
waitForNodeToStart(node)

By("Wait for policy re-reconciled after node reboot")
policy.WaitForPolicyTransitionUpdate(ovsDefaultNetwork)
Expand Down
9 changes: 4 additions & 5 deletions test/e2e/handler/utils.go
Original file line number Diff line number Diff line change
Expand Up @@ -283,13 +283,12 @@ func restartNodeWithoutWaiting(node string) {
func waitForNodeToStart(node string) {
Byf("Waiting till node %s is rebooted", node)
// It will wait till uptime -p will return up that means that node was currently rebooted and is 0 min up
Eventually(func() string {
Eventually(func(g Gomega) string {
output, err := runner.RunAtNode(node, "uptime", "-p")
if err != nil {
return "not yet"
}
// Using g.Expect makes Eventually retry on errors directly
g.Expect(err).NotTo(HaveOccurred())
return output
}, 300*time.Second, 5*time.Second).ShouldNot(Equal("up"), fmt.Sprintf("Node %s failed to start after reboot", node))
}, 300*time.Second, 5*time.Second).Should(ContainSubstring("up"), fmt.Sprintf("Node %s failed to start after reboot", node))
}

func createDummyConnection(nodesToModify []string, dummyName string) []error {
Expand Down