Admin cutover improve the pod label watch function #881

Merged
spai-p9 merged 5 commits into main from private/main/ghi-877
Sep 10, 2025

Conversation

@spai-p9
Collaborator

@spai-p9 spai-p9 commented Sep 8, 2025

What this PR does / why we need it

This PR enhances pod label watch functionality with improved logging, error handling, and timeout mechanisms. It implements better handling for non-pod events, channel blockages, and adds a 24-hour timeout. The changes include graceful context cancellation handling and systematic retry mechanisms, significantly improving the reliability and observability of the pod cutover process.
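The combination of an overall deadline, graceful context cancellation, and retry described above can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the code in reporter.go: `openWatch`, `waitForLabel`, `flakyWatch`, and `silentWatch` are hypothetical names, a plain channel of strings stands in for the Kubernetes watch, and the durations are shortened from the PR's 24-hour timeout so the example finishes quickly.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// openWatch is a hypothetical stand-in for starting a watch. It returns a
// channel of label values that the server side may close at any time.
type openWatch func(ctx context.Context) (<-chan string, error)

// waitForLabel blocks until want is observed, the overall timeout expires,
// or the parent context is cancelled. A failed or closed watch is reopened.
func waitForLabel(parent context.Context, open openWatch, want string, timeout, retryDelay time.Duration) error {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()

	for {
		events, err := open(ctx)
		if err != nil {
			// Back off before retrying, but stay responsive to cancellation.
			select {
			case <-ctx.Done():
				return ctx.Err()
			case <-time.After(retryDelay):
				continue
			}
		}
	inner:
		for {
			select {
			case <-ctx.Done():
				return ctx.Err()
			case label, ok := <-events:
				if !ok {
					break inner // watch closed: go back and reopen it
				}
				if label == want {
					return nil
				}
			}
		}
	}
}

// flakyWatch fails once, then closes early once, then delivers "yes".
func flakyWatch() openWatch {
	calls := 0
	return func(ctx context.Context) (<-chan string, error) {
		calls++
		ch := make(chan string, 2)
		switch calls {
		case 1:
			return nil, errors.New("transient error")
		case 2:
			ch <- "no"
			close(ch)
		default:
			ch <- "no"
			ch <- "yes"
		}
		return ch, nil
	}
}

// silentWatch opens successfully but never delivers an event.
func silentWatch(ctx context.Context) (<-chan string, error) {
	return make(chan string), nil
}

// demo runs both scenarios: a watch that recovers, and one that times out.
func demo() (silentTimedOut bool, flakyErr error) {
	flakyErr = waitForLabel(context.Background(), flakyWatch(), "yes", time.Second, 10*time.Millisecond)
	err := waitForLabel(context.Background(), silentWatch, "yes", 50*time.Millisecond, 10*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded), flakyErr
}

func main() {
	timedOut, err := demo()
	fmt.Println("flaky watch result:", err)
	fmt.Println("silent watch timed out:", timedOut)
}
```

Deriving the deadline from the parent context means a caller's cancellation and the overall timeout are handled by the same `ctx.Done()` branch, including while the loop is sleeping between retries.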

Which issue(s) this PR fixes

fixes #877

Special notes for your reviewer

Testing done

Uploading image.png…

As you can see in the screenshot, the migration for centos6 was created at 09/09/2025, 01:28, and the time elapsed for this migration to finish was 9h 22m.
Logs of the migration:
2025/09/08 21:09:04 Admin initiated cutover detected, skipping changed blocks copy
Info: Received event for pod v2v-helper-centos-6-vj-test-7f595-4b125-595ts: type=MODIFIED
2025/09/08 21:09:04 Waiting for Admin Cutover conditions to be met
Info: Received event for pod v2v-helper-centos-6-vj-test-7f595-4b125-595ts: type=MODIFIED
2025/09/09 05:21:15 Label: yes
Info: Received event for pod v2v-helper-centos-6-vj-test-7f595-4b125-595ts: type=MODIFIED
Info: Label changed for pod v2v-helper-centos-6-vj-test-7f595-4b125-595ts: no -> yes
Info: Sent label yes for pod v2v-helper-centos-6-vj-test-7f595-4b125-595ts to channel
2025/09/09 05:21:15 Cutover conditions met
2025/09/09 05:21:15 Shutting down source VM and performing final copy

The migration waited for cutover from 21:09:04 on 8 Sept to 05:21:15 on 9 Sept, which is a little over 8 hours (8h 12m), and the admin cutover then completed successfully. This verifies that the fix works.
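The waiting time can be checked directly from the two log timestamps. The snippet below is a small sketch for that arithmetic only; `waitDuration` and `logLayout` are names introduced here for illustration, and the layout is assumed from the timestamp prefix visible in the log lines.

```go
package main

import (
	"fmt"
	"time"
)

// logLayout matches the "2025/09/08 21:09:04" prefix used in the logs above.
const logLayout = "2006/01/02 15:04:05"

// waitDuration returns the time between two log timestamps.
func waitDuration(start, end string) (time.Duration, error) {
	s, err := time.Parse(logLayout, start)
	if err != nil {
		return 0, err
	}
	e, err := time.Parse(logLayout, end)
	if err != nil {
		return 0, err
	}
	return e.Sub(s), nil
}

func main() {
	// "Waiting for Admin Cutover conditions to be met" -> "Cutover conditions met"
	d, err := waitDuration("2025/09/08 21:09:04", "2025/09/09 05:21:15")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(d) // prints 8h12m11s
}
```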

@bito-code-review
Contributor

bito-code-review Bot commented Sep 8, 2025

Code Review Agent Run #329c1b

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: a345595..957c074
    • v2v-helper/reporter/reporter.go
  • Files skipped - 0
  • Tools
    • Golangci-lint (Linter) - ✖︎ Failed
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses the Default Agent. You can customize the agent settings here or contact your Bito workspace admin at mithil@platform9.com.


Contributor

@windsurf-bot windsurf-bot Bot left a comment

Looks good to me 🤙

💡 To request another review, post a new comment with "/windsurf-review".

@bito-code-review
Contributor

bito-code-review Bot commented Sep 8, 2025

Changelist by Bito

This pull request implements the following key changes.

Key Change: Feature Improvement - Enhanced Pod Label Watch Functionality

Files Impacted: reporter.go - Refactored the WatchPodLabels function to incorporate a timeout mechanism, improved error management, enhanced logging, and restructured the event loop for better handling of pod label changes.
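To illustrate the kind of per-event handling this change list refers to (skipping non-pod events, detecting a label change, and not blocking on a full channel), here is a minimal standard-library sketch. It is not the refactored WatchPodLabels: the `pod` and `event` types, `handleEvent`, and the label key `startCutover` are assumptions made for the example.

```go
package main

import "fmt"

// pod is a hypothetical, simplified stand-in for a Kubernetes Pod object.
type pod struct {
	Name   string
	Labels map[string]string
}

// event mimics the shape of a watch event: a type plus an arbitrary object.
type event struct {
	Type   string
	Object interface{}
}

// handleEvent sends the label value on out when it differs from *last.
// It never blocks: if out cannot accept the value, *last is left unchanged
// so the next event for the pod triggers another attempt.
func handleEvent(ev event, key string, last *string, out chan<- string) bool {
	p, ok := ev.Object.(*pod)
	if !ok {
		// Non-pod payloads (for example a status object) are skipped.
		fmt.Printf("Info: ignoring non-pod event: type=%s\n", ev.Type)
		return false
	}
	fmt.Printf("Info: Received event for pod %s: type=%s\n", p.Name, ev.Type)
	current := p.Labels[key]
	if current == *last {
		return false // some other field changed, not the label
	}
	fmt.Printf("Info: Label changed for pod %s: %s -> %s\n", p.Name, *last, current)
	select {
	case out <- current:
		*last = current
		return true
	default:
		fmt.Printf("Warning: channel full, label %s not sent\n", current)
		return false
	}
}

// demo feeds three events through handleEvent and returns how many labels
// were sent and the value that arrived on the channel.
func demo() (sent int, delivered string) {
	const key = "startCutover" // assumed label key, for illustration only
	out := make(chan string, 1)
	last := "no"
	events := []event{
		{Type: "ERROR", Object: "not a pod"},
		{Type: "MODIFIED", Object: &pod{Name: "v2v-helper", Labels: map[string]string{key: "no"}}},
		{Type: "MODIFIED", Object: &pod{Name: "v2v-helper", Labels: map[string]string{key: "yes"}}},
	}
	for _, ev := range events {
		if handleEvent(ev, key, &last, out) {
			sent++
		}
	}
	return sent, <-out
}

func main() {
	sent, delivered := demo()
	fmt.Println("labels sent:", sent, "delivered:", delivered)
}
```

The `select` with a `default` branch is one way to avoid a stalled event loop when the receiver is slow; a bounded wait with a timer would be an alternative with different trade-offs.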

Contributor

@bito-code-review bito-code-review Bot left a comment

Code Review Agent Run #0bb55a

Actionable Suggestions - 1
  • v2v-helper/reporter/reporter.go - 1
    • Deferred watch.Stop() won't execute per iteration · Line 255-255
Review Details
  • Files reviewed - 1 · Commit Range: 957c074..980a003
    • v2v-helper/reporter/reporter.go
  • Files skipped - 0
  • Tools
    • Golangci-lint (Linter) - ✖︎ Failed
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
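The "deferred Stop won't execute per iteration" suggestion refers to a general Go rule: a `defer` placed inside a loop body runs when the enclosing function returns, not at the end of each iteration. The sketch below demonstrates the rule with a hypothetical `fakeWatch` type; it is not taken from reporter.go.

```go
package main

import "fmt"

// fakeWatch records when Stop is called so the ordering is visible.
type fakeWatch struct {
	id  int
	log *[]string
}

func (w fakeWatch) Stop() { *w.log = append(*w.log, fmt.Sprintf("stop %d", w.id)) }

// deferInLoop defers Stop directly in the loop body. The deferred calls run
// only when the surrounding function literal returns, so every watch stays
// open until the loop has finished.
func deferInLoop(n int) []string {
	var log []string
	func() {
		for i := 1; i <= n; i++ {
			w := fakeWatch{id: i, log: &log}
			log = append(log, fmt.Sprintf("open %d", i))
			defer w.Stop()
		}
	}()
	return log
}

// stopPerIteration wraps each iteration in its own function, so the deferred
// Stop runs before the next watch is opened.
func stopPerIteration(n int) []string {
	var log []string
	for i := 1; i <= n; i++ {
		func() {
			w := fakeWatch{id: i, log: &log}
			log = append(log, fmt.Sprintf("open %d", i))
			defer w.Stop()
		}()
	}
	return log
}

func main() {
	fmt.Println(deferInLoop(2))      // [open 1 open 2 stop 2 stop 1]
	fmt.Println(stopPerIteration(2)) // [open 1 stop 1 open 2 stop 2]
}
```

Calling `Stop()` explicitly at the end of each iteration achieves the same effect as the wrapper function when there are no early returns in the loop body.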


Comment thread v2v-helper/reporter/reporter.go
Comment on lines +245 to +253
watch, err := r.Clientset.CoreV1().Pods(r.PodNamespace).Watch(ctx, metav1.ListOptions{
	FieldSelector:  fmt.Sprintf("metadata.name=%s", r.PodName),
	TimeoutSeconds: &timeoutSeconds,
})
if err != nil {
	fmt.Printf("Error: Failed to start watch for pod %s: %v\n", r.PodName, err)
	time.Sleep(5 * time.Second)
	continue
}
Collaborator

@spai-p9 Won't this create infinite watchers?

Collaborator

It will hog resources on the pod

Collaborator Author

This creates only one watcher at a time. The inner loop is a blocking for loop that does not exit until the watch channel is closed, so only one watch exists at any given moment. The outer for loop ensures that if the channel is closed for any reason, a new watch is created and no update is missed.
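The argument in this reply can be checked with a small simulation that counts how many watches are open at once. This is only a sketch of the outer-loop/inner-loop shape being described: `tracker`, `watchLoop`, and the pre-filled channels are hypothetical stand-ins for the real watch, and the "server" closing the channel is simulated by closing it after a fixed batch of events.

```go
package main

import "fmt"

// tracker counts how many simulated watches are open at the same time.
type tracker struct {
	open, maxOpen, created int
}

// start opens a simulated watch that delivers the given events and is then
// closed, mimicking a watch that ends on the server side.
func (t *tracker) start(events []string) <-chan string {
	t.open++
	t.created++
	if t.open > t.maxOpen {
		t.maxOpen = t.open
	}
	ch := make(chan string, len(events))
	for _, e := range events {
		ch <- e
	}
	close(ch)
	return ch
}

// stop releases the current watch.
func (t *tracker) stop() { t.open-- }

// watchLoop re-creates a watch for each batch. The inner range loop only
// ends when the channel is closed or the wanted label arrives, so a new
// watch is never opened while the previous one is still being read.
func watchLoop(t *tracker, batches [][]string, want string) bool {
	for _, batch := range batches {
		ch := t.start(batch)
		found := false
		for label := range ch {
			if label == want {
				found = true
				break
			}
		}
		t.stop() // released before the outer loop opens the next watch
		if found {
			return true
		}
	}
	return false
}

// demo runs three consecutive watches; only the last one carries "yes".
func demo() (created, maxOpen int, found bool) {
	t := &tracker{}
	found = watchLoop(t, [][]string{{"no"}, {"no", "no"}, {"no", "yes"}}, "yes")
	return t.created, t.maxOpen, found
}

func main() {
	created, maxOpen, found := demo()
	fmt.Println("watches created:", created, "max open at once:", maxOpen, "found:", found)
}
```

The invariant holds only as long as each watch is stopped before the next one is started, which is why the release happens at the end of every outer iteration rather than being deferred to function exit.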

Collaborator

@OmkarDeshpande7 OmkarDeshpande7 left a comment

Looks Good.
Please add more testing details, possibly a snippet or video showing that adminCutOver was triggered after waiting for at least 7-8 hours.

@spai-p9
Collaborator Author

spai-p9 commented Sep 10, 2025

Added the logs and snippets in the issue and in the testing notes as well.

@spai-p9 spai-p9 enabled auto-merge (squash) September 10, 2025 12:20
@spai-p9 spai-p9 merged commit fa55324 into main Sep 10, 2025
12 checks passed
@spai-p9 spai-p9 deleted the private/main/ghi-877 branch September 10, 2025 12:29
@bito-code-review
Contributor

bito-code-review Bot commented Sep 10, 2025

Bito Automatic Review Failed - Technical Failure

Bito encountered technical difficulties while generating code feedback. To retry, type /review in a comment and save. If the issue persists, contact support@bito.ai and provide the following details:

Agent Run ID: 941a684c-4ccf-4fb6-9ea5-a3703219cb57

Development

Successfully merging this pull request may close these issues.

Admin cutover not getting triggered for long running migration

2 participants