Skip to content

Migration dest vm cleanup refactor#6766

Open
hholoubk wants to merge 1 commit intoautotest:masterfrom
hholoubk:migration_dest_vm_cleanup_refactor
Open

Migration dest vm cleanup refactor#6766
hholoubk wants to merge 1 commit intoautotest:masterfrom
hholoubk:migration_dest_vm_cleanup_refactor

Conversation

@hholoubk
Copy link
Copy Markdown
Contributor

@hholoubk hholoubk commented Jan 15, 2026

Depends on: avocado-framework/avocado-vt#4331

Summary by CodeRabbit

  • New Features

    • Optional destination VM backup before live migration, with automatic restoration after migration and conditional handling for newer libvirt metadata.
    • New configuration option to enable destination VM backup during migration.
  • Tests

    • Migration tests now establish remote sessions, exercise the destination VM backup/restore flow, and ensure proper session teardown.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jan 15, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds a top-level config key do_destination_vm_backup. Introduces setup_remote_virsh_session(params) and changes check_iommu_xml to accept an external remote_virsh session which the migration test now creates and closes. Extends MigrationBase.__init__ to accept remote_virsh and adds server_ip, server_user, server_pwd, do_destination_vm_backup, and dest_backup_xml attributes. Implements optional pre-migration backup and conditional undefine of a destination VM, and post-test restoration (SCP of XML and remote re-define). Cleanup now closes remote_virsh and restores backups when present.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Migration dest vm cleanup refactor' accurately describes the main objective of the PR, which refactors cleanup logic for destination VM handling during migration as evidenced by the changes to migration_iommu_device files and base_steps.py.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@hholoubk hholoubk force-pushed the migration_dest_vm_cleanup_refactor branch from 206b810 to 8d86f36 Compare January 16, 2026 10:47
@hholoubk hholoubk force-pushed the migration_dest_vm_cleanup_refactor branch 2 times, most recently from 3a5e195 to 4bc95aa Compare February 9, 2026 13:39
@hholoubk hholoubk marked this pull request as ready for review February 17, 2026 11:53
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
libvirt/tests/src/sriov/vIOMMU/migration_iommu_device.py (1)

16-34: Clean helper for remote virsh session setup.

The function correctly creates a VirshPersistent session with the standard connection parameters. Consider whether this helper could be promoted to a shared utility (e.g., in base_steps.py) since other migration tests may need the same pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libvirt/tests/src/sriov/vIOMMU/migration_iommu_device.py` around lines 16 -
34, The helper setup_remote_virsh_session creates a VirshPersistent using
server_ip/server_user/server_pwd and should be extracted to a shared utility so
other migration tests can reuse it; move the function into a common module
(e.g., base_steps.py or a new virsh_helpers module), keep the same signature
(params dict) and behavior, update callers to import and call
setup_remote_virsh_session from the shared location, and ensure the returned
type remains virsh.VirshPersistent and any tests importing the old local
function are updated to reference the new shared function.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libvirt/tests/src/sriov/vIOMMU/migration_iommu_device.py`:
- Around line 86-89: remote_virsh created by setup_remote_virsh_session is
allocated before calling MigrationBase(test, vm, params,
remote_virsh=remote_virsh) so if MigrationBase.__init__ (e.g.,
_backup_and_undefine_destination_vm_if_needed) raises the remote session can
leak; fix by moving setup_remote_virsh_session(params) inside the same try that
eventually calls migration_obj and ensure remote_virsh.close_session() is called
in the finally (or wrap both creation and MigrationBase(...) in a try/finally),
referencing setup_remote_virsh_session, remote_virsh, MigrationBase and
close_session() so the session is always closed even if MigrationBase.__init__
fails.
- Around line 104-109: Remove the duplicate cleanup_default() invocation in the
finally block: migration_obj.cleanup_connection() already calls
migration_obj.cleanup_default(), so delete the explicit
migration_obj.cleanup_default() call in migration_iommu_device.py to avoid
running migration_test.cleanup_vm(), orig_config_xml.sync(), and any restore of
dest_backup_xml twice; keep migration_obj.cleanup_connection(),
test_obj.teardown_iommu_test(), and remote_virsh.close_session() as-is.

In `@provider/migration/base_steps.py`:
- Around line 111-115: The code uses libvirt_version.version_compare(7, 5, 0)
(local virsh.version()) but calls self.remote_virsh.undefine(vm_name,
options=undef_opts), causing mismatched option usage and no error handling; fix
by querying the remote libvirt version via
self.remote_virsh.version()/remote_virsh.version() and base the
"--checkpoints-metadata" append on that result, then wrap the call to
self.remote_virsh.undefine(...) in a try/except to catch failures (or catch
unsupported-option errors) and log/process the exception and non-success return
so an undefined VM failure surfaces immediately instead of failing later. Ensure
references to libvirt_version.version_compare,
self.remote_virsh.version()/self.remote_virsh.undefine, and vm_name are used to
locate and update the logic.

---

Nitpick comments:
In `@libvirt/tests/src/sriov/vIOMMU/migration_iommu_device.py`:
- Around line 16-34: The helper setup_remote_virsh_session creates a
VirshPersistent using server_ip/server_user/server_pwd and should be extracted
to a shared utility so other migration tests can reuse it; move the function
into a common module (e.g., base_steps.py or a new virsh_helpers module), keep
the same signature (params dict) and behavior, update callers to import and call
setup_remote_virsh_session from the shared location, and ensure the returned
type remains virsh.VirshPersistent and any tests importing the old local
function are updated to reference the new shared function.

Comment thread libvirt/tests/src/sriov/vIOMMU/migration_iommu_device.py Outdated
Comment thread libvirt/tests/src/sriov/vIOMMU/migration_iommu_device.py
Comment on lines +111 to +115
undef_opts = "--managed-save --snapshots-metadata --nvram"
if libvirt_version.version_compare(7, 5, 0):
undef_opts += " --checkpoints-metadata"

self.remote_virsh.undefine(vm_name, options=undef_opts)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Find libvirt_version module and understand version_compare
find . -type f -name "*.py" | xargs grep -l "def version_compare" | head -5

Repository: autotest/tp-libvirt

Length of output: 92


🏁 Script executed:

# Search for libvirt_version import and usage patterns
rg "libvirt_version\.version_compare" --type py -B2 -A2 | head -40

Repository: autotest/tp-libvirt

Length of output: 3811


🏁 Script executed:

# Find remote_virsh definition and how it works
rg "remote_virsh\s*=" --type py -B3 -A3 | head -50

Repository: autotest/tp-libvirt

Length of output: 3886


🏁 Script executed:

# Look for error handling patterns on virsh.undefine calls in the codebase
rg "\.undefine\(" --type py -B1 -A2 | head -50

Repository: autotest/tp-libvirt

Length of output: 3796


🏁 Script executed:

cat -n ./provider/libvirt_version.py | head -100

Repository: autotest/tp-libvirt

Length of output: 2179


🏁 Script executed:

# Now let's check the full context of the method in base_steps.py
sed -n '78,120p' provider/migration/base_steps.py

Repository: autotest/tp-libvirt

Length of output: 2037


🏁 Script executed:

# Let's check how virsh.version() is called and what remote_virsh type is
rg "class.*VirshPersistent\|def version" provider/migration/base_steps.py -B2 -A5

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Check virsh module to understand VirshPersistent and how version() works
rg "class VirshPersistent\|def version" virttest/ -B2 -A10 | head -80

Repository: autotest/tp-libvirt

Length of output: 112


🏁 Script executed:

# Let's search for any remote version checking patterns in the codebase
rg "remote.*version_compare\|remote_virsh.*version" --type py -C3

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Check if there are other examples of checking remote libvirt version
rg "remote_virsh\." provider/ --type py -B2 -A2 | grep -E "(version|libvirt_version)" -B2 -A2 | head -30

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Look for virsh.py or virsh module in the repository
find . -type f -name "virsh.py" | head -5

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Look for examples of getting remote version through remote_virsh
rg "remote_virsh.*version\(\)" --type py -B2 -A2

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Check if virsh.version() can be called on remote_virsh instance
rg "def version" --type py -A10 | head -50

Repository: autotest/tp-libvirt

Length of output: 865


🏁 Script executed:

# Search for how other remote operations handle version-specific options
rg "remote_virsh\.(define|create|start|migrate)" --type py -B3 -A3 | head -80

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Search for imports and understand what remote_virsh is - look at the imports in base_steps.py
head -30 provider/migration/base_steps.py

Repository: autotest/tp-libvirt

Length of output: 815


🏁 Script executed:

# Check if VirshPersistent has a version method or version command
rg "VirshPersistent\|class Virsh" --type py -A20 | head -100

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Look at how remote_virsh is instantiated and used in other files
cat v2v/tests/src/convert_vm_to_libvirt.py | grep -A15 "remote_virsh = virsh"

Repository: autotest/tp-libvirt

Length of output: 652


🏁 Script executed:

# Search for how Virsh class is structured - look for its methods
find . -path ./libvirt -prune -o -type f -name "*.py" -exec grep -l "class Virsh\|class VirshPersistent" {} \;

Repository: autotest/tp-libvirt

Length of output: 113


🏁 Script executed:

# Let's understand if virsh.version() is available on remote_virsh objects
rg "def version\(" --type py -A5 | head -60

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Check if version_compare is ever passed a virsh instance parameter
rg "version_compare\(" --type py -B1 -A1 | head -80

Repository: autotest/tp-libvirt

Length of output: 5927


🏁 Script executed:

# Let me look at how the migration test uses version comparisons with remote operations
rg "remote_virsh.*define\|remote_virsh.*create" provider/ --type py -B5 -A5 | head -80

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Let's verify if VirshPersistent can call version() on remote
rg "class VirshPersistent" --type py -A50 | head -80

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Check if there are other similar cases where version_compare is used with remote operations
rg "remote_virsh.*undefine\|remote_virsh.*define\|remote_virsh.*create" provider/ --type py -B10 -A2 | head -100

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Look for how error checking is done on remote_virsh operations elsewhere
rg "remote_virsh\.\w+.*exit_status\|remote_virsh\.\w+.*ignore_status" provider/ --type py -B2 -A2

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Let's look at the imports and understand what virsh module is being used
rg "from virttest import virsh\|import virsh" provider/migration/base_steps.py -A5

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Let's check what methods remote_virsh (VirshPersistent) can call
rg "remote_virsh\." provider/migration/base_steps.py | head -20

Repository: autotest/tp-libvirt

Length of output: 258


🏁 Script executed:

# Let's see if there's documentation or examples of getting remote version
rg "Virsh\|version\(\)" --type py | grep -E "(def version|remote.*version)" | head -20

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Let's examine similar migration-related code that handles remote operations
grep -r "do_destination_vm_backup\|destination.*undefine" provider/ --type py -B5 -A5

Repository: autotest/tp-libvirt

Length of output: 180


🏁 Script executed:

# Let's look at the actual Virsh/VirshPersistent implementation to understand what methods are available
find . -type f -name "*.py" -exec grep -l "class Virsh\|class VirshPersistent" {} \;

Repository: autotest/tp-libvirt

Length of output: 154


🏁 Script executed:

# Check if there's any code that gets remote version
rg "remote.*version\|version.*remote" provider/ --type py -B3 -A3

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Let's see the complete implementation of remote_virsh in the method
sed -n '40,120p' provider/migration/base_steps.py

Repository: autotest/tp-libvirt

Length of output: 3548


🏁 Script executed:

# Check what happens when undefine fails - does it throw an exception or return a result object?
rg "def undefine" --type py -A15 | head -50

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Let's look at the Virsh class definition to understand what methods are available
cat provider/virsh_cmd_check/virsh_cmd_check_base.py | head -100

Repository: autotest/tp-libvirt

Length of output: 3909


🏁 Script executed:

# Search for how snapshot_delete is used and if it's similar
rg "snapshot_delete\|snapshot.*undefine" provider/ libvirt/tests/ --type py -B2 -A3 | head -40

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Let's check if remote_virsh can access version() method
rg "\.version\(\)" --type py -B3 -A3 | head -60

Repository: autotest/tp-libvirt

Length of output: 541


🏁 Script executed:

# Look at test files to understand if there's a pattern for handling version-specific remote commands
rg "remote_virsh.*options\|remote_virsh.*define" libvirt/tests/src/virsh_cmd/domain/ --type py -B3 -A3 | head -80

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Let's verify the learning context - check snapshot_delete usage
rg "snapshot_delete" libvirt/tests/src/virtual_disks/virtual_disks_io_error.py -B3 -A3

Repository: autotest/tp-libvirt

Length of output: 311


🏁 Script executed:

# Let's check if there's any precedent for handling remote version differences with try/except
rg "try:\s*remote_virsh\|try:\s*self.remote_virsh" provider/ --type py -B2 -A8 | head -60

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Final check: Look for any places where remote version is actually queried
rg "remote.*virsh.*version\|virsh.*Virsh.*version" --type py -B3 -A3 | head -40

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Let's check if undefine returns a result object and can be checked
rg "\.undefine\(" provider/migration/base_steps.py -B2 -A5

Repository: autotest/tp-libvirt

Length of output: 272


🏁 Script executed:

# Check if there's error checking on other remote_virsh calls in the file
rg "remote_virsh\.\w+\(" provider/migration/base_steps.py -A2 | head -40

Repository: autotest/tp-libvirt

Length of output: 501


🏁 Script executed:

# Let's verify if virsh commands can be called with try/except pattern for unsupported options
rg "try:\|except.*error\|except.*Exit" libvirt/tests/src/virsh_cmd/domain/ --type py -B2 -A2 | head -80

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Let's check if libvirt_version has any functions that can work with remote sessions
cat provider/libvirt_version.py

Repository: autotest/tp-libvirt

Length of output: 1815


🏁 Script executed:

# Check if there are migration tests that handle version compatibility
rg "migrate.*version\|version.*7.*5.*0" libvirt/tests/src/virsh_cmd/domain/ --type py -B3 -A3 | head -60

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Let's check if checkpoints-metadata flag is used elsewhere and how errors are handled
rg "checkpoints-metadata" --type py -B5 -A5

Repository: autotest/tp-libvirt

Length of output: 960


Version check uses local libvirt, but undefine runs on the remote host.

libvirt_version.version_compare(7, 5, 0) queries the local host's libvirt version (via virsh.version()), yet the undefine command executes on the remote host via self.remote_virsh. If the local and remote hosts run different libvirt versions, this creates a mismatch: passing --checkpoints-metadata to a remote host with libvirt < 7.5 will fail with an unsupported-option error, and omitting it on a remote host with >= 7.5 misses required cleanup.

Additionally, the undefine result is not checked — if it fails silently, the destination VM remains undefined, causing migration to fail later with a confusing error.

Consider wrapping the undefine call in a try/except to gracefully handle unsupported options, or query the remote libvirt version first using the remote virsh instance.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@provider/migration/base_steps.py` around lines 111 - 115, The code uses
libvirt_version.version_compare(7, 5, 0) (local virsh.version()) but calls
self.remote_virsh.undefine(vm_name, options=undef_opts), causing mismatched
option usage and no error handling; fix by querying the remote libvirt version
via self.remote_virsh.version()/remote_virsh.version() and base the
"--checkpoints-metadata" append on that result, then wrap the call to
self.remote_virsh.undefine(...) in a try/except to catch failures (or catch
unsupported-option errors) and log/process the exception and non-success return
so an undefined VM failure surfaces immediately instead of failing later. Ensure
references to libvirt_version.version_compare,
self.remote_virsh.version()/self.remote_virsh.undefine, and vm_name are used to
locate and update the logic.

@hholoubk
Copy link
Copy Markdown
Contributor Author

before PR:
(1/1) type_specific.io-github-autotest-libvirt.vIOMMU.migration.iommu_device.virtio_muti_devices.vhost_on.virtio: STARTED
(1/1) type_specific.io-github-autotest-libvirt.vIOMMU.migration.iommu_device.virtio_muti_devices.vhost_on.virtio: FAIL: error: operation failed: domain 'avocado-vt-vm1' already exists with uuid f49410a9-3a9e-42b5-8d07-1f78a62e7da0 (81.99 s)

after:

(1/1) type_specific.io-github-autotest-libvirt.vIOMMU.migration.iommu_device.virtio_muti_devices.vhost_on.virtio: STARTED
(1/1) type_specific.io-github-autotest-libvirt.vIOMMU.migration.iommu_device.virtio_muti_devices.vhost_on.virtio: PASS (242.71 s)

@hholoubk
Copy link
Copy Markdown
Contributor Author

@coderabbitai configuration

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Feb 24, 2026

Configuration used: Organization UI

CodeRabbit configuration
language: en-US
tone_instructions: ''
early_access: false
enable_free_tier: true
inheritance: false
reviews:
  profile: chill
  request_changes_workflow: false
  high_level_summary: true
  high_level_summary_instructions: ''
  high_level_summary_placeholder: '@coderabbitai summary'
  high_level_summary_in_walkthrough: false
  auto_title_placeholder: '@coderabbitai'
  auto_title_instructions: ''
  review_status: true
  review_details: false
  commit_status: true
  fail_commit_status: false
  collapse_walkthrough: false
  changed_files_summary: false
  sequence_diagrams: false
  estimate_code_review_effort: true
  assess_linked_issues: true
  related_issues: false
  related_prs: false
  suggested_labels: false
  labeling_instructions: []
  auto_apply_labels: false
  suggested_reviewers: false
  auto_assign_reviewers: false
  in_progress_fortune: true
  poem: false
  enable_prompt_for_ai_agents: true
  path_filters: []
  path_instructions: []
  abort_on_close: true
  disable_cache: false
  auto_review:
    enabled: true
    description_keyword: ''
    auto_incremental_review: true
    auto_pause_after_reviewed_commits: 5
    ignore_title_keywords: []
    labels: []
    drafts: false
    base_branches: []
    ignore_usernames: []
  finishing_touches:
    docstrings:
      enabled: true
    unit_tests:
      enabled: true
    custom: []
  pre_merge_checks:
    docstrings:
      mode: warning
      threshold: 80
    title:
      mode: warning
      requirements: ''
    description:
      mode: warning
    issue_assessment:
      mode: warning
    custom_checks: []
  tools:
    ast-grep:
      rule_dirs: []
      util_dirs: []
      essential_rules: true
      packages: []
    shellcheck:
      enabled: true
    ruff:
      enabled: true
    markdownlint:
      enabled: true
    github-checks:
      enabled: true
      timeout_ms: 90000
    languagetool:
      enabled: true
      enabled_rules: []
      disabled_rules: []
      enabled_categories: []
      disabled_categories: []
      enabled_only: false
      level: default
    biome:
      enabled: true
    hadolint:
      enabled: true
    swiftlint:
      enabled: true
    phpstan:
      enabled: true
      level: default
    phpmd:
      enabled: true
    phpcs:
      enabled: true
    golangci-lint:
      enabled: true
    yamllint:
      enabled: true
    gitleaks:
      enabled: true
    trufflehog:
      enabled: true
    checkov:
      enabled: true
    tflint:
      enabled: true
    detekt:
      enabled: true
    eslint:
      enabled: true
    flake8:
      enabled: true
    fortitudeLint:
      enabled: true
    rubocop:
      enabled: true
    buf:
      enabled: true
    regal:
      enabled: true
    actionlint:
      enabled: true
    pmd:
      enabled: true
    clang:
      enabled: true
    cppcheck:
      enabled: true
    opengrep:
      enabled: true
    semgrep:
      enabled: true
    circleci:
      enabled: true
    clippy:
      enabled: true
    sqlfluff:
      enabled: true
    trivy:
      enabled: true
    prismaLint:
      enabled: true
    pylint:
      enabled: true
    oxc:
      enabled: true
    shopifyThemeCheck:
      enabled: true
    luacheck:
      enabled: true
    brakeman:
      enabled: true
    dotenvLint:
      enabled: true
    htmlhint:
      enabled: true
    stylelint:
      enabled: true
    checkmake:
      enabled: true
    osvScanner:
      enabled: true
    blinter:
      enabled: true
    psscriptanalyzer:
      enabled: true
chat:
  art: true
  auto_reply: true
  integrations:
    jira:
      usage: auto
    linear:
      usage: auto
knowledge_base:
  opt_out: false
  web_search:
    enabled: true
  code_guidelines:
    enabled: true
    filePatterns: []
  learnings:
    scope: auto
  issues:
    scope: auto
  jira:
    usage: auto
    project_keys: []
  linear:
    usage: auto
    team_keys: []
  pull_requests:
    scope: auto
  mcp:
    usage: auto
    disabled_servers: []
  linked_repositories: []
code_generation:
  docstrings:
    language: en-US
    path_instructions: []
  unit_tests:
    path_instructions: []
issue_enrichment:
  auto_enrich:
    enabled: false
  planning:
    enabled: true
    auto_planning:
      enabled: true
      labels: []
  labeling:
    labeling_instructions: []
    auto_apply_labels: false

@hholoubk hholoubk force-pushed the migration_dest_vm_cleanup_refactor branch from ea16735 to 4b4c9fc Compare March 2, 2026 09:26
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
provider/migration/base_steps.py (1)

111-115: ⚠️ Potential issue | 🟠 Major

Use destination capabilities when selecting undefine options.

Line 112 checks local libvirt, but Line 115 executes on the destination via self.remote_virsh. On mixed-version hosts this can pass unsupported options (or miss required ones).

Suggested fix
-        undef_opts = "--managed-save --snapshots-metadata --nvram"
-        if libvirt_version.version_compare(7, 5, 0):
-            undef_opts += " --checkpoints-metadata"
-
-        self.remote_virsh.undefine(vm_name, options=undef_opts)
+        base_undef_opts = "--managed-save --snapshots-metadata --nvram"
+        opts_with_checkpoints = f"{base_undef_opts} --checkpoints-metadata"
+        result = self.remote_virsh.undefine(
+            vm_name, options=opts_with_checkpoints, ignore_status=True, debug=True
+        )
+        if result.exit_status and "checkpoints-metadata" in result.stderr_text:
+            result = self.remote_virsh.undefine(
+                vm_name, options=base_undef_opts, ignore_status=True, debug=True
+            )
+        if result.exit_status:
+            self.test.fail(
+                f"Failed to undefine destination VM '{vm_name}': {result.stderr_text}"
+            )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@provider/migration/base_steps.py` around lines 111 - 115, The undefine
options are being chosen based on the local libvirt version
(libvirt_version.version_compare) but then applied on the destination via
self.remote_virsh.undefine(vm_name, options=undef_opts); change the logic to
detect the destination's capabilities instead: query the remote virsh/libvirt
version or capabilities from self.remote_virsh (or its connection object) and
use that result when building undef_opts (e.g., only append "
--checkpoints-metadata" when the destination reports libvirt >= 7.5.0 and
otherwise avoid adding unsupported flags), keeping vm_name and
self.remote_virsh.undefine as the call site.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libvirt/tests/src/sriov/vIOMMU/migration_iommu_device.py`:
- Around line 105-110: The finally block should not call
migration_obj.cleanup_connection(), remote_virsh.close_session(), and
test_obj.teardown_iommu_test() directly because one exception can prevent
subsequent cleanup; instead, create a list (e.g., cleanup_handlers) and append
lambdas or callables for migration_obj.cleanup_connection,
remote_virsh.close_session, and test_obj.teardown_iommu_test at the points where
those resources are allocated, then in the finally iterate over cleanup_handlers
and call each inside its own try/except to log any exceptions without aborting
the remaining handlers so all three (migration_obj.cleanup_connection(),
remote_virsh.close_session(), test_obj.teardown_iommu_test()) always get
attempted.

In `@provider/migration/base_steps.py`:
- Around line 395-411: The restore block using self.dest_backup_xml (calls to
remote.scp_to_remote and self.dest_backup_xml.define) can raise exceptions and
prevent the subsequent self.orig_config_xml.sync() from running; wrap the entire
destination-VM restore sequence (the scp_to_remote call and the define call on
self.dest_backup_xml using virsh_instance=self.remote_virsh) in a
try/except/finally so any exceptions are caught and logged via self.test.log (or
re-raised after cleanup) and ensure self.orig_config_xml.sync() is executed in
the finally block to guarantee source config sync; reference the symbols
dest_backup_xml, remote.scp_to_remote, dest_backup_xml.define, remote_virsh, and
orig_config_xml.sync when locating the code to change.

---

Duplicate comments:
In `@provider/migration/base_steps.py`:
- Around line 111-115: The undefine options are being chosen based on the local
libvirt version (libvirt_version.version_compare) but then applied on the
destination via self.remote_virsh.undefine(vm_name, options=undef_opts); change
the logic to detect the destination's capabilities instead: query the remote
virsh/libvirt version or capabilities from self.remote_virsh (or its connection
object) and use that result when building undef_opts (e.g., only append "
--checkpoints-metadata" when the destination reports libvirt >= 7.5.0 and
otherwise avoid adding unsupported flags), keeping vm_name and
self.remote_virsh.undefine as the call site.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ea16735 and 4b4c9fc.

📒 Files selected for processing (3)
  • libvirt/tests/cfg/sriov/vIOMMU/migration_iommu_device.cfg
  • libvirt/tests/src/sriov/vIOMMU/migration_iommu_device.py
  • provider/migration/base_steps.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • libvirt/tests/cfg/sriov/vIOMMU/migration_iommu_device.cfg

Comment on lines 105 to 110
finally:
migration_obj.cleanup_connection()
if migration_obj:
migration_obj.cleanup_connection()
if remote_virsh:
remote_virsh.close_session()
test_obj.teardown_iommu_test()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Isolate finally-block cleanup steps so one failure doesn’t skip the rest.

If migration_obj.cleanup_connection() raises, Line 109 and Line 110 won’t run, which can leak the session and skip teardown.

Suggested fix
     finally:
-        if migration_obj:
-            migration_obj.cleanup_connection()
-        if remote_virsh:
-            remote_virsh.close_session()
-        test_obj.teardown_iommu_test()
+        cleanup_actions = []
+        if migration_obj:
+            cleanup_actions.append(("cleanup_connection", migration_obj.cleanup_connection))
+        if remote_virsh:
+            cleanup_actions.append(("close_remote_virsh", remote_virsh.close_session))
+        cleanup_actions.append(("teardown_iommu_test", test_obj.teardown_iommu_test))
+
+        for action_name, action in cleanup_actions:
+            try:
+                action()
+            except Exception as err:
+                test.log.warning("Cleanup step failed (%s): %s", action_name, err)

Based on learnings: In tp-libvirt tests, prefer using a list of cleanup lambda functions appended when cleanup is needed, then execute them in finally with individual error handling.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
finally:
migration_obj.cleanup_connection()
if migration_obj:
migration_obj.cleanup_connection()
if remote_virsh:
remote_virsh.close_session()
test_obj.teardown_iommu_test()
finally:
cleanup_actions = []
if migration_obj:
cleanup_actions.append(("cleanup_connection", migration_obj.cleanup_connection))
if remote_virsh:
cleanup_actions.append(("close_remote_virsh", remote_virsh.close_session))
cleanup_actions.append(("teardown_iommu_test", test_obj.teardown_iommu_test))
for action_name, action in cleanup_actions:
try:
action()
except Exception as err:
test.log.warning("Cleanup step failed (%s): %s", action_name, err)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libvirt/tests/src/sriov/vIOMMU/migration_iommu_device.py` around lines 105 -
110, The finally block should not call migration_obj.cleanup_connection(),
remote_virsh.close_session(), and test_obj.teardown_iommu_test() directly
because one exception can prevent subsequent cleanup; instead, create a list
(e.g., cleanup_handlers) and append lambdas or callables for
migration_obj.cleanup_connection, remote_virsh.close_session, and
test_obj.teardown_iommu_test at the points where those resources are allocated,
then in the finally iterate over cleanup_handlers and call each inside its own
try/except to log any exceptions without aborting the remaining handlers so all
three (migration_obj.cleanup_connection(), remote_virsh.close_session(),
test_obj.teardown_iommu_test()) always get attempted.

Comment on lines +395 to +411
if self.dest_backup_xml:
vm_name = self.params.get("migrate_main_vm")
self.test.log.info(f"Restoring VM '{vm_name}' on destination host {self.server_ip}")
remote.scp_to_remote(
self.server_ip,
"22",
self.server_user,
self.server_pwd,
self.dest_backup_xml.xml,
self.dest_backup_xml.xml,
limit="",
log_filename=None,
timeout=60,
interface=None,
)
if not self.dest_backup_xml.define(virsh_instance=self.remote_virsh):
self.test.log.warning(f"Failed to restore VM '{vm_name}' on destination host {self.server_ip}")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and examine the file
git ls-files | grep -E "provider/migration/base_steps\.py$"

Repository: autotest/tp-libvirt

Length of output: 95


🏁 Script executed:

# Read the relevant section of the file around lines 395-415
sed -n '390,420p' provider/migration/base_steps.py | cat -n

Repository: autotest/tp-libvirt

Length of output: 1430


🏁 Script executed:

# Get more context to understand the full cleanup flow
sed -n '380,430p' provider/migration/base_steps.py | cat -n

Repository: autotest/tp-libvirt

Length of output: 2476


🏁 Script executed:

# Find the method definition containing this code
grep -n "def cleanup" provider/migration/base_steps.py | head -20

Repository: autotest/tp-libvirt

Length of output: 177


🏁 Script executed:

# Get the full method definition starting from the cleanup method
sed -n '368,435p' provider/migration/base_steps.py | cat -n

Repository: autotest/tp-libvirt

Length of output: 3467


🏁 Script executed:

# Check if this is within a try/except context - look at broader scope
sed -n '360,450p' provider/migration/base_steps.py | cat -n

Repository: autotest/tp-libvirt

Length of output: 4373


🏁 Script executed:

# Search for calls to cleanup_default
rg -n "cleanup_default|cleanup_connection" provider/migration/base_steps.py

Repository: autotest/tp-libvirt

Length of output: 170


🏁 Script executed:

# Check where this class is used and how cleanup is invoked
rg -n "from.*base_steps|import.*base_steps" --type python | head -20

Repository: autotest/tp-libvirt

Length of output: 93


🏁 Script executed:

# Look for the exception handling pattern - check if cleanup is in finally block
rg -B5 -A5 "cleanup_default|cleanup_connection" --type python | head -100

Repository: autotest/tp-libvirt

Length of output: 93


🏁 Script executed:

# Check the exception handling behavior of scp_to_remote
rg -n "def scp_to_remote" --type python

Repository: autotest/tp-libvirt

Length of output: 93


🏁 Script executed:

# Check the define method to see if it raises exceptions
rg -n "def define" provider/migration/base_steps.py

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Search for scp_to_remote definition
find . -name "*.py" -type f | xargs grep -l "def scp_to_remote" 2>/dev/null | head -5

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Search for exception handling related to remote.scp_to_remote
grep -r "scp_to_remote" --include="*.py" | grep -E "(except|try|raise)" | head -10

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Look at the imports in base_steps to see where remote comes from
head -50 provider/migration/base_steps.py | grep -E "^import|^from"

Repository: autotest/tp-libvirt

Length of output: 772


🏁 Script executed:

# Check how the test framework handles exceptions in cleanup
grep -r "cleanup_default\|cleanup_connection" --include="*.py" | grep -B2 -A2 "except\|finally" | head -30

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Let's look at the actual usage context - find test files that call this
find . -path "*/tests/*" -name "*.py" -type f | xargs grep -l "cleanup_default\|cleanup_connection" 2>/dev/null | head -5

Repository: autotest/tp-libvirt

Length of output: 399


🏁 Script executed:

# Find where scp_to_remote is defined
find . -name "*.py" -type f -exec grep -l "def scp_to_remote" {} \;

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Check the remote module
grep -n "def scp_to_remote" virttest/remote.py 2>/dev/null | head -5

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Let's look at one of the test files to see how cleanup is called
cat -n ./libvirt/tests/src/migration/abort_precopy_migration/abort_by_domjobabort_on_target.py | grep -A10 "cleanup_default\|cleanup_connection" | head -30

Repository: autotest/tp-libvirt

Length of output: 112


🏁 Script executed:

# Check if remote module has exception handling documentation
grep -A20 "def scp_to_remote" virttest/remote.py 2>/dev/null || grep -A20 "def scp_to_remote" virttest/utils_remote.py 2>/dev/null || echo "Not found in standard locations"

Repository: autotest/tp-libvirt

Length of output: 94


🏁 Script executed:

# Look at the define method - check if it's from libvirt_vmxml
grep -n "class.*VMXml\|def define" virttest/utils_libvirt/libvirt_vmxml.py 2>/dev/null | head -20

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Check how sync() is used and if it's critical
grep -n "orig_config_xml.sync\|\.sync()" provider/migration/base_steps.py

Repository: autotest/tp-libvirt

Length of output: 125


🏁 Script executed:

# Find all Python files in virttest that might contain scp_to_remote
find virttest -name "*.py" -type f | xargs grep -l "scp_to_remote" 2>/dev/null

Repository: autotest/tp-libvirt

Length of output: 108


🏁 Script executed:

# Check the exact implementation
find . -name "remote.py" -type f -path "*/virttest/*"

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Look for the actual definition in the entire codebase
find . -name "*.py" -type f | xargs grep -n "def scp_to_remote" 2>/dev/null

Repository: autotest/tp-libvirt

Length of output: 47


🏁 Script executed:

# Check if exceptions are documented anywhere for remote operations
grep -r "scp_to_remote\|ssh_cmd" --include="*.py" libvirt/tests/src/migration | grep -E "except|Exception|raise" | head -20

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Let's look at the actual method signature and return value handling
sed -n '395,415p' provider/migration/base_steps.py

Repository: autotest/tp-libvirt

Length of output: 906


🏁 Script executed:

# Check the project structure and where virttest might be
ls -la | head -20

Repository: autotest/tp-libvirt

Length of output: 1283


🏁 Script executed:

# Look for setup.py or requirements to understand dependencies
find . -maxdepth 2 -name "setup.py" -o -name "requirements.txt" -o -name "*.cfg" | head -10

Repository: autotest/tp-libvirt

Length of output: 81


🏁 Script executed:

# Search in the entire repository for where remote.scp_to_remote is defined
grep -r "def scp_to_remote" . 2>/dev/null | head -5

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Check if it's imported from an external package - look at what remote module is
python3 << 'EOF'
import sys
sys.path.insert(0, '/root/autotest')
try:
    from virttest import remote
    import inspect
    if hasattr(remote, 'scp_to_remote'):
        print("Found scp_to_remote")
        print(inspect.getsourcefile(remote.scp_to_remote))
except Exception as e:
    print(f"Error: {e}")
EOF

Repository: autotest/tp-libvirt

Length of output: 96


🏁 Script executed:

# Let's check the broader cleanup pattern - look at how other cleanup methods handle exceptions
grep -B5 -A10 "def cleanup" provider/migration/base_steps.py | head -60

Repository: autotest/tp-libvirt

Length of output: 1689


🏁 Script executed:

# Check if this is actually a problem - look for similar patterns in the codebase where sync() must be called
grep -B10 "\.sync()" provider/migration/base_steps.py | head -40

Repository: autotest/tp-libvirt

Length of output: 995


🏁 Script executed:

# Check if cleanup_connection or cleanup_default are wrapped in try/except
grep -B10 -A15 "cleanup_connection\|def cleanup_connection" provider/migration/base_steps.py | head -80

Repository: autotest/tp-libvirt

Length of output: 1139


🏁 Script executed:

# Look at test files that use this to see if there's exception handling
head -200 ./libvirt/tests/src/migration/abort_precopy_migration/abort_by_domjobabort_on_target.py

Repository: autotest/tp-libvirt

Length of output: 751


🏁 Script executed:

# Check if the cleanup is in a finally block
grep -B20 "cleanup_connection\|cleanup_default" ./libvirt/tests/src/migration/abort_precopy_migration/abort_by_domjobabort_on_target.py | tail -30

Repository: autotest/tp-libvirt

Length of output: 669


🏁 Script executed:

# Look at the requirements to understand the remote module source
cat requirements.txt

Repository: autotest/tp-libvirt

Length of output: 137


🏁 Script executed:

# Search for exception handling patterns with scp_to_remote across test files
grep -r "scp_to_remote" --include="*.py" libvirt/tests/src/migration | head -20

Repository: autotest/tp-libvirt

Length of output: 870


🏁 Script executed:

# Look for more usages of scp_to_remote to see error handling patterns
grep -B5 -A10 "scp_to_remote" libvirt/tests/src/migration/migrate_network.py | head -50

Repository: autotest/tp-libvirt

Length of output: 1658


🏁 Script executed:

# Check if there are exception handlers around scp_to_remote calls elsewhere
grep -r "scp_to_remote" --include="*.py" -B3 -A3 | grep -E "try|except|raise" | head -20

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Look at the migration_base module to understand cleanup patterns
find . -name "migration_base.py" -type f | xargs cat | head -100

Repository: autotest/tp-libvirt

Length of output: 4476


🏁 Script executed:

# Check the critical sync() call - what does it do and why is it important?
grep -n "sync" provider/migration/base_steps.py

Repository: autotest/tp-libvirt

Length of output: 159


🏁 Script executed:

# Look for any issues or PRs discussing backup XML restoration
git log --all --oneline provider/migration/base_steps.py | head -20

Repository: autotest/tp-libvirt

Length of output: 1285


🏁 Script executed:

# Check if virttest is a separate package that's a dependency
grep -r "virttest" requirements.txt deps/ 2>/dev/null

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Look at one more usage to see if there's a pattern for exception handling
grep -B10 -A15 "scp_to_remote" libvirt/tests/src/migration/migrate_ceph.py

Repository: autotest/tp-libvirt

Length of output: 1571


🏁 Script executed:

# Check the define method return value handling elsewhere
grep -B3 -A5 "\.define(" provider/migration/base_steps.py | head -60

Repository: autotest/tp-libvirt

Length of output: 419


🏁 Script executed:

# Look at similar cleanup patterns to understand best practices
grep -B5 -A15 "\.sync()" provider/migration/base_steps.py

Repository: autotest/tp-libvirt

Length of output: 1660


🏁 Script executed:

# Check if there are any recent fixes or issues related to cleanup
git log --all --oneline | grep -i "cleanup\|sync\|backup" | head -20

Repository: autotest/tp-libvirt

Length of output: 1398


🏁 Script executed:

# Check if define() method can raise exceptions or just returns boolean
grep -r "def define" --include="*.py" | grep -E "vm_xml|VMXML" | head -5

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Look at how define is used elsewhere to understand its behavior
grep -B2 -A5 "\.define(virsh_instance" provider/ --include="*.py" | head -60

Repository: autotest/tp-libvirt

Length of output: 94


🏁 Script executed:

# Check the structure: is this cleanup_default really in a finally block?
grep -B50 "def cleanup_default" provider/migration/base_steps.py | tail -60

Repository: autotest/tp-libvirt

Length of output: 2391


🏁 Script executed:

# Check if there are any TODOs or comments about this cleanup issue
grep -i "todo\|fixme\|cleanup\|exception" provider/migration/base_steps.py | head -20

Repository: autotest/tp-libvirt

Length of output: 620


🏁 Script executed:

# Look at what scp_to_remote actually returns
grep -r "scp_to_remote" --include="*.py" -A2 | grep -E "return|Exception|raise" | head -30

Repository: autotest/tp-libvirt

Length of output: 273


🏁 Script executed:

# Check if scp_to_remote is documented anywhere or if errors are handled
find . -name "*.py" -type f | xargs grep -l "scp_to_remote" | xargs grep -B10 "scp_to_remote" | grep -E "try|except|return" | head -40

Repository: autotest/tp-libvirt

Length of output: 331


🏁 Script executed:

# Look at the broader context - is cleanup_default ever called outside of cleanup_connection?
grep -r "cleanup_default\|\.cleanup_default()" --include="*.py"

Repository: autotest/tp-libvirt

Length of output: 854


🏁 Script executed:

# Check if there's a pattern of try/finally in cleanup paths
grep -B20 "self.orig_config_xml.sync()" provider/migration/base_steps.py | grep -E "try|except|finally"

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Check if sync() being skipped would actually cause test failures
git log --all -p provider/migration/base_steps.py | grep -B10 -A10 "orig_config_xml.sync" | head -50

Repository: autotest/tp-libvirt

Length of output: 1948


🏁 Script executed:

# Look at cleanup_default being called from test files with error handlers
grep -B10 -A10 "cleanup_default()" libvirt/tests/src/migration/migration_cmd/setmaxdowntime_and_getmaxdowntime.py

Repository: autotest/tp-libvirt

Length of output: 316


🏁 Script executed:

# Check if there's a specific pattern in other test files for exception handling
grep -B20 -A5 "cleanup_default\|cleanup_connection" libvirt/tests/src/guest_os_booting/migration/migration_boot.py | head -80

Repository: autotest/tp-libvirt

Length of output: 748


🏁 Script executed:

# Now let me understand - is sync() critical for cleanup? Let's check what it does
grep -r "def sync" --include="*.py" | grep -i "xml\|config" | head -10

Repository: autotest/tp-libvirt

Length of output: 45


🏁 Script executed:

# Check if define() can raise or if it only returns boolean
grep -A10 "if not.*\.define" provider/migration/base_steps.py

Repository: autotest/tp-libvirt

Length of output: 454


🏁 Script executed:

# Verify: is the issue real by checking if exceptions can happen in cleanup flow
grep -r "except.*Exception" provider/migration/ --include="*.py" | head -10

Repository: autotest/tp-libvirt

Length of output: 198


Wrap destination-VM restore in exception handling to guarantee cleanup completion.

The scp_to_remote() call (line 398) and define() call (line 410) can raise exceptions that would prevent self.orig_config_xml.sync() (line 413) from executing, leaving the source host configuration out of sync.

Suggested fix
         if self.dest_backup_xml:
             vm_name = self.params.get("migrate_main_vm")
             self.test.log.info(f"Restoring VM '{vm_name}' on destination host {self.server_ip}")
-            remote.scp_to_remote(
-                self.server_ip,
-                "22",
-                self.server_user,
-                self.server_pwd,
-                self.dest_backup_xml.xml,
-                self.dest_backup_xml.xml,
-                limit="",
-                log_filename=None,
-                timeout=60,
-                interface=None,
-            )
-            if not self.dest_backup_xml.define(virsh_instance=self.remote_virsh):
-                self.test.log.warning(f"Failed to restore VM '{vm_name}' on destination host {self.server_ip}")
+            try:
+                remote.scp_to_remote(
+                    self.server_ip,
+                    "22",
+                    self.server_user,
+                    self.server_pwd,
+                    self.dest_backup_xml.xml,
+                    self.dest_backup_xml.xml,
+                    limit="",
+                    log_filename=None,
+                    timeout=60,
+                    interface=None,
+                )
+                ret = self.dest_backup_xml.define(
+                    virsh_instance=self.remote_virsh, ignore_status=True
+                )
+                if ret.exit_status:
+                    self.test.log.warning(
+                        f"Failed to restore VM '{vm_name}' on destination host {self.server_ip}: {ret.stderr_text}"
+                    )
+            except Exception as err:
+                self.test.log.warning(
+                    f"Failed to restore VM '{vm_name}' on destination host {self.server_ip}: {err}"
+                )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@provider/migration/base_steps.py` around lines 395 - 411, The restore block
using self.dest_backup_xml (calls to remote.scp_to_remote and
self.dest_backup_xml.define) can raise exceptions and prevent the subsequent
self.orig_config_xml.sync() from running; wrap the entire destination-VM restore
sequence (the scp_to_remote call and the define call on self.dest_backup_xml
using virsh_instance=self.remote_virsh) in a try/except/finally so any
exceptions are caught and logged via self.test.log (or re-raised after cleanup)
and ensure self.orig_config_xml.sync() is executed in the finally block to
guarantee source config sync; reference the symbols dest_backup_xml,
remote.scp_to_remote, dest_backup_xml.define, remote_virsh, and
orig_config_xml.sync when locating the code to change.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
provider/migration/base_steps.py (1)

114-122: ⚠️ Potential issue | 🟡 Minor

The remote version check is improved, but undefine result is still unchecked.

The version check now correctly uses a remote session via is_libvirt_feature_supported(..., session=remote_session). However, the undefine call at line 122 does not check the return value. If the undefine fails (e.g., due to permission issues or the VM being in an unexpected state), the error is silently ignored, potentially causing the migration to fail later with a confusing error about the domain already existing.

🛡️ Proposed fix to check undefine result
-        self.remote_virsh.undefine(vm_name, options=undef_opts)
+        result = self.remote_virsh.undefine(vm_name, options=undef_opts)
+        if result.exit_status:
+            self.test.fail(f"Failed to undefine VM '{vm_name}' on destination host "
+                           f"{self.server_ip}: {result.stderr_text}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@provider/migration/base_steps.py` around lines 114 - 122, The call to
self.remote_virsh.undefine(vm_name, options=undef_opts) is not checking the
result; capture its return value and handle failures explicitly: call result =
self.remote_virsh.undefine(vm_name, options=undef_opts), then if result
indicates failure (e.g. False, non-zero, or an error payload), log or raise a
clear exception including vm_name and undef_opts so the caller knows undefine
failed (use the same remote_session/context for any error details), ensuring the
migration stops instead of silently continuing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@provider/migration/base_steps.py`:
- Around line 114-122: The call to self.remote_virsh.undefine(vm_name,
options=undef_opts) is not checking the result; capture its return value and
handle failures explicitly: call result = self.remote_virsh.undefine(vm_name,
options=undef_opts), then if result indicates failure (e.g. False, non-zero, or
an error payload), log or raise a clear exception including vm_name and
undef_opts so the caller knows undefine failed (use the same
remote_session/context for any error details), ensuring the migration stops
instead of silently continuing.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4b4c9fc and b49d0cf.

📒 Files selected for processing (1)
  • provider/migration/base_steps.py

Add optional destination VM backup/cleanup functionality to MigrationBase:
- New parameter 'do_destination_vm_backup' (default: no) to enable feature
- Accept optional remote_virsh session for destination host operations
- Check if VM exists on destination before migration
- Fail test if destination VM is running (blocker condition)
- Backup and undefine pre-existing stopped VMs to avoid conflicts
- Use comprehensive undefine options (--nvram, --managed-save, etc.)
- Restore destination VM in cleanup_default()

Update migration_iommu_device test:
- Add setup_remote_virsh_session() helper function
- Create single remote virsh session shared across test operations
- Refactor check_iommu_xml() to reuse remote session (no duplication)
- Add cleanup_default() call in finally block
- Properly close remote session in finally block

This allows migration tests to run on non-clean environments without
depending on destination host state.

AI assisted code and commit. Human reviewed.

Signed-off-by: hholoubk <hholoubk@redhat.com>

Refactor MigrationBase to cache destination host credentials and reuse them across methods, and extract destination VM backup/undefine flow into a helper method.

When do_destination_vm_backup is enabled, include destination host info in logs, fail with clearer message if the destination VM is running, and restore a backed-up destination VM by copying its XML file to the destination host before defining it via remote virsh.

When destination VM backup is enabled, return early if the VM does not exist on the destination host, and enable debug logging for the existence/running-state checks.

Only call destination VM backup/undefine helper when do_destination_vm_backup is enabled, and drop the redundant guard from the helper.

AI assisted code and commit. Human reviewed.

Signed-off-by: hholoubk <hholoubk@redhat.com>

migration: Fix remote_virsh session resource leak in migration_iommu_device

Move remote_virsh session creation and migration_obj initialization
inside the try block to prevent resource leaks if MigrationBase.__init__
fails. Also remove redundant cleanup_default() call since it's already
called within cleanup_connection().

Changes:
- Initialize remote_virsh and migration_obj to None before try block
- Move session creation inside try block
- Add null checks before cleanup operations
- Remove duplicate cleanup_default() call

AI assisted code and commit. Human reviewed.

Signed-off-by: hholoubk <hholoubk@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

migration: Check destination host libvirt version for checkpoints-metadata

Replace local version_compare() call with is_libvirt_feature_supported()
using a remote SSH session to the destination host. The previous code
incorrectly checked the local libvirt version when deciding whether to
pass --checkpoints-metadata to undefine on the remote host.

AI assisted code and commit. Human reviewed.

Signed-off-by: hholoubk <hholoubk@redhat.com>
Made-with: Cursor

dd
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@provider/migration/base_steps.py`:
- Around line 114-122: The call to self.remote_virsh.undefine(vm_name,
options=undef_opts) is not checked, so failures there can cause later confusing
errors; after the libvirt version check and remote_session.close() return,
capture and verify the result of self.remote_virsh.undefine(vm_name,
options=undef_opts) (or catch exceptions it may raise), and if it indicates
failure log a clear error including vm_name and undef_opts and raise/return an
explicit exception to abort the migration; update the code path around
remote_session, libvirt_version.is_libvirt_feature_supported, and
self.remote_virsh.undefine to perform this check and handle errors
deterministically.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 13d2acc6-0c1a-4217-85ab-25e2fea3d727

📥 Commits

Reviewing files that changed from the base of the PR and between b49d0cf and 78586b7.

📒 Files selected for processing (1)
  • provider/migration/base_steps.py

Comment on lines +114 to +122
try:
checkpoints_params = {"func_supported_since_libvirt_ver": "(7, 5, 0)"}
if libvirt_version.is_libvirt_feature_supported(
checkpoints_params, ignore_error=True, session=remote_session):
undef_opts += " --checkpoints-metadata"
finally:
remote_session.close()

self.remote_virsh.undefine(vm_name, options=undef_opts)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Good improvement: remote version checking is now correctly handled.

The use of is_libvirt_feature_supported with the session parameter correctly queries the remote host's libvirt version. However, the undefine result is not checked — if it fails, the migration will fail later with a confusing error about a domain already existing.

🛡️ Proposed fix to verify undefine result
-        self.remote_virsh.undefine(vm_name, options=undef_opts)
+        result = self.remote_virsh.undefine(vm_name, options=undef_opts)
+        if result.exit_status:
+            self.test.fail(f"Failed to undefine VM '{vm_name}' on destination host "
+                           f"{self.server_ip}: {result.stderr_text}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@provider/migration/base_steps.py` around lines 114 - 122, The call to
self.remote_virsh.undefine(vm_name, options=undef_opts) is not checked, so
failures there can cause later confusing errors; after the libvirt version check
and remote_session.close() return, capture and verify the result of
self.remote_virsh.undefine(vm_name, options=undef_opts) (or catch exceptions it
may raise), and if it indicates failure log a clear error including vm_name and
undef_opts and raise/return an explicit exception to abort the migration; update
the code path around remote_session,
libvirt_version.is_libvirt_feature_supported, and self.remote_virsh.undefine to
perform this check and handle errors deterministically.

@hholoubk hholoubk force-pushed the migration_dest_vm_cleanup_refactor branch from 78586b7 to 43cf889 Compare March 10, 2026 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant