Skip to content

MTV-4150 | Fix disk transfer completed when importer pod is pending#4286

Open
gwencasey96 wants to merge 1 commit into
kubev2v:mainfrom
gwencasey96:disk-transfer-completed-importer-pod-pendingr
Open

MTV-4150 | Fix disk transfer completed when importer pod is pending#4286
gwencasey96 wants to merge 1 commit into
kubev2v:mainfrom
gwencasey96:disk-transfer-completed-importer-pod-pendingr

Conversation

@gwencasey96

Copy link
Copy Markdown
Contributor

Resolves: MTV-4150

The Disk Transfer step was being marked as completed when a DataVolume reached Succeeded phase, even if the PVCs were not bound or the importer pod was still pending. This could happen when CSI drivers fail to provision volumes (e.g., nfs.csi.k8s.io issues).

This fix adds validation before marking a DataVolume as completed:

  • Verifies the main PVC is bound
  • Verifies the prime PVC (if it exists) is bound
  • Verifies the importer pod (if it exists) is not pending

If any of these checks fail, the task is marked as pending instead of completed, preventing Disk Transfer from showing as completed prematurely.

Also adds a default case to handle unknown/empty DataVolume phases.

@gwencasey96 gwencasey96 force-pushed the disk-transfer-completed-importer-pod-pendingr branch 3 times, most recently from 61122cb to e4c04d7 Compare January 12, 2026 15:55
@gwencasey96 gwencasey96 changed the title Fix disk transfer completed when importer pod is pending MTV-4150 | Fix disk transfer completed when importer pod is pending Jan 12, 2026
@codecov-commenter

codecov-commenter commented Jan 12, 2026

Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 45.87156% with 59 lines in your changes missing coverage. Please review.
✅ Project coverage is 11.09%. Comparing base (f1fe5d0) to head (8cb1f1c).
⚠️ Report is 2729 commits behind head on main.

Files with missing lines Patch % Lines
pkg/controller/plan/migration.go 45.87% 54 Missing and 5 partials ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4286      +/-   ##
==========================================
- Coverage   15.45%   11.09%   -4.37%     
==========================================
  Files         112      523     +411     
  Lines       23377    62973   +39596     
==========================================
+ Hits         3613     6984    +3371     
- Misses      19479    55390   +35911     
- Partials      285      599     +314     
Flag Coverage Δ
unittests 11.09% <45.87%> (-4.37%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@mnecas mnecas left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is hard to read, please change the logic so there are not that many nested conditions. ref

Please take a look at CDI, maybe it does not update the DataVolume status correctly.

@mrnold

mrnold commented Jan 14, 2026

Copy link
Copy Markdown
Contributor

Where does the DataVolume get set to Succeeded? From the bug, it looks like the phase is just blank:

# oc get dv
NAME                              PHASE   PROGRESS   RESTARTS   AGE
warm-win2019-test-vm-1056-x2zzw           N/A                   23m

The patch is kind of hard to read, but I am more wondering if it fixes the wrong symptom. If you have a controller log or something that shows the DataVolume itself is Succeeded can you attach it to the Jira ticket?

@sonarqubecloud

Copy link
Copy Markdown

@gwencasey96 gwencasey96 force-pushed the disk-transfer-completed-importer-pod-pendingr branch 2 times, most recently from d61e92e to 060e621 Compare June 23, 2026 04:52
@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds verifyDataVolumeCompletion to migration.go to gate CDI DataVolume task completion by checking main PVC binding, prime PVC binding, and importer pod phase. Wires the helper into updateCopyProgress for the Succeeded phase, restricts cold-conversion PendingPopulation promotion to non-warm migrations, and adds a default branch for unknown phases. Covers all new logic with table-driven unit tests.

Changes

DataVolume Completion Gating

Layer / File(s) Summary
verifyDataVolumeCompletion helper and updateCopyProgress wiring
pkg/controller/plan/migration.go
Adds verifyDataVolumeCompletion that checks main PVC is Bound, optionally checks the prime PVC (prime-<uid>) is Bound, and blocks completion when an importer pod is Pending. Wires the helper into updateCopyProgress for cdi.Succeeded, replacing the prior unconditional completion. Restricts PendingPopulationSucceeded promotion to non-warm cold-conversion migrations. Adds a default branch that marks unknown/empty phases as Pending with a logged reason.
Unit tests for completion gating and warm migration
pkg/controller/plan/migration_test.go
Adds newTestMigration helper seeding a fake client with runtime objects. Adds TestVerifyDataVolumeCompletion (table-driven, covering unbound PVC, unbound prime PVC, pending importer pod, and running pod with all PVCs bound). Adds TestPendingPopulationNotPromotedOnWarmMigration asserting warm migrations keep phase as cdi.PendingPopulation.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • yaacov
  • mrnold
  • solenoci
  • Hazanel

Poem

🐇 Hoppity-hop, the DataVolume waits,
Till PVC binds and the importer pod sates.
No Succeeded too soon, no phantom complete—
The prime PVC checks make the gating concrete.
Warm migrations stay warm, unknown phases stay known,
A careful white rabbit tends every stone! 🌸

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically identifies the main change: fixing the disk transfer step completion logic when the importer pod is pending.
Description check ✅ Passed The description is well-related to the changeset, explaining the bug being fixed and the validation checks being added to prevent premature completion.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (3)
pkg/controller/plan/migration.go (1)

1740-1743: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Avoid the path.Join-based identity key in these new log lines.

The new log statements throughout verifyDataVolumeCompletion (e.g., Lines 1742, 1756, 1766, 1775, 1792, 1807, 1819, 1837, 1848) use "dv", path.Join(dv.Namespace, dv.Name). Prefer separate key/value pairs for resource identity, e.g. "dv.name", dv.Name, "dv.namespace", dv.Namespace.

Based on learnings: log resource name and namespace as separate key/value pairs rather than combining them into a single identity string key using path.Join, using dot-prefixed keys (e.g. dv.name, dv.namespace) to distinguish resource types; avoid introducing new log lines that use the older path.Join-based identity pattern.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/controller/plan/migration.go` around lines 1740 - 1743, Throughout the
verifyDataVolumeCompletion function, replace all instances of the pattern "dv",
path.Join(dv.Namespace, dv.Name) with separate key/value pairs for better log
structure. Instead use "dv.name", dv.Name, "dv.namespace", dv.Namespace as
consecutive key/value pairs in each log statement. This change should be applied
to all log lines in verifyDataVolumeCompletion that currently use the
path.Join-based identity pattern (appearing at approximately lines 1742, 1756,
1766, 1775, 1792, 1807, 1819, 1837, and 1848).

Source: Learnings

pkg/controller/plan/migration_test.go (2)

456-465: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

This test re-implements the production condition instead of exercising it.

Lines 459-461 copy the exact gating expression from updateCopyProgress rather than calling the real code path. The assertion can only ever pass and will not catch a regression if the production condition at Line 1896 of migration.go changes. Drive the actual logic (e.g., invoke updateCopyProgress with a PendingPopulation DataVolume on a warm plan and assert the resulting task stays pending) so the test guards real behavior.

As per path instructions: tests should provide coverage on critical logic paths with specific, meaningful assertions.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/controller/plan/migration_test.go` around lines 456 - 465, The test is
duplicating the production gating condition from the updateCopyProgress method
instead of invoking it. Replace the manual condition check (lines 459-461 that
set dvPhase) by directly calling the actual updateCopyProgress method with a
Migration configured with a warm plan and a PendingPopulation DataVolume, then
assert that the phase remains in the expected state to verify the real code path
behavior rather than re-implementing its logic.

Source: Path instructions


293-431: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Bind Gomega to each subtest and consider covering the remaining branches.

g := gomega.NewGomegaWithT(t) at Line 293 captures the parent t; failures inside the t.Run closures are attributed to the parent rather than the failing subtest. Use g := gomega.NewWithT(t) inside the closure (Line 409). Also consider table cases for the currently-uncovered branches of verifyDataVolumeCompletion: empty ClaimName, prime-PVC NotFound (completion allowed), and the "no importer pod found" path.

As per path instructions: assertions must provide informative messages on failure, and tests should strive for high coverage on critical logic paths.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/controller/plan/migration_test.go` around lines 293 - 431, The Gomega
instance `g` initialized at line 293 captures the parent test `t`, causing
assertion failures in subtests to be misattributed to the parent test rather
than the failing subtest. Inside the subtest closure (the t.Run lambda around
line 409), create a new Gomega instance using `gomega.NewWithT(t)` bound to the
current subtest instead of using the parent-scoped instance. Additionally, add
new test cases to the existing test table to cover the currently uncovered
branches of `verifyDataVolumeCompletion`: a case with an empty `ClaimName`, a
case where the prime-PVC is not found but completion is allowed, and a case for
the missing importer pod scenario.

Source: Path instructions

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/controller/plan/migration.go`:
- Around line 1784-1822: The error handling for transient failures is
inconsistent between the main PVC check (which returns false to block
completion) and the importer pod check via r.kubevirt.GetImporterPod(*primePVC)
(which returns true to allow completion). When GetImporterPod encounters a
transient error (kErr is not nil), change the return statement to return false
instead of true, so the reconciliation will retry on the next cycle rather than
prematurely marking the task as complete, maintaining consistency with the main
PVC error handling strategy.
- Around line 2046-2058: The default case in the phase switch statement is
incorrectly reporting cdi.PendingPopulation as an unknown or empty phase. Add an
explicit case for cdi.PendingPopulation before the default case that increments
pending, sets task.Phase to api.StepPending, and sets task.Reason to an accurate
message indicating that the DataVolume is in PendingPopulation phase (e.g.,
"DataVolume is in PendingPopulation phase"). Keep the existing default case for
handling truly unknown or empty phases with the current misleading message only
for those cases.
- Around line 1896-1917: The issue is that promoting PendingPopulation to
cdi.Succeeded for cold-conversion migrations creates a deadlock because
verifyDataVolumeCompletion immediately checks if the PVC is bound, which fails
with WaitForFirstConsumer storage classes. To fix this, track the original
DataVolume phase before the promotion at line 1896, then either skip
verifyDataVolumeCompletion for the promoted case or modify the verification
logic in verifyDataVolumeCompletion to relax the bound-check requirement when
the original phase was PendingPopulation, allowing the conversion pod to be
scheduled and trigger PVC binding.

---

Nitpick comments:
In `@pkg/controller/plan/migration_test.go`:
- Around line 456-465: The test is duplicating the production gating condition
from the updateCopyProgress method instead of invoking it. Replace the manual
condition check (lines 459-461 that set dvPhase) by directly calling the actual
updateCopyProgress method with a Migration configured with a warm plan and a
PendingPopulation DataVolume, then assert that the phase remains in the expected
state to verify the real code path behavior rather than re-implementing its
logic.
- Around line 293-431: The Gomega instance `g` initialized at line 293 captures
the parent test `t`, causing assertion failures in subtests to be misattributed
to the parent test rather than the failing subtest. Inside the subtest closure
(the t.Run lambda around line 409), create a new Gomega instance using
`gomega.NewWithT(t)` bound to the current subtest instead of using the
parent-scoped instance. Additionally, add new test cases to the existing test
table to cover the currently uncovered branches of `verifyDataVolumeCompletion`:
a case with an empty `ClaimName`, a case where the prime-PVC is not found but
completion is allowed, and a case for the missing importer pod scenario.

In `@pkg/controller/plan/migration.go`:
- Around line 1740-1743: Throughout the verifyDataVolumeCompletion function,
replace all instances of the pattern "dv", path.Join(dv.Namespace, dv.Name) with
separate key/value pairs for better log structure. Instead use "dv.name",
dv.Name, "dv.namespace", dv.Namespace as consecutive key/value pairs in each log
statement. This change should be applied to all log lines in
verifyDataVolumeCompletion that currently use the path.Join-based identity
pattern (appearing at approximately lines 1742, 1756, 1766, 1775, 1792, 1807,
1819, 1837, and 1848).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4515a35d-7d0b-40fb-a9b6-d80cf05eac21

📥 Commits

Reviewing files that changed from the base of the PR and between 61bdd24 and 060e621.

📒 Files selected for processing (2)
  • pkg/controller/plan/migration.go
  • pkg/controller/plan/migration_test.go

Comment thread pkg/controller/plan/migration.go Outdated
Comment on lines +1784 to +1822
err = r.Destination.Client.Get(context.TODO(), types.NamespacedName{
Namespace: r.Plan.Spec.TargetNamespace,
Name: fmt.Sprintf("prime-%s", pvc.UID),
}, primePVC)
if err != nil {
if !k8serr.IsNotFound(err) {
r.Log.Error(err, "Could not get prime PVC for DataVolume to verify completion",
"vm", vm.String(),
"dv", path.Join(dv.Namespace, dv.Name),
"dvPhase", dv.Status.Phase)
return false, fmt.Sprintf("Could not get prime PVC: %v", err)
}
// Prime PVC doesn't exist - that's okay, importer may not be using it
r.Log.V(4).Info("Prime PVC does not exist, skipping prime PVC check",
"vm", vm.String(),
"dv", path.Join(dv.Namespace, dv.Name))
return true, ""
}

// Prime PVC exists - check if it's bound
if primePVC.Status.Phase != core.ClaimBound {
r.Log.Info("DataVolume reports Succeeded but prime PVC is not bound",
"vm", vm.String(),
"dv", path.Join(dv.Namespace, dv.Name),
"dvPhase", dv.Status.Phase,
"primePVCPhase", primePVC.Status.Phase,
"podPhaseAnnotation", podPhaseAnnotation)
return false, fmt.Sprintf("Prime PVC is not bound (phase: %s)", primePVC.Status.Phase)
}

// Check if importer pod exists and is not pending
importer, found, kErr := r.kubevirt.GetImporterPod(*primePVC)
if kErr != nil {
r.Log.Error(kErr, "Could not get CDI importer pod for DataVolume to verify completion",
"vm", vm.String(),
"dv", path.Join(dv.Namespace, dv.Name),
"dvPhase", dv.Status.Phase)
// Don't block completion if we can't check the pod - this is a transient issue
return true, ""

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Transient-error handling is asymmetric and biases toward premature completion.

A transient error fetching the main PVC blocks completion (Line 1759), which is safe, but transient errors for the prime PVC (other than NotFound) and the importer pod return true (Lines 1822), allowing the task to be marked complete despite the readiness check failing to run. Since the whole point of this gate is to avoid premature completion, consider returning false (stay pending and retry on the next reconcile) on these transient failures as well.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/controller/plan/migration.go` around lines 1784 - 1822, The error
handling for transient failures is inconsistent between the main PVC check
(which returns false to block completion) and the importer pod check via
r.kubevirt.GetImporterPod(*primePVC) (which returns true to allow completion).
When GetImporterPod encounters a transient error (kErr is not nil), change the
return statement to return false instead of true, so the reconciliation will
retry on the next cycle rather than prematurely marking the task as complete,
maintaining consistency with the main PVC error handling strategy.

Comment on lines +1896 to +1917
if dv.Status.Phase == cdi.PendingPopulation && r.Source.Provider.RequiresConversion() && !r.Plan.IsWarm() {
// Cold conversion migrations: the conversion pod is the first consumer of
// the PVCs, so PendingPopulation can be treated as Succeeded. Warm migrations
// still require the importer to run — do not promote the phase here.
dv.Status.Phase = cdi.Succeeded
}
conditions := dv.Conditions()
switch dv.Status.Phase {
case cdi.Succeeded:
completed++
r.setTaskCompleted(task)
canMarkCompleted, reason := r.verifyDataVolumeCompletion(dv.DataVolume, vm)
if canMarkCompleted {
completed++
r.setTaskCompleted(task)
} else {
pending++
task.Phase = api.StepPending
if reason != "" {
task.Reason = fmt.Sprintf("DataVolume shows %s but verification failed: %s", dv.Status.Phase, reason)
} else {
task.Reason = fmt.Sprintf("DataVolume shows %s but PVC is not bound or importer pod is pending", dv.Status.Phase)
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major

🧩 Analysis chain

🌐 Web query:

With CDI populators and a WaitForFirstConsumer storage class, is the target PVC bound while the DataVolume phase is PendingPopulation, or does it remain Pending until the first consumer pod is scheduled?

💡 Result:

When using Containerized Data Importer (CDI) with a storage class configured for WaitForFirstConsumer (WFFC) volume binding mode, the target PersistentVolumeClaim (PVC) remains in a Pending state—waiting for a consumer pod to be scheduled—until a pod is created that references the PVC [1][2][3]. Specifically, the DataVolume (DV) phase behavior is as follows: 1. Binding Delay: When the storage class has volumeBindingMode: WaitForFirstConsumer, CDI will not schedule its internal worker pods (used for import, clone, or upload) until the PVC is bound [1][4]. The PVC will remain Pending until a workload (e.g., a KubeVirt Virtual Machine) is scheduled, which triggers the actual binding of the PVC [1][5]. 2. DataVolume Phase: If CDI is using the newer populator mechanism, and the storage class is set to WaitForFirstConsumer without a pre-selected node or immediate binding request, the DataVolume will enter the PendingPopulation phase [6][7]. In this state, the PVC remains unbound (in a Pending state) specifically because the volume binding is delayed by the WFFC storage class policy [7]. 3. Workload Interaction: To move the PVC out of this Pending state, the workload controller (such as KubeVirt) must detect the PendingPopulation (or WaitForFirstConsumer) phase and trigger the creation of a consumer pod (often an ephemeral dummy pod) to satisfy the scheduling requirements [5][8]. Once the PVC is bound to a node, the actual population process completes, and the DataVolume phase transitions accordingly [6]. In summary, the PVC remains unbound (Pending) until the first consumer pod is scheduled [1][3]. The PendingPopulation phase on the DataVolume indicates that CDI is effectively paused or awaiting this binding to proceed with population [6][7].

Citations:


🏁 Script executed:

# First, let's check the file size and examine verifyDataVolumeCompletion function around Line 1772
cd /tmp && find . -name "migration.go" -path "*/controller/plan/*" 2>/dev/null | head -1 | xargs wc -l

Repository: kubev2v/forklift

Length of output: 156


🏁 Script executed:

# Find and examine the verifyDataVolumeCompletion function
rg -A 50 "func.*verifyDataVolumeCompletion" pkg/controller/plan/migration.go

Repository: kubev2v/forklift

Length of output: 2070


🏁 Script executed:

# Also check the exact area around Line 1772 to see the PVC bound check
sed -n '1750,1800p' pkg/controller/plan/migration.go | cat -n

Repository: kubev2v/forklift

Length of output: 2425


🏁 Script executed:

# Check around Line 1896 for the cold conversion context
sed -n '1875,1920p' pkg/controller/plan/migration.go | cat -n

Repository: kubev2v/forklift

Length of output: 2061


Promoted PendingPopulation creates a deadlock in cold-conversion migrations with WaitForFirstConsumer storage classes.

At line 1896, the code unconditionally promotes PendingPopulation DataVolumes to cdi.Succeeded for cold-conversion migrations. However, verifyDataVolumeCompletion (which runs immediately after) enforces that the main PVC must be Bound before marking the task complete. With WaitForFirstConsumer storage classes, the PVC intentionally remains Pending until a consumer pod is scheduled—which is exactly what the conversion pod should trigger. By promoting PendingPopulation to Succeeded while the PVC stays Pending, the verification check fails, the task reverts to Pending, and the conversion pod is never created, causing the migration to stall.

Fix: Either skip verifyDataVolumeCompletion for the promoted PendingPopulation case by tracking the original phase, or relax the bound-check requirement when the original DataVolume phase was PendingPopulation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/controller/plan/migration.go` around lines 1896 - 1917, The issue is that
promoting PendingPopulation to cdi.Succeeded for cold-conversion migrations
creates a deadlock because verifyDataVolumeCompletion immediately checks if the
PVC is bound, which fails with WaitForFirstConsumer storage classes. To fix
this, track the original DataVolume phase before the promotion at line 1896,
then either skip verifyDataVolumeCompletion for the promoted case or modify the
verification logic in verifyDataVolumeCompletion to relax the bound-check
requirement when the original phase was PendingPopulation, allowing the
conversion pod to be scheduled and trigger PVC binding.

Comment on lines +2046 to +2058
default:
// Handle unknown/empty phases - treat as pending
log.Info(
"DataVolume has unknown or empty phase, treating as pending.",
"vm",
vm.String(),
"dv",
path.Join(dv.Namespace, dv.Name),
"phase",
dv.Status.Phase)
pending++
task.Phase = api.StepPending
task.Reason = fmt.Sprintf("DataVolume phase is unknown or empty: %s", dv.Status.Phase)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

default reason is misleading for known phases like PendingPopulation.

With warm migrations no longer promoting PendingPopulation (Line 1896) and no explicit cdi.PendingPopulation switch case, that phase now falls into this default branch and reports "DataVolume phase is unknown or empty: PendingPopulation". Treating it as pending is correct, but the message wrongly implies an unknown/empty phase. Consider an explicit cdi.PendingPopulation case with an accurate reason.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/controller/plan/migration.go` around lines 2046 - 2058, The default case
in the phase switch statement is incorrectly reporting cdi.PendingPopulation as
an unknown or empty phase. Add an explicit case for cdi.PendingPopulation before
the default case that increments pending, sets task.Phase to api.StepPending,
and sets task.Reason to an accurate message indicating that the DataVolume is in
PendingPopulation phase (e.g., "DataVolume is in PendingPopulation phase"). Keep
the existing default case for handling truly unknown or empty phases with the
current misleading message only for those cases.

Verify PVC binding and importer pod state before marking disk transfer
complete. Do not promote PendingPopulation to Succeeded on warm
migrations. Restore separate Paused checkpoint handling.

Ref: https://redhat.atlassian.net/browse/MTV-4150
Resolves: MTV-4150
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Gwen Casey <gcasey@redhat.com>
@gwencasey96 gwencasey96 force-pushed the disk-transfer-completed-importer-pod-pendingr branch from 060e621 to 8cb1f1c Compare June 23, 2026 05:01
@sonarqubecloud

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants