feat: support v2 cloning (full-copy and linked-clone mode) by PhanLe1010 · Pull Request #391 · longhorn/longhorn-spdk-engine

PhanLe1010 · 2025-08-13T00:52:44Z

longhorn/longhorn#7794

mergify · 2025-08-17T15:15:55Z

This pull request is now in conflict. Could you fix it @PhanLe1010? 🙏

derekbit

I will continue reviewing the PR tomorrow.

derekbit · 2025-08-26T08:47:18Z

cc @davidcheng0922 @c3y1huang @shuo-wu

derekbit · 2025-08-26T13:15:17Z

flowchart TD
  Start(["Start: Engine.SnapshotClone"])
  A["Engine: select dst replica (must be 1 RW)"]
  B["Engine: find src replica candidates (RW replicas)"]
  C{"cloneMode == linked-clone?"}
  D1["Check if there is a src candidate with the same IP/LvsUUID as dst"]
  D2["If not found and cloneMode==linked -> return error"]
  E_link_req["Engine -> dst: ReplicaSnapshotCloneDstStart(... , linked)"]
  F_link_dst_calls_src["dst -> src: ReplicaSnapshotCloneSrcStart(..., linked)"]
  G_link_set_parent["src: set dst parent -> point to src snapshot"]
  H_link_finish["dst: SnapshotCloneDstFinish (linked) => Done"]

  E_full_req["Engine -> dst: ReplicaSnapshotCloneDstStart(... , full-copy)"]
  I_create_cloning_lvol["dst: create cloning lvol & expose (allocate port)"]
  J_dst_call_src["dst -> src: ReplicaSnapshotCloneSrcStart(..., full-copy)"]
  K_src_start_deepcopy["src: start deep-copy, return op/status"]
  L_dst_set_inprogress["dst: set status InProgress, start monitor goroutine"]
  M_monitor["monitor goroutine: periodically call src.ReplicaSnapshotCloneSrcStatusCheck"]
  N{"status == COMPLETE ?"}
  O_finish_src_try["monitor: best-effort call ReplicaSnapshotCloneSrcFinish"]
  P_dst_finalize["dst: SnapshotCloneDstFinish -> create tmp snapshot, set parent, cleanup resources"]
  Q_Done(["Done"])
  R_Error(["Error branch -> set status Error -> cleanup & return"])

  Start --> A --> B --> C
  C -- yes --> D1
  D1 -- found --> E_link_req --> F_link_dst_calls_src --> G_link_set_parent --> H_link_finish --> Q_Done
  D1 -- not found --> D2 --> R_Error

  C -- no (full-copy) --> E_full_req --> I_create_cloning_lvol --> J_dst_call_src --> K_src_start_deepcopy --> L_dst_set_inprogress --> M_monitor
  M_monitor --> N
  N -- no --> M_monitor
  N -- yes --> O_finish_src_try --> P_dst_finalize --> Q_Done

  %% error flows
  K_src_start_deepcopy -- fail --> R_Error
  J_dst_call_src -- fail --> R_Error
  I_create_cloning_lvol -- fail --> R_Error
  M_monitor -- timeout/error --> R_Error
  O_finish_src_try -- fail (best-effort) --> P_dst_finalize

Engine:
- Engine.SnapshotClone(snapshotName, srcEngineName, srcEngineAddress, cloneMode):
  - Lock engine; require destination engine to have exactly 1 RW replica (current implementation).
  - Choose dst RW replica; fetch src engine replica list and make candidate map (IP, LvsUUID, address).
  - Prefer src candidate that matches dst IP & LvsUUID for linked-clone.
  - If linked requested but no same-pool candidate found, error.
  - Invoke dstReplica.ReplicaSnapshotCloneDstStart(..., cloneMode).
Destination replica flow (dst)
1. Replica.SnapshotCloneDstStart(...) called (via Engine -> dst RPC).
2. Validate params and clear any previous cloning state.
3. linked-clone:
   - Validate src/dst same IP/LvsUUID constraints.
   - Call src.ReplicaSnapshotCloneSrcStart(..., linked).
   - Immediately mark dst clone complete and run SnapshotCloneDstFinish for parent setup.
4. full-copy:
   - Allocate a port and create a cloning LVOL on dst, expose it (NVMe target).
   - Call src.ReplicaSnapshotCloneSrcStart(dstCloningLvolAddress, full-copy).
   - Set dst state to in progress; start a monitor goroutine to poll src status.

monitorSnapshotClone:
- Periodically call src.ReplicaSnapshotCloneSrcStatusCheck.
- Update processed/total clusters and progress.
- On COMPLETE or ERROR (or timeout), best-effort call src.ReplicaSnapshotCloneSrcFinish, then call dst.SnapshotCloneDstFinish to finalize (create tmp snapshot, set head parent, cleanup resources).

Source replica flow (src)
- Replica.SnapshotCloneSrcStart(...):
  - linked-clone: call BdevLvolSetParent for the dst lvol to point to the src snapshot (fast).
  - full-copy: start deep-copy operation on src (SPDK deep-copy op), store op id/status in snapshotCloningSrcCache.
- Replica.SnapshotCloneSrcStatusCheck: return deep-copy progress (processed/total clusters, state, error).
- Replica.SnapshotCloneSrcFinish: finish/cleanup deep-copy on src.

derekbit

One question
If a snapshot volume is cloning, can user delete the snapshot or the replica?

PhanLe1010 · 2025-08-26T19:46:35Z

One question
If a snapshot volume is cloning, can user delete the snapshot or the replica?

For linked-clone, if the src snapshot disappears before bdev_lvol_set_parent is called, the SPDK set-parent call will fail and the clone is marked as failed.

For full-copy, it uses the bdev_lvol_start_deep_copy API. I think it is not possible to delete the src lvol while the SPDK deep-copy API has it open. WDYT?

derekbit · 2025-08-31T09:50:55Z

+				return fmt.Errorf("there are already another linked-clone lvol %v in src replica %v. "+
+					"Each src replica can only has 1 linked-clone lvol at a time", childLvolName, r.Name)
+			}
+		}


Why do we need the restriction?

A linked-clone volume is supposed to be a short-lived volume that a backup solution reads data from and deletes once the backup completes. Therefore:

Each src volume does not need to have multiple backups at the same time -> 1 linked-clone is enough

Multiple linked-clone volumes are problematic when the user wants to delete the source volume or a snapshot of the source volume, because SPDK cannot delete a snapshot with multiple children (the multiple linked-clone volumes)

c3y1huang · 2025-09-01T04:45:53Z

+		return nil, grpcstatus.Errorf(grpccodes.NotFound, "cannot find replica %s during ReplicaSnapshotCloneDstStart", req.Name)
+	}
+
+	if err := r.SnapshotCloneDstStart(spdkClient, req.SnapshotName, req.SrcReplicaName, req.SrcReplicaAddress, req.CloneMode); err != nil {


Should we pass ctx to downstream calls? Same question for other newly introduced methods.

I think we should. None of the current gRPC methods pass ctx down. Should we create a ticket to refactor all of them?

Yes, could you help open an issue to track this?

It seems that other gRPC servers (mainly in longhorn-instance-manager) do not handle the input ctx either. Besides, there is one more context s.ctx.

c3y1huang · 2025-09-01T05:14:42Z

+	}
+	// Create cloning lvol and expose it
+	cloningLvolName := GetReplicaCloningLvolName(r.Name)
+	if _, err = spdkClient.BdevLvolCreate("", r.LvsUUID, cloningLvolName, util.BytesToMiB(r.SpecSize),


Will the resources and allocated port get cleaned up if any of the subsequent downstream calls fail?

The replica will be marked as failed, then stopped. The resources will be cleaned up as part of the stopping logic.

c3y1huang · 2025-09-01T05:41:12Z

+				existingParentOfDstReplica = lvolName
+				continue


Why not break if existingParentOfDstReplica is found?

Because we want to verify whether there is already another linked-clone lvol in the src replica:

if !IsReplicaLvol(r.Name, childLvolName) {
	return fmt.Errorf("there are already another linked-clone lvol %v in src replica %v. "+
		"Each src replica can only has 1 linked-clone lvol at a time", childLvolName, r.Name)
}
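
The reason for scanning the whole list can be illustrated with a simplified sketch. It is not the loop from the PR: `isReplicaLvol` here is a stand-in that treats a name prefix as ownership, and `validateChildren` with its parameters is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// isReplicaLvol is a simplified stand-in for the PR's IsReplicaLvol helper:
// an lvol is assumed to belong to a replica if its name starts with "<replica>-".
func isReplicaLvol(replicaName, lvolName string) bool {
	return strings.HasPrefix(lvolName, replicaName+"-")
}

// validateChildren notes whether dstLvol is already among the children, then keeps
// scanning so that a foreign linked-clone child later in the list is still rejected.
func validateChildren(replicaName, dstLvol string, children []string) (alreadyLinked bool, err error) {
	for _, child := range children {
		if child == dstLvol {
			alreadyLinked = true
			continue // a break here would skip validating the remaining children
		}
		if !isReplicaLvol(replicaName, child) {
			return false, fmt.Errorf("another linked-clone lvol %v already exists in src replica %v", child, replicaName)
		}
	}
	return alreadyLinked, nil
}

func main() {
	children := []string{"r1-snap-a", "dst-cloning", "other-cloning"}
	linked, err := validateChildren("r1", "dst-cloning", children)
	fmt.Println(linked, err)
}
```
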

derekbit

LGTM!

c3y1huang

lgtm. Remaining TODOs:

Update go.mod
Clean up commit history
Resolve conflicts

PhanLe1010 · 2025-09-09T19:45:33Z

@derekbit @c3y1huang

All done:

Commit cleanup
go.mod cleanup
Conflict resolved

longhorn-7794 Signed-off-by: Phan Le <phan.le@suse.com>

codecov · 2025-09-09T20:40:34Z

Codecov Report

❌ Patch coverage is 0% with 880 lines in your changes missing coverage. Please review.
✅ Project coverage is 0.70%. Comparing base (4d9b75a) to head (c599a52).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
pkg/spdk/replica.go	0.00%	493 Missing ⚠️
pkg/client/client.go	0.00%	142 Missing ⚠️
pkg/spdk/server.go	0.00%	116 Missing ⚠️
pkg/spdk/engine.go	0.00%	86 Missing ⚠️
pkg/api/types.go	0.00%	25 Missing ⚠️
pkg/util/util.go	0.00%	10 Missing ⚠️
pkg/spdk/types.go	0.00%	8 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff            @@
##            main    #391      +/-   ##
========================================
- Coverage   0.77%   0.70%   -0.07%     
========================================
  Files         24      24              
  Lines       9866   10737     +871     
========================================
  Hits          76      76              
- Misses      9783   10654     +871     
  Partials       7       7

Flag	Coverage Δ
unittests	`0.70% <0.00%> (-0.07%)`	⬇️

shuo-wu

BTW, have we created a ticket that adds checks for volume naming?

shuo-wu · 2025-09-10T00:00:29Z

+		return nil, grpcstatus.Errorf(grpccodes.NotFound, "cannot find replica %s during ReplicaSnapshotCloneDstStart", req.Name)
+	}
+
+	if err := r.SnapshotCloneDstStart(spdkClient, req.SnapshotName, req.SrcReplicaName, req.SrcReplicaAddress, req.CloneMode); err != nil {


It seems that other gRPC servers (mainly in longhorn-instance-manager) do not handle the input ctx either. Besides, there is one more context s.ctx.

shuo-wu · 2025-09-10T01:07:58Z

+	if err := doCleanupForSnapshotCloneSrc(spdkClient, c); err != nil {
+		return err
+	}


What if the clone in SPDK is not done yet? Should we print out any warning or error logs now? Will we have an interrupt function in the future?

I expect there will be error logs when calling disconnectNVMfBdev

shuo-wu · 2025-09-10T06:18:44Z

+			if errClose := srcReplicaCli.Close(); errClose != nil {
+				r.log.WithError(errClose).Errorf("Failed to close src client for %s after status check", srcReplicaName)
+			}


Can we retain srcReplicaCli before closing this for loop? Frequently creating and closing the connection is not a good implementation...

I think we should recreate it. If there is a networking error, retrying with the old client is useless.

The TCP TIME_WAIT delay is the period, typically 240 seconds (4 minutes), a TCP connection stays in the TIME_WAIT state after being closed by the client to ensure the reliable termination of the connection by allowing the receiving end to send final acknowledgements.

Based on this info (provided by AI), by default, we will have 80 TIME_WAIT TCP connections after a full clone start. Is that good?
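
One way to reconcile the two positions in this thread is to keep a single client across polls and recreate it only after a failed call. The sketch below is an illustration under that assumption, not code from the PR; `statusClient`, `reusableClient`, and the `dial` callback are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// statusClient stands in for the src replica client used by the status poll.
type statusClient interface {
	Check() error
	Close() error
}

// reusableClient keeps one client across polls and dials again only after a failed call.
type reusableClient struct {
	dial  func() (statusClient, error)
	cli   statusClient
	Dials int // number of connections created so far
}

// Check dials lazily, and drops the client on error so the next poll starts fresh.
func (r *reusableClient) Check() error {
	if r.cli == nil {
		c, err := r.dial()
		if err != nil {
			return err
		}
		r.cli = c
		r.Dials++
	}
	if err := r.cli.Check(); err != nil {
		_ = r.cli.Close() // the connection may be broken; discard it
		r.cli = nil
		return err
	}
	return nil
}

// Close releases the current client, if any.
func (r *reusableClient) Close() error {
	if r.cli == nil {
		return nil
	}
	err := r.cli.Close()
	r.cli = nil
	return err
}

// fakeClient fails on the third call counted across all connections.
type fakeClient struct{ calls *int }

func (f fakeClient) Check() error {
	*f.calls++
	if *f.calls == 3 {
		return errors.New("network error")
	}
	return nil
}

func (f fakeClient) Close() error { return nil }

func main() {
	calls := 0
	rc := &reusableClient{dial: func() (statusClient, error) {
		return fakeClient{calls: &calls}, nil
	}}
	defer rc.Close()
	for i := 0; i < 5; i++ {
		if err := rc.Check(); err != nil {
			fmt.Println("poll failed:", err)
		}
	}
	fmt.Println("connections created:", rc.Dials)
}
```

With this shape, a healthy clone creates one connection for the whole monitor loop, while a networking error still forces a new connection on the next poll.
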

PhanLe1010 · 2025-09-11T04:37:30Z

BTW, have we created a ticket that adds checks for volume naming?

Created longhorn/longhorn#11739

PhanLe1010 force-pushed the 7794-cloning branch 2 times, most recently from fcd5cd2 to f0de4ed Compare August 22, 2025 05:20

PhanLe1010 marked this pull request as ready for review August 22, 2025 05:21

PhanLe1010 requested review from c3y1huang, derekbit, innobead and shuo-wu August 22, 2025 05:21

PhanLe1010 changed the title ~~feat: support v2 cloning~~ feat: support v2 cloning (full-copy and linked-clone mode) Aug 22, 2025

PhanLe1010 force-pushed the 7794-cloning branch 2 times, most recently from 09520b9 to aa1f7e8 Compare August 26, 2025 05:24

derekbit reviewed Aug 26, 2025

derekbit requested a review from davidcheng0922 August 26, 2025 08:46

derekbit reviewed Aug 26, 2025

shuo-wu reviewed Aug 27, 2025

PhanLe1010 force-pushed the 7794-cloning branch from 9413f9e to f680640 Compare August 28, 2025 00:44

derekbit reviewed Aug 31, 2025

c3y1huang reviewed Sep 1, 2025

PhanLe1010 requested review from c3y1huang, derekbit and shuo-wu September 5, 2025 01:30

derekbit previously approved these changes Sep 9, 2025

c3y1huang reviewed Sep 9, 2025

PhanLe1010 dismissed derekbit’s stale review via c513051 September 9, 2025 19:42

PhanLe1010 force-pushed the 7794-cloning branch from 35f6be8 to c513051 Compare September 9, 2025 19:42

feat: support v2 cloning

c599a52

longhorn-7794 Signed-off-by: Phan Le <phan.le@suse.com>

PhanLe1010 force-pushed the 7794-cloning branch from c513051 to c599a52 Compare September 9, 2025 20:01

PhanLe1010 mentioned this pull request Sep 9, 2025

feat: support v2 cloning (full-copy and linked-clone mode) longhorn/longhorn-instance-manager#1047

Merged

derekbit approved these changes Sep 10, 2025

derekbit merged commit d94eb93 into longhorn:main Sep 10, 2025
5 of 9 checks passed

shuo-wu reviewed Sep 10, 2025

PhanLe1010 mentioned this pull request Sep 11, 2025

[FEATURE] V2 Volume Supports Cloning longhorn/longhorn#7794

Closed
