feat: support v2 cloning (full-copy and linked-clone mode)#391
Conversation
|
This pull request is now in conflict. Could you fix it @PhanLe1010? 🙏 |
fcd5cd2 to
f0de4ed
Compare
09520b9 to
aa1f7e8
Compare
derekbit
left a comment
There was a problem hiding this comment.
I will continue reviewing the PR tomorrow.
flowchart TD
Start(["Start: Engine.SnapshotClone"])
A["Engine: select dst replica (must be 1 RW)"]
B["Engine: find src replica candidates (RW replicas)"]
C{"cloneMode == linked-clone?"}
D1["Check if there is a src candidate with the same IP/LvsUUID as dst"]
D2["If not found and cloneMode==linked -> return error"]
E_link_req["Engine -> dst: ReplicaSnapshotCloneDstStart(... , linked)"]
F_link_dst_calls_src["dst -> src: ReplicaSnapshotCloneSrcStart(..., linked)"]
G_link_set_parent["src: set dst parent -> point to src snapshot"]
H_link_finish["dst: SnapshotCloneDstFinish (linked) => Done"]
E_full_req["Engine -> dst: ReplicaSnapshotCloneDstStart(... , full-copy)"]
I_create_cloning_lvol["dst: create cloning lvol & expose (allocate port)"]
J_dst_call_src["dst -> src: ReplicaSnapshotCloneSrcStart(..., full-copy)"]
K_src_start_deepcopy["src: start deep-copy, return op/status"]
L_dst_set_inprogress["dst: set status InProgress, start monitor goroutine"]
M_monitor["monitor goroutine: periodically call src.ReplicaSnapshotCloneSrcStatusCheck"]
N{"status == COMPLETE ?"}
O_finish_src_try["monitor: best-effort call ReplicaSnapshotCloneSrcFinish"]
P_dst_finalize["dst: SnapshotCloneDstFinish -> create tmp snapshot, set parent, cleanup resources"]
Q_Done(["Done"])
R_Error(["Error branch -> set status Error -> cleanup & return"])
Start --> A --> B --> C
C -- yes --> D1
D1 -- found --> E_link_req --> F_link_dst_calls_src --> G_link_set_parent --> H_link_finish --> Q_Done
D1 -- not found --> D2 --> R_Error
C -- no (full-copy) --> E_full_req --> I_create_cloning_lvol --> J_dst_call_src --> K_src_start_deepcopy --> L_dst_set_inprogress --> M_monitor
M_monitor --> N
N -- no --> M_monitor
N -- yes --> O_finish_src_try --> P_dst_finalize --> Q_Done
%% error flows
K_src_start_deepcopy -- fail --> R_Error
J_dst_call_src -- fail --> R_Error
I_create_cloning_lvol -- fail --> R_Error
M_monitor -- timeout/error --> R_Error
O_finish_src_try -- fail (best-effort) --> P_dst_finalize
|
derekbit
left a comment
There was a problem hiding this comment.
One question
If a snapshot volume is cloning, can user delete the snapshot or the replica?
For linked-clone, If the src snapshot disappear before calling For full-clone, it uses |
9413f9e to
f680640
Compare
| return fmt.Errorf("there are already another linked-clone lvol %v in src replica %v. "+ | ||
| "Each src replica can only has 1 linked-clone lvol at a time", childLvolName, r.Name) | ||
| } | ||
| } |
There was a problem hiding this comment.
Why do we need the restriction?
There was a problem hiding this comment.
Linked-clone volume is supposed to be a short-live volume for backup solution to read data out and delete it once backup complete. Therefore:
- Each src volume does not need to have multiple backups at the same time -> 1 linked-clone is enough
- Multiple linked-clone volumes is problematic when user want to delete the source volume or delete snapshot of source volume because SPDK cannot delete snapshot will multiple children (the multiple linked-clone volumes)
| return nil, grpcstatus.Errorf(grpccodes.NotFound, "cannot find replica %s during ReplicaSnapshotCloneDstStart", req.Name) | ||
| } | ||
|
|
||
| if err := r.SnapshotCloneDstStart(spdkClient, req.SnapshotName, req.SrcReplicaName, req.SrcReplicaAddress, req.CloneMode); err != nil { |
There was a problem hiding this comment.
Should we pass ctx to downstream calls? Same question for other newly introduced methods.
There was a problem hiding this comment.
I think we should. All current grpc methods do not pass ctx down. Should we create ticket to refactor all of them?
There was a problem hiding this comment.
Yes, could you help open an issue to track this?
There was a problem hiding this comment.
It seems that other gRPC servers (mainly in longhorn-instance-manager) do not handle the input ctx either. Besides, there is one more context s.ctx.
| } | ||
| // Create cloning lvol and expose it | ||
| cloningLvolName := GetReplicaCloningLvolName(r.Name) | ||
| if _, err = spdkClient.BdevLvolCreate("", r.LvsUUID, cloningLvolName, util.BytesToMiB(r.SpecSize), |
There was a problem hiding this comment.
Will the resources and allocated port get cleaned up if any of the subsequent downstream calls fail?
There was a problem hiding this comment.
Replica will be marked as failed, then stopped. The resource will be cleanup as part of stopping logic
| existingParentOfDstReplica = lvolName | ||
| continue |
There was a problem hiding this comment.
Why not break if existingParentOfDstReplica is found?
There was a problem hiding this comment.
Because we want to verify if there there are already another linked-clone lvol in src replica:
if !IsReplicaLvol(r.Name, childLvolName) {
return fmt.Errorf("there are already another linked-clone lvol %v in src replica %v. "+
"Each src replica can only has 1 linked-clone lvol at a time", childLvolName, r.Name)
}
c3y1huang
left a comment
There was a problem hiding this comment.
lgtm. Remaining TODOs:
- Update go.mod
- Clean up commit history
- Resolve conflicts
35f6be8 to
c513051
Compare
|
All done:
|
longhorn-7794 Signed-off-by: Phan Le <phan.le@suse.com>
c513051 to
c599a52
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #391 +/- ##
========================================
- Coverage 0.77% 0.70% -0.07%
========================================
Files 24 24
Lines 9866 10737 +871
========================================
Hits 76 76
- Misses 9783 10654 +871
Partials 7 7
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
shuo-wu
left a comment
There was a problem hiding this comment.
BTW, have we created a ticket that adds checks for volume naming?
| return nil, grpcstatus.Errorf(grpccodes.NotFound, "cannot find replica %s during ReplicaSnapshotCloneDstStart", req.Name) | ||
| } | ||
|
|
||
| if err := r.SnapshotCloneDstStart(spdkClient, req.SnapshotName, req.SrcReplicaName, req.SrcReplicaAddress, req.CloneMode); err != nil { |
There was a problem hiding this comment.
It seems that other gRPC servers (mainly in longhorn-instance-manager) do not handle the input ctx either. Besides, there is one more context s.ctx.
| if err := doCleanupForSnapshotCloneSrc(spdkClient, c); err != nil { | ||
| return err | ||
| } |
There was a problem hiding this comment.
What if the clone in SPDK is not done yet? Should we print out any warning or error logs now? Will we have an interrupt function in the future?
There was a problem hiding this comment.
I expect there there will be error logs when calling disconnectNVMfBdev
| if errClose := srcReplicaCli.Close(); errClose != nil { | ||
| r.log.WithError(errClose).Errorf("Failed to close src client for %s after status check", srcReplicaName) | ||
| } |
There was a problem hiding this comment.
Can we retain srcReplicaCli before closing this for loop? Frequently creating and closing the connection is not a good implementation...
There was a problem hiding this comment.
I think we should recreat. If there is networking error, retry using the old client is useless
There was a problem hiding this comment.
The TCP TIME_WAIT delay is the period, typically 240 seconds (4 minutes), a TCP connection stays in the TIME_WAIT state after being closed by the client to ensure the reliable termination of the connection by allowing the receiving end to send final acknowledgements.
Based on this info (provided by AI), by default, we will have 80 TIME_WAIT TCP connections after a full clone start. Is that good?
Created longhorn/longhorn#11739 |
longhorn/longhorn#7794