
refactor: v2 linked clone (backport #500) #536

Merged
derekbit merged 14 commits into v1.12.x from mergify/bp/v1.12.x/pr-500
May 19, 2026


@mergify mergify Bot commented May 19, 2026

Which issue(s) this PR fixes:

Issue longhorn/longhorn#12552

What this PR does / why we need it:

Special notes for your reviewer:

This PR introduces a special kind of empty snapshot lvol, called an entrypoint lvol, on source replicas. Every clone replica then uses an entrypoint lvol as its ActiveChain[0], much like a backing image. With that in place, clone replicas are both readable and writable, like normal replicas.

Additional documentation or context

Note: Claude Opus 4.6 generated the code based on the following prompt/plan. The plan is included for reference only, because the code went through several rounds of improvement afterwards.

First, refactor the existing linked-clone feature for repo `longhorn-spdk-engine` based on the descriptions below:
- The source replica:
  1. prepares an entrypoint lvol (which is actually an empty snapshot lvol) from the requested snapshot.
  2. records all entrypoints and all clone replicas pointing to them. Even a source replica detected from an existing disk after a restart should recover this information.
  3. avoids or ignores all entrypoints when checking or constructing its snapshot map.
  4. automatically cleans up an entrypoint when no clone replica points to it.
  5. forbids replica deletion when cleanup is true while clone replicas still point to it.
  6. forbids snapshot deletion while a valid entrypoint still points to that snapshot.
  7. has its sync func verify (and probably repair) all entrypoint lvols for the source replica, and the linked-clone source information for clone replicas.
- The clone replica itself:
  1. grows a new head lvol from the entrypoint lvol.
  2. records its entrypoint and the source replica. Even a replica detected from an existing disk after a restart should recover this information.
  3. has its sync func verify the linked-clone source information. If the clone replica's root parent somehow becomes `the source snapshot directly`, the sync func should repair it.

This is an automatic backport of pull request #500 done by [Mergify](https://mergify.com).

shuo-wu and others added 14 commits May 19, 2026 09:25
Introduce naming conventions for clone entrypoint lvols. Entrypoints
sit between a source replica's snapshot and clone replica heads.

Add constants ReplicaCloneEntrypointLvolInfix and
ReplicaCloneEntrypointTmpHeadSuffix, plus helper functions for
constructing, parsing, and identifying entrypoint lvol names.

Longhorn 12552

Signed-off-by: Shuo Wu <shuo.wu@suse.com>
(cherry picked from commit 0a4b1a5)
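
The naming helpers described in the commit above could look roughly like the sketch below. The two constant names come from the commit message, but their values here (`-clone-ep-`, `-tmp-head`), the name layout `<srcReplica><infix><snapshot>`, and all function names are assumptions made for illustration; the actual code in `longhorn-spdk-engine` may differ.

```go
package main

import "strings"

// Hypothetical values: the commit names these constants, but their real
// values are not shown on this page.
const (
	ReplicaCloneEntrypointLvolInfix     = "-clone-ep-"
	ReplicaCloneEntrypointTmpHeadSuffix = "-tmp-head"
)

// cloneEntrypointLvolName builds "<srcReplica><infix><snapshot>" (assumed layout).
func cloneEntrypointLvolName(srcReplicaName, snapshotName string) string {
	return srcReplicaName + ReplicaCloneEntrypointLvolInfix + snapshotName
}

// cloneEntrypointTmpHeadName names the temporary head used while an
// entrypoint is being created.
func cloneEntrypointTmpHeadName(entrypointLvolName string) string {
	return entrypointLvolName + ReplicaCloneEntrypointTmpHeadSuffix
}

// isCloneEntrypointTmpHead reports whether the name is an entrypoint tmp head.
func isCloneEntrypointTmpHead(lvolName string) bool {
	return strings.Contains(lvolName, ReplicaCloneEntrypointLvolInfix) &&
		strings.HasSuffix(lvolName, ReplicaCloneEntrypointTmpHeadSuffix)
}

// isCloneEntrypointLvol reports whether the name is an entrypoint lvol
// (and not its tmp head).
func isCloneEntrypointLvol(lvolName string) bool {
	return strings.Contains(lvolName, ReplicaCloneEntrypointLvolInfix) &&
		!strings.HasSuffix(lvolName, ReplicaCloneEntrypointTmpHeadSuffix)
}

// parseCloneEntrypointLvolName splits an entrypoint name back into the
// source replica name and the snapshot name.
func parseCloneEntrypointLvolName(lvolName string) (srcReplicaName, snapshotName string, ok bool) {
	if !isCloneEntrypointLvol(lvolName) {
		return "", "", false
	}
	srcReplicaName, snapshotName, _ = strings.Cut(lvolName, ReplicaCloneEntrypointLvolInfix)
	if srcReplicaName == "" || snapshotName == "" {
		return "", "", false
	}
	return srcReplicaName, snapshotName, true
}
```

Keeping construction and parsing next to each other makes it easy to check that a name round-trips, which matters because recovery after restart relies on names alone.
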
Add CloneEntrypointInfo struct to track entrypoint lvol metadata and
its associated clone replicas. Extend the Replica struct with fields
for both source replicas (cloneEntrypointMap) and clone replicas
(isCloneReplica, cloneSourceReplicaName, cloneEntrypointLvolName).

Longhorn 12552

Signed-off-by: Shuo Wu <shuo.wu@suse.com>
(cherry picked from commit eb3a749)
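
As a rough picture of the bookkeeping added in the commit above, the sketch below shows the two sides of the relationship. The struct name `CloneEntrypointInfo` and the four `Replica` field names come from the commit message; the fields inside `CloneEntrypointInfo` and the helper methods `registerClone` and `markAsClone` are hypothetical.

```go
package main

// CloneEntrypointInfo tracks one entrypoint lvol on a source replica.
// The field layout is an assumption; only the struct name is from the commit.
type CloneEntrypointInfo struct {
	SnapshotLvolName string              // snapshot the entrypoint was created from
	CloneReplicas    map[string]struct{} // clone replicas rooted at this entrypoint
}

// Replica shows only the clone-related fields named in the commit message.
type Replica struct {
	Name string

	// Source-replica side: entrypoint lvol name -> entrypoint info.
	cloneEntrypointMap map[string]*CloneEntrypointInfo

	// Clone-replica side.
	isCloneReplica          bool
	cloneSourceReplicaName  string
	cloneEntrypointLvolName string
}

// registerClone (hypothetical helper) records on the source replica that
// cloneName hangs off the given entrypoint.
func (r *Replica) registerClone(entrypoint, snapshot, cloneName string) {
	if r.cloneEntrypointMap == nil {
		r.cloneEntrypointMap = map[string]*CloneEntrypointInfo{}
	}
	info, ok := r.cloneEntrypointMap[entrypoint]
	if !ok {
		info = &CloneEntrypointInfo{
			SnapshotLvolName: snapshot,
			CloneReplicas:    map[string]struct{}{},
		}
		r.cloneEntrypointMap[entrypoint] = info
	}
	info.CloneReplicas[cloneName] = struct{}{}
}

// markAsClone (hypothetical helper) records on the clone replica where it
// came from.
func (r *Replica) markAsClone(srcReplicaName, entrypoint string) {
	r.isCloneReplica = true
	r.cloneSourceReplicaName = srcReplicaName
	r.cloneEntrypointLvolName = entrypoint
}
```
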
Filter out clone entrypoint lvols and their tmp heads from the snapshot
lvol map construction so they do not interfere with the replica's
snapshot chain. Update constructActiveChainFromSnapshotLvolMap comments
to reflect that clone entrypoints are handled externally.

Longhorn 12552

Signed-off-by: Shuo Wu <shuo.wu@suse.com>
(cherry picked from commit cd0bf95)
…struct

Add recoverCloneEntrypointInfo to scan the bdev lvol map for entrypoint
lvols belonging to this replica, populating cloneEntrypointMap from
SPDK state. Resolve clone replica names via
GetCloneReplicaNameFromEntrypointChildLvol.

Add recoverCloneReplicaInfo to detect linked clones by checking
whether the chain root has a clone entrypoint as parent. Both are
called during construct() for recovery after restart.

Longhorn 12552

Signed-off-by: Shuo Wu <shuo.wu@suse.com>
(cherry picked from commit 1685b50)
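
The recovery described in the commit above rebuilds in-memory state purely from what SPDK reports. A minimal sketch of that idea follows, assuming a simplified `Lvol` type, a hypothetical infix `-clone-ep-`, and child lvol names of the form `<replica>-head` or `<replica>-snap-<name>`. The `...Sketch` functions are stand-ins named after `recoverCloneEntrypointInfo` and `recoverCloneReplicaInfo`, and `cloneReplicaNameFromChild` stands in for `GetCloneReplicaNameFromEntrypointChildLvol`; none of the bodies are taken from the PR.

```go
package main

import (
	"sort"
	"strings"
)

const entrypointInfix = "-clone-ep-" // hypothetical value

// Lvol is a simplified view of one entry in the bdev lvol map.
type Lvol struct {
	Name     string
	Parent   string   // parent lvol name, "" if none
	Children []string // child lvol names
}

// cloneReplicaNameFromChild derives a replica name from a child lvol name,
// assuming "<replica>-snap-<name>" or "<replica>-head" naming.
func cloneReplicaNameFromChild(childLvolName string) string {
	if i := strings.Index(childLvolName, "-snap-"); i > 0 {
		return childLvolName[:i]
	}
	return strings.TrimSuffix(childLvolName, "-head")
}

// recoverCloneEntrypointInfoSketch scans the lvol map for entrypoints that
// belong to replicaName and lists the clone replicas attached to each.
func recoverCloneEntrypointInfoSketch(replicaName string, bdevLvolMap map[string]*Lvol) map[string][]string {
	result := map[string][]string{}
	for name, lvol := range bdevLvolMap {
		if !strings.HasPrefix(name, replicaName+entrypointInfix) {
			continue
		}
		clones := []string{}
		for _, child := range lvol.Children {
			clones = append(clones, cloneReplicaNameFromChild(child))
		}
		sort.Strings(clones)
		result[name] = clones
	}
	return result
}

// recoverCloneReplicaInfoSketch reports whether the chain root's parent is a
// clone entrypoint and, if so, which source replica owns it.
func recoverCloneReplicaInfoSketch(chainRoot *Lvol) (string, string, bool) {
	src, _, found := strings.Cut(chainRoot.Parent, entrypointInfix)
	if !found || src == "" {
		return "", "", false
	}
	return chainRoot.Parent, src, true
}
```
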
Rewrite snapshotLinkedCloneSrcStart to create clone entrypoint lvols
between the source snapshot and clone heads, enabling multiple linked
clones per snapshot with clean lifecycle management.

Add createCloneEntrypoint (3-step: clone, snapshot, delete leftover).
Record clone source info in SnapshotCloneDstStart for linked clones.

Longhorn 12552

Signed-off-by: Shuo Wu <shuo.wu@suse.com>
(cherry picked from commit f922713)
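
The three steps named for `createCloneEntrypoint` (clone, snapshot, delete leftover) can be walked through against a tiny in-memory model. The sketch below is not the PR's implementation: `fakeStore` and its methods are hypothetical stand-ins for the SPDK lvol operations, the `-tmp-head` suffix is an assumed value, and the model only captures parent links and the read-only flag.

```go
package main

import "fmt"

type lvol struct {
	name     string
	parent   string
	readOnly bool // true for snapshots
}

// fakeStore is a hypothetical in-memory stand-in for an SPDK lvstore.
type fakeStore struct {
	lvols map[string]*lvol
}

// clone creates a writable lvol on top of an existing snapshot.
func (s *fakeStore) clone(snapshotName, cloneName string) error {
	snap, ok := s.lvols[snapshotName]
	if !ok || !snap.readOnly {
		return fmt.Errorf("%s is not an existing snapshot", snapshotName)
	}
	s.lvols[cloneName] = &lvol{name: cloneName, parent: snapshotName}
	return nil
}

// snapshot inserts a new read-only lvol between lvolName and its old parent.
func (s *fakeStore) snapshot(lvolName, snapshotName string) error {
	l, ok := s.lvols[lvolName]
	if !ok {
		return fmt.Errorf("lvol %s not found", lvolName)
	}
	s.lvols[snapshotName] = &lvol{name: snapshotName, parent: l.parent, readOnly: true}
	l.parent = snapshotName
	return nil
}

// remove deletes an lvol that has no children.
func (s *fakeStore) remove(name string) error {
	for _, l := range s.lvols {
		if l.parent == name {
			return fmt.Errorf("lvol %s still has child %s", name, l.name)
		}
	}
	delete(s.lvols, name)
	return nil
}

// createCloneEntrypointSketch leaves an empty, read-only entrypoint whose
// parent is the source snapshot.
func createCloneEntrypointSketch(s *fakeStore, snapshotName, entrypointName string) error {
	tmpHead := entrypointName + "-tmp-head" // assumed suffix
	if err := s.clone(snapshotName, tmpHead); err != nil { // step 1: clone
		return err
	}
	if err := s.snapshot(tmpHead, entrypointName); err != nil { // step 2: snapshot
		return err
	}
	return s.remove(tmpHead) // step 3: delete the leftover tmp head
}
```

If step 3 fails or the process dies between steps 2 and 3, the tmp head is left behind, which is presumably why a later commit has the sync path remove orphaned tmp-head lvols.
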
Forbid replica deletion (with cleanup) while any clone entrypoint
still has active clone replicas pointing to it. The guard runs early
in Delete() before any SPDK cleanup to avoid side effects on a
rejected delete.

Forbid snapshot deletion while a valid clone entrypoint with active
clone replicas is attached to that snapshot.

Longhorn 12552

Signed-off-by: Shuo Wu <shuo.wu@suse.com>
(cherry picked from commit 723ff6f)
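
The two guards in the commit above amount to checks against the entrypoint map before anything is touched in SPDK. A minimal sketch, with hypothetical type, field, and method names, might look like this:

```go
package main

import "fmt"

// cloneEntrypointInfo and replica are simplified, hypothetical types.
type cloneEntrypointInfo struct {
	snapshotName  string
	cloneReplicas map[string]struct{}
}

type replica struct {
	name               string
	cloneEntrypointMap map[string]*cloneEntrypointInfo
}

// checkDeleteAllowed rejects a delete-with-cleanup while any entrypoint
// still has clone replicas. It is meant to run before any SPDK cleanup.
func (r *replica) checkDeleteAllowed(cleanupRequired bool) error {
	if !cleanupRequired {
		return nil
	}
	for ep, info := range r.cloneEntrypointMap {
		if n := len(info.cloneReplicas); n > 0 {
			return fmt.Errorf("cannot delete replica %s: clone entrypoint %s still has %d clone replica(s)",
				r.name, ep, n)
		}
	}
	return nil
}

// checkSnapshotDeleteAllowed rejects deleting a snapshot that still backs an
// entrypoint with active clone replicas.
func (r *replica) checkSnapshotDeleteAllowed(snapshotName string) error {
	for ep, info := range r.cloneEntrypointMap {
		if info.snapshotName == snapshotName && len(info.cloneReplicas) > 0 {
			return fmt.Errorf("cannot delete snapshot %s: clone entrypoint %s is still in use",
				snapshotName, ep)
		}
	}
	return nil
}
```
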
Add syncCloneEntrypoints called from Sync() to refresh entrypoint
state from SPDK, auto-cleanup entrypoints with no children, discover
new entrypoints, and remove orphaned tmp-head lvols.

Longhorn 12552

Signed-off-by: Shuo Wu <shuo.wu@suse.com>
(cherry picked from commit ea11d20)
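
One way to picture the sync pass in the commit above is as a pure planning step over the lvols SPDK reports: decide which entrypoints to keep tracking, which childless entrypoints to clean up, and which tmp heads to remove. The sketch below makes two assumptions that the commit message does not confirm: any tmp head seen during sync is treated as orphaned, and a tmp head does not count as a clone child. The infix, suffix, types, and function name are hypothetical.

```go
package main

import (
	"sort"
	"strings"
)

const (
	entrypointInfix = "-clone-ep-" // hypothetical value
	tmpHeadSuffix   = "-tmp-head"  // hypothetical value
)

type lvol struct {
	Name     string
	Children []string
}

// syncPlan is the outcome of one planning pass.
type syncPlan struct {
	Tracked           map[string][]string // entrypoint -> attached clone child lvols
	EntrypointsToDrop []string            // entrypoints with no clone children
	TmpHeadsToDrop    []string            // leftover tmp heads
}

// planEntrypointSync classifies the lvols that belong to replicaName's
// entrypoints. It only plans; it does not delete anything.
func planEntrypointSync(replicaName string, lvols map[string]*lvol) syncPlan {
	plan := syncPlan{Tracked: map[string][]string{}}
	prefix := replicaName + entrypointInfix
	for name, l := range lvols {
		if !strings.HasPrefix(name, prefix) {
			continue
		}
		if strings.HasSuffix(name, tmpHeadSuffix) {
			plan.TmpHeadsToDrop = append(plan.TmpHeadsToDrop, name)
			continue
		}
		var clones []string
		for _, child := range l.Children {
			if !strings.HasSuffix(child, tmpHeadSuffix) {
				clones = append(clones, child)
			}
		}
		if len(clones) == 0 {
			plan.EntrypointsToDrop = append(plan.EntrypointsToDrop, name)
			continue
		}
		sort.Strings(clones)
		plan.Tracked[name] = clones
	}
	sort.Strings(plan.EntrypointsToDrop)
	sort.Strings(plan.TmpHeadsToDrop)
	return plan
}
```

A caller would delete the tmp heads before the entrypoints, since a tmp head may still be a child of the entrypoint it was created for.
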
Add syncCloneReplicaInfo to verify the clone replica's chain root
parent matches the expected entrypoint during Sync. Handles three
cases: parent matches (no-op), entrypoint removed and parent points
to the source snapshot (recreate), or lineage corrupted (mark error).

repairCloneEntrypoint delegates to the source replica via gRPC
(CloneSrcStart/SrcFinish) so the source registers the entrypoint in
its cloneEntrypointMap.

Longhorn 12552

Signed-off-by: Shuo Wu <shuo.wu@suse.com>
(cherry picked from commit ae3b397)
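
The three cases handled by `syncCloneReplicaInfo` in the commit above reduce to a small decision function. The sketch below only classifies; the actual repair (delegating to the source replica over gRPC) is out of scope here, and the type and function names are hypothetical.

```go
package main

// lineageAction is what the sync pass should do for a clone replica.
type lineageAction int

const (
	lineageOK                 lineageAction = iota // case 1: parent matches, no-op
	lineageRecreateEntrypoint                      // case 2: entrypoint gone, parent is the source snapshot
	lineageMarkError                               // case 3: lineage corrupted
)

// classifyCloneLineage compares the chain root's actual parent with the
// recorded entrypoint and the source snapshot.
func classifyCloneLineage(rootParent, expectedEntrypoint, sourceSnapshot string, entrypointExists bool) lineageAction {
	switch {
	case entrypointExists && rootParent == expectedEntrypoint:
		return lineageOK
	case !entrypointExists && rootParent == sourceSnapshot:
		return lineageRecreateEntrypoint
	default:
		return lineageMarkError
	}
}
```

Keeping the decision separate from the side effects is also what makes the cases easy to unit test without a live SPDK target.
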
…stry handling

The clone entrypoint serves the same role as a backing image: a
read-only chain base. Place it at ActiveChain[0] to unify chain
semantics.

- Include entrypoint lvols in replicaLvolFilter
- Extend constructActiveChainFromSnapshotLvolMap to load entrypoints
- Guard BackingImage assignment for actual backing images only
- Lock ActiveChain[0] directly in linkHeadWithParent
- Update repairCloneEntrypoint to set ActiveChain[0] after reparenting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Shuo Wu <shuo.wu@suse.com>
(cherry picked from commit ee39376)
ActiveChain[0] (backing image or clone entrypoint) shared its Children
map across replicas, causing cross-replica mutation when
linkHeadWithParent or removeLvolFromSnapshotLvolMapWithoutLock modified
the map.

- Replace additive Children insertion with full map replacement in
  constructActiveChainFromSnapshotLvolMap and repairCloneEntrypoint,
  keeping only the current replica's root lvol
- Copy BackingImage.Snapshot per-replica in prepareHead with a fresh
  Children map instead of sharing the pointer

Longhorn 12552

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Shuo Wu <shuo.wu@suse.com>
(cherry picked from commit 852dc75)
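
The bug fixed in the commit above is ordinary Go map aliasing: copying a pointer to a struct also shares the map inside it. The sketch below uses a simplified `Lvol` type and a hypothetical helper `perReplicaBase` to show the per-replica copy that keeps only the current replica's root lvol.

```go
package main

// Lvol is a simplified chain node; only the fields needed here are shown.
type Lvol struct {
	Name     string
	Children map[string]*Lvol
}

// perReplicaBase returns a private copy of the shared chain base
// (backing image or clone entrypoint) for one replica. The copy gets a
// fresh Children map containing only that replica's root lvol, so later
// mutations cannot leak into other replicas.
func perReplicaBase(shared *Lvol, root *Lvol) *Lvol {
	return &Lvol{
		Name:     shared.Name,
		Children: map[string]*Lvol{root.Name: root},
	}
}
```

With the shared pointer, a `delete` on one replica's `ActiveChain[0].Children` would also remove the entry seen by every other replica; with the copy, each replica mutates only its own map.
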
Add clone metadata fields to the Replica protobuf message and populate
them in ServiceReplicaToProtoReplica.

New proto fields:
- is_clone_replica (bool)
- clone_source_replica_name (string)
- clone_entrypoint_lvol_name (string)
- clone_entrypoint_map (map<string, int32>)

Longhorn 12552

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Shuo Wu <shuo.wu@suse.com>
(cherry picked from commit 13eb9fd)
Updates generated gRPC stubs (spdkrpc, imrpc) to include the new
DstReplicaSrcReplicaPairMap field in EngineSnapshotCloneRequest.
Updates go.mod/go.sum to reference the new types pseudo-version.

Longhorn 12552

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Shuo Wu <shuo.wu@suse.com>
(cherry picked from commit c6233e0)
…rcReplicaPairMap

Extend SnapshotClone to clone all N dst replicas simultaneously for
linked-clone volumes, instead of requiring exactly 1 replica and relying
on N-1 DST rebuilds afterward.

Key design points:

- Add isCloning flag: set for the duration of SnapshotClone (linked-clone
  mode only). Guards ReplicaAdd, snapshot operations, backup, and
  expansion from running concurrently with the clone. ValidateAndUpdate
  is also skipped while cloning to avoid false divergence errors.

- snapshotCloneLinkedN: dispatches one goroutine per dst replica under
  the engine lock. Each goroutine calls SnapshotCloneDst (BdevLvolSetParent
  — a near-instantaneous metadata op) on its paired src replica. Results
  are aggregated; the engine lock is held throughout via wg.Wait().

- DstReplicaSrcReplicaPairMap: the manager (which already ran the
  scheduler) passes an explicit dst→src replica name map so the engine
  does not need to auto-detect co-location. Dst replicas absent from the
  map are marked ModeERR so the manager can schedule a rebuild for them
  later; the clone continues for all paired replicas. The entire operation
  fails only if no dst replica can be paired at all.

- Backward compatibility: an empty DstReplicaSrcReplicaPairMap falls back
  to the original 1-replica deep-copy/linked-clone path so old managers
  continue to work.

Longhorn 12552

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Shuo Wu <shuo.wu@suse.com>
(cherry picked from commit 9bd9bdb)
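
The fan-out described for `snapshotCloneLinkedN` can be sketched with a `sync.WaitGroup` and a mutex-protected result map. This is a simplified illustration, not the engine code: `cloneDstFunc` stands in for the per-replica `SnapshotCloneDst` call, the mode string `"RW"` is a placeholder (the commit only names `ModeERR`), and marking a failed paired clone as ERR without failing the whole operation is an assumption. The caller is assumed to hold the engine lock and to handle the empty-map fallback before calling.

```go
package main

import (
	"errors"
	"sync"
)

// Placeholder mode strings; only ModeERR is named in the commit message.
const (
	modeRW  = "RW"
	modeERR = "ERR"
)

var errNoPairedReplica = errors.New("no dst replica has a paired src replica")

// cloneDstFunc stands in for the per-replica SnapshotCloneDst call.
type cloneDstFunc func(dstReplica, srcReplica string) error

// snapshotCloneLinkedNSketch clones every paired dst replica concurrently
// and returns the resulting mode per dst replica.
func snapshotCloneLinkedNSketch(dstReplicas []string, pairMap map[string]string, cloneDst cloneDstFunc) (map[string]string, error) {
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		paired int
	)
	modes := make(map[string]string, len(dstReplicas))
	for _, dst := range dstReplicas {
		src, ok := pairMap[dst]
		if !ok {
			// Unpaired replicas are marked ERR so a rebuild can be scheduled later.
			mu.Lock()
			modes[dst] = modeERR
			mu.Unlock()
			continue
		}
		paired++
		wg.Add(1)
		go func(dst, src string) {
			defer wg.Done()
			mode := modeRW
			if err := cloneDst(dst, src); err != nil {
				mode = modeERR
			}
			mu.Lock()
			modes[dst] = mode
			mu.Unlock()
		}(dst, src)
	}
	wg.Wait() // the caller's lock stays held until every goroutine is done
	if paired == 0 {
		return modes, errNoPairedReplica
	}
	return modes, nil
}
```
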
…cycle

Integration tests against a real SPDK target:
- Linked-clone volume read/write and independent snapshots
- Source replica/snapshot deletion blocked by active clone entrypoints
- Recovery of entrypoint and clone info after restart
- syncCloneEntrypoints auto-cleanup of childless entrypoints and
  orphaned tmp-head lvols (including correct SPDK state simulation
  via BdevLvolSnapshot to create a read-only ep + undeleted tmp-head)
- Repair of broken clone lineage via syncCloneReplicaInfo
- N-replica simultaneous linked-clone via DstReplicaSrcReplicaPairMap

Unit tests (no SPDK required):
- syncWithBdevLvolMap dispatch to correct sync path
- syncCloneReplicaInfo cases 1/2/3 covering entrypoint lookup logic
- Full Sync() execution paths (clone replica detection, error paths)

Also refactor Sync() to delegate to syncWithBdevLvolMap for testability;
spdkClient=nil skips validateAndUpdate so unit tests can exercise sync
logic without a live SPDK connection.

Longhorn 12552

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Shuo Wu <shuo.wu@suse.com>
(cherry picked from commit 0325a68)
@mergify mergify Bot mentioned this pull request May 19, 2026
@derekbit derekbit merged commit 55a09e4 into v1.12.x May 19, 2026
7 checks passed
@derekbit derekbit deleted the mergify/bp/v1.12.x/pr-500 branch May 19, 2026 10:24