nodes: strip launcher-managed mgmt IP from saved vrnetlab configs#3263
Open
alxrxs wants to merge 2 commits into
Open
nodes: strip launcher-managed mgmt IP from saved vrnetlab configs#3263alxrxs wants to merge 2 commits into
alxrxs wants to merge 2 commits into
Conversation
d6b03ee to
9699f59
Compare
vrnetlab-backed (VRNode) kinds configure their management interface from
the container's actual clab-assigned IP on every boot. SaveConfig
persists the full running-config verbatim, including that address; on
the next deploy the saved config is reused as-is (default, non-
--reconfigure behavior), pinning a now-stale mgmt IP. If the container
is later assigned a different mgmt address (partial --node-filter
deploy, IPAM handing out a different IP, etc.), the node applies both
the launcher's current address and the stale saved one -- whichever the
device's config-apply order favors wins. When it's the stale one, the
node comes up fully functional but unreachable at the address clab/DNS
expect: "healthy" per the container healthcheck, but not reachable.
Add FilterMgmtIPConfigLines, wired into VRNode.SaveConfig, gated by a
verified per-platform allowlist so unverified kinds are left untouched.
It's structure-aware rather than a blind line-delete: a leaf statement
(IOS(-XR) "ip(v4) address x", Junos "address x/y;") only drops that
line; a line that opens a hierarchy block (Junos "address x/y { ... }")
has the whole balanced block removed so braces stay consistent. Matches
are also gated on the literal word "address" so unrelated references to
the mgmt IP (snmp-server host, ntp server, a BGP neighbor) are never
touched, and lines carrying "description" are excluded too, so an
interface description that happens to contain both the mgmt IP and the
word "address" is left alone.
vr_ros already carried a bespoke version of this fix
(filterManagementInterfaceConfig, RouterOS-only) -- refactored onto the
shared implementation.
Same fix as the previous commit, generalized: vr_ros already stripped its management interface's IP out of saved configs (filterManagementInterfaceConfig), just for RouterOS specifically. Drop the bespoke version in favor of the shared FilterMgmtIPConfigLines.
9699f59 to
3f8f3e8
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
VRNode.SaveConfig(used by every vrnetlab-backed kind that doesn't override it) saves the full running-config verbatim vianetconf.GetConfig, including whatever the device currently has configured on its management interface. The vrnetlab launcher itself also configures that same interface, every boot, from the container's actual clab-assigned IP -- that's the real source of truth for management addressing, not anything in the saved/startup config.On a redeploy, the saved config is reused as-is by default (
!EnforceStartupConfig && FileExists(dst)inGenerateConfig). If the mgmt IP saved in it is stale -- most commonly because a subsequent deploy assigned the container a different mgmt IP (a partial--node-filterdeploy, Docker IPAM handing out a different address, etc.) -- the node ends up with two competing mgmt-interface configurations applied during the same boot: the launcher's current one and the stale saved one. Depending on the platform's config-apply order, the stale one can win. The node comes up completely healthy (container healthcheck passes, the NOS itself is fine) but unreachable at the IP clab/DNS actually expect.This is exactly the failure I hit and root-caused on a containerlab lab running the XRd vRouter kind:
containerlab savehad captured a running-config withMgmtEth0/RP0/CPU0/0pinned to one IP; a later redeploy assigned the container a different IP; XRd came uphealthybut nothing answered ARP for the address clab reported.What this PR does
Adds
FilterMgmtIPConfigLines(nodes/vr_node.go) and calls it fromVRNode.SaveConfigbefore the config is written to disk, so newly saved configs never carry a mgmt address that can compete with the launcher's.It's deliberately not a blind "delete any line mentioning the mgmt IP":
172.20.20.2doesn't also match172.20.20.20) and look like an address assignment (guarded on the literal word "address", case-insensitive) are touched -- unrelated references likesnmp-server host <mgmt-ip>orntp server <mgmt-ip>are left alone. Lines carryingdescriptionare excluded too, so an interface description that happens to contain both the mgmt IP and the word "address" is not stripped (a real address line never contains the word "description").ip(v4) address x, Junosaddress x/y;, RouterOSadd address=x/y ...) only has that one line removed, leaving the enclosing interface/context header intact. A line that opens a hierarchy block (Junosaddress x/y { primary; }) has the whole balanced block removed instead, so braces stay consistent -- a naive per-line delete would otherwise leave Junos config with unbalanced braces.It's gated by a small platform allowlist (
mgmtConfigFilterPlatforms) rather than applied universally. Stripping the saved mgmt address is only safe for kinds that re-establish their mgmt address on deploy independently of the saved config -- i.e. a vrnetlab launcher that configures the mgmt interface itself from the container's IP every boot. That's exactly the set fixed launcher-side in the companion srl-labs/vrnetlab PR:cisco_asa,cisco_iosxe,cisco_iosxr,cisco_nxos,juniper_junos.Deliberately excluded even though their address syntax would match: kinds whose launcher applies the startup-config verbatim with no mgmt stanza of its own -- notably
cisco_ios(vios) and thedell_emc/ftosv OS10 launcher -- because for those, stripping the saved mgmt address would leave a reused full-config with no mgmt address at all (the regression this change exists to prevent, inverted). Also left out until their launcher's mgmt behavior is verified:arista_eos,huawei_vrp,ipinfusion_ocnos, andaruba_aoscx(whoseip staticsyntax the "address" guard wouldn't match anyway). Any platform not in the allowlist is untouched (pure no-op), so this can't regress a kind nobody has verified.vr_rosalready had its own version of this exact fix (filterManagementInterfaceConfig, scoped to RouterOS'sether1) -- refactored onto the shared implementation in the second commit.Testing
Unit tests in
nodes/vr_node_mgmtfilter_test.gocover: IOS-XR/IOS/NX-OS leaf removal, Junos leaf removal, Junos block removal (the case that would break under a naive per-line delete), RouterOS set-style, the172.20.20.2vs172.20.20.20/172.20.20.200boundary, the address-keyword guard (mgmt IP appears in a non-address line and is correctly left alone), IPv6, dual-stack, and the no-mgmt-IP-set no-op case.I couldn't run the full
nodespackage test suite in my environment (pre-existing, unrelated build failures on this checkout: a Linux-only transitive dependency and an unrelatedgopsutilAPI mismatch) but the new logic is fully covered in isolation andgofmt-clean.