coordinator: detect SLAAC v6 on iface when PrevResult.IPs lacks v6#5619
Conversation
PrevResult.IPs is populated only by upstream IPAMs. v6 addresses obtained via kernel SLAAC (RA-driven autoconf) never appear there even though they are configured on the macvlan child, which leaves ipFamily stuck at FAMILY_V4 and silently skips every v6 path in the coordinator: host /128 routes, hijack-route gateway, and sysctl reconciliation. Add GetIPFamilyByResultWithIface that wraps the existing pure parser and falls back to scanning the pod iface (in args.Netns) for non-link-local v6 addresses. The coordinator opens the pod netns up front and uses the new helper, so ipFamily resolves to FAMILY_ALL for the common "DHCPv4 IPAM + kernel SLAAC v6" macvlan setup. CmdAdd logs at Info when the fallback fires so operators can see why the family was upgraded. The pure GetIPFamilyByResult is kept unchanged for callers without a netns. New unit tests cover the parsing paths and every branch of the fallback (iface scan mocked via gomonkey). Fixes spidernet-io#5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
1557a50 to
c34c768
Compare
… getAdders
RFC 4862 §5.4 ("Duplicate Address Detection") says a tentative address is
not considered assigned to an interface, and §5.4.5 says a DAD-failed
address MUST NOT be assigned. Without filtering these in getAdders, the
SLAAC v6 fallback path could install host /128 routes and NUD_PERMANENT
neighbor entries against an address that the kernel removes moments
later — observable in a small window during CNI ADD on busy macvlan
parents (vishvananda/netlink.AddrList does not flag-filter).
While here, also drop loopback (::1) and unspecified (::) — RFC 4291
§2.5.2/§2.5.3 reserved, IANA Source=False/Destination=False. They're
excluded by accident today because all current call sites filter "^lo$",
but adding them to the in-place predicate matches Calico's host-iface
filter and protects future refactors. Zero behavior change on the
documented call paths.
Verified against real Linux DAD on a veth pair (local integration test):
tentative SLAAC v6 is correctly filtered until DAD completes (~1-2s),
then picked up.
Refs spidernet-io#5618
Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
Mirrors the test pattern in cmd/spiderpool/cmd/command_test.go: standard ginkgo _test.go (no build tag), label "unittest" so it's picked up by make unittest-tests, real netns via testutils.NewNS, real interfaces via netlink.LinkAdd. Covers the same scenarios as the mocked unit tests plus one the mocks can't exercise: a SLAAC v6 added without IFA_F_NODAD on a veth pair — verifies the IFA_F_TENTATIVE filter ignores the address until kernel DAD completes, then picks it up (RFC 4862 §5.4). Lives in its own ginkgo suite (pkg/networking/networking/integration) rather than the same _test.go as the mocked unit tests. testutils.NewNS intentionally does not UnlockOSThread (containernetworking/plugins testutils/netns_linux.go:116-118 — Go ≥1.10 kills the locked thread when the goroutine exits), and the resulting OS-thread churn was making gomonkey machine-code patches in sibling specs flaky depending on ginkgo's spec randomization. Separate test binaries fully isolate the two test styles; verified stable across 10 consecutive seeded runs. Refs spidernet-io#5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
The passive iface scan in spidernet-io#5618's first commit only helps setups where the SLAAC v6 is already on the iface at CNI-ADD time. For the common case where it isn't yet — multus chains where macvlan brings the iface up before tuning sets accept_ra=2, combined with the pod inheriting net.ipv4.ip_forward=1 (k3s default), so the kernel skips the link-up RS and ignores any periodic RA at accept_ra=1 (net/ipv6/addrconf.c ipv6_accept_ra: forwarding=1 + accept_ra<2 returns false) — the passive fix is a no-op. Add SolicitRouterAndWaitForSLAACv6: sends an ICMPv6 Router Solicitation to ff02::2 from inside the pod netns and polls AddrList until a usable non-link-local v6 appears (or timeout). Two-phase wait under one timeout budget: 1. Wait for a non-tentative link-local (RFC 4862 §5.4: kernel refuses to source from a tentative address — observable as EADDRNOTAVAIL on sendmsg) — typically <1.5s. 2. Send the RS (RFC 4861 §4.1, §6.3.5) and poll for the SLAAC GUA. CmdAdd now resolves ipFamily in three layers: PrevResult.IPs → passive iface scan → router-solicited active scan. The active path is only exercised when the prior two come up v4-only, so the latency cost is paid only when the fix is the user's last chance for v6 routing. Integration tests cover: empty-input error, missing-iface error, no- router timeout-without-error, and the early-return when a v6 is already present (regression case for setups that don't need the active path). Refs spidernet-io#5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
… state
Address two style-review nits and prevent a v4-only regression.
Gate: SolicitRouterAndWaitForSLAACv6 now reads
/proc/sys/net/ipv6/conf/<iface>/{accept_ra,forwarding} and skips the RS
unless the kernel's ipv6_accept_ra predicate (net/ipv6/addrconf.c) would
process the elicited RA. Before this, v4-only setups paid the full ~3s
solicit timeout on every CmdAdd despite the kernel ignoring the response.
GoDoc on SolicitRouterAndWaitForSLAACv6 trimmed from a 37-line RFC-cited
block to ~10 lines, matching the density of sibling exported helpers in
pkg/networking/networking/packet.go. The longer-form rationale lives in
the PR description and spidernet-io#5618.
integration/doc.go expanded with the gomonkey + testutils.NewNS()
OS-thread-leak rationale so reviewers see why this subpackage exists
(no precedent for `<pkg>/integration` in the repo otherwise).
Refs spidernet-io#5618
Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
…t, test the gate Three small follow-ups from review: - Reword the SolicitRouterAndWaitForSLAACv6 docstring to say the kernel drops (not ignores) the early periodic RAs, and explicitly state the caller contract: accept_ra must be permissive on the iface before the function is invoked (typically via the upstream tuning plugin earlier in the NAD), otherwise the function exits without sending an RS. - Add an integration test that exercises the gate's block path (accept_ra=1 + forwarding=1 → returns before Phase 1's first poll). Wraps sysctl writes in a Skip() guard so the test runs in CI (sudo on a real Linux host) but cleanly skips on contributor laptops where /proc/sys is read-only. Refs spidernet-io#5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
6eeabd6 to
2383ae4
Compare
| logger.Error("failed to GetIPFamilyByResultWithIface", zap.Error(err)) | ||
| return err | ||
| } | ||
| familyDiscovery := "PrevResult.IPs" |
There was a problem hiding this comment.
why do we need this? it's confusing. I think the good way is directly reading the iface's ips in the net namespace, we don't need the two ways.
There was a problem hiding this comment.
Fair — dropped in dee8aaf8e. SolicitRouterAndWaitForSLAACv6, the accept_ra/forwarding gate, defaultSLAACSolicitTimeout, and the related integration tests are all removed. PR is now just the passive iface scan you described.
Brief context on why the active path was here in the first place, in case useful for a follow-up: there's a CNI ADD race for pods that inherit forwarding=1 (k3s default and most other CNIs). Macvlan brings the iface up while accept_ra=1; the kernel's ipv6_accept_ra predicate in net/ipv6/addrconf.c evaluates false when forwarding != 0 && accept_ra < 2, so RAs are silently dropped. Tuning later flips accept_ra=2, but the kernel does not auto-retransmit RS on that change — SLAAC waits for the next periodic RA, which on slow-RA networks can be tens of seconds to several minutes. Coordinator runs ms after tuning, before that next RA, so a passive iface scan finds no v6 for these setups. An explicit RS forces an immediate RA; the kernel then runs SLAAC and the GUA appears within ms.
Happy to revisit either as a separate CNI plugin chained between tuning and coordinator, or as an opt-in field on the coordinator's CNI config — whichever fits the design better. Let me know if you'd like a follow-up issue/PR.
There was a problem hiding this comment.
It seems that in this scenario, we must set accept_ra = 2? Is this a safe sysctl parameter? Would it cause other potential side effects? If it is considered safe, we can have spiderpool-agent set this value to 2 for each node upon startup. This is currently supported—refer to the Spiderpool documentation—and we can try adding this parameter.
Under these circumstances, we should no longer need to actively send RS (Router Solicitation) instructions, right? Of course, I don't mind including that logic, as we have similar mechanisms in the coordinator or IPAM (e.g., actively sending Gratuitous ARP packets to announce IP addresses). For now, we can hold off on introducing this logic if it isn't strictly necessary.
There was a problem hiding this comment.
Tried this; the agent path doesn't actually close the race. Kernel-side reasoning:
Why host-side accept_ra=2 doesn't reach pods. spiderpool-agent runs hostNetwork: true (daemonset.yaml:51); SetSysctl writes /proc/sys/... with no netns.Do (pkg/networking/sysctl/sysctl.go:83). net.ipv6.conf.*.accept_ra is per-netns — each struct net mounts its own table via register_net_sysctl_sz(net, ...) in __addrconf_sysctl_register (net/ipv6/addrconf.c:7224). And when macvlan creates the child in the pod netns, ipv6_add_dev() initializes its cnf from the pod netns's devconf_dflt (addrconf.c:392), not the host's. Adding accept_ra=2 to DefaultSysctlConfig would only flip host-side ifaces.
Why a pod-side flip after link-up also wouldn't close it. accept_ra uses plain proc_dointvec (addrconf.c:6754) — no side-effect hook. ndisc_send_rs() is only invoked from addrconf_dad_completed (link-up DAD) or the rs_timer (only armed if RS was already sent). At link-up, with macvlan's default accept_ra=1 + pod's inherited forwarding=1, ipv6_accept_ra() (include/net/ipv6.h:537) returned false, so neither RS nor rs_timer fired. Tuning flipping accept_ra=2 afterward leaves the iface in steady state — no kernel path resurrects RS. That's the timing race in #5618.
accept_ra=2 itself is safe and idiomatic for K8s workers with forwarding=1 — Talos defaults to it; Moby#43466 advocates the same. Happy to add it to DefaultSysctlConfig as hardening, but I wouldn't frame it as the close for #5618.
For this PR: leave it at passive iface-scan only (already the state). The active RS can come back as an opt-in coordinator field in a follow-up if the race shows up in practice — precedent is the Gratuitous-ARP / Unsolicited-NA paths you mentioned.
There was a problem hiding this comment.
Thank you for the in-depth analysis. I’ve reviewed the kernel source code and conducted tests on my Ubuntu 22.04 machine: the accept_ra sysctl parameter is indeed not inherited by containers; even if the host-side sysctl is modified, it will not take effect inside the container. Therefore, the root cause is now clear.
From a design perspective, the Coordinator is primarily intended to work in tandem with the Spiderpool IPAM. Since the Spiderpool IPAM explicitly allocates an IP from an IPPool to a Pod, it is perfectly valid for the Coordinator to determine the IP family based on preResults. As long as any IPAM populates its allocation results into preResults, the Coordinator can remain compatible with it. I believe this is a standard and universal approach.
Regarding the timing of sending an RS (Router Solicitation): what exactly should trigger it? It seems difficult to distinguish whether an RS should be sent in all cases. In real-world scenarios, IPv6 might not even be enabled. There are various edge cases where this logic should not be executed, and I want to ensure that this implementation has no negative impact on the existing codebase.
Furthermore, I believe the best solution is for IPAM plugins to support returning results, as described in containernetworking/plugins issue #1016. Alternatively, this logic could be written as a custom plugin to be executed prior to the Coordinator.
What are your thoughts on this?
There was a problem hiding this comment.
Agree on the architecture. Upstream consensus is clear: squeed (CNI maintainer) endorsed a 4-step SLAAC-IPAM recipe in containernetworking/plugins#1016 (bring up iface → wait for SLAAC → disable SLAAC, manually add the address → return in CNI status) and offered to merge it in 2024; nobody's written it. KEP-563 (dual-stack) and the dual-stack docs require one v6 per Pod returned at CNI ADD, structurally incompatible with SLAAC's async/renumberable lifecycle. Every major CNI (Calico, Cilium, kube-ovn, flannel, AWS-VPC, PTP) sets accept_ra=0 to enforce this. Coordinator-side active RS isn't where this belongs.
This PR is a narrow correctness fix to family detection, not SLAAC support. The passive iface-scan only fires when PrevResult.IPs lacks v6 and the address is already on the iface by CNI-ADD time — third-party IPAMs that don't fully populate the result, or the subset where SLAAC happened to land before coordinator runs. No-op for the IPAM-driven case. The active RS was already dropped.
The timing race itself belongs in an IPAM plugin per containernetworking/plugins#1016 or in a custom pre-Coordinator plugin, as you noted.
There was a problem hiding this comment.
@kevinpark1217 Sorry for the delay. I think a better approach is to directly read the IP addresses from the interface to determine the IP address family, without relying on the PreResults. This improves code readability and reduces complexity. It may introduce some performance overhead (NetNS operations), but it is applicable to scenarios with different IPAMs.
There was a problem hiding this comment.
Agreed — done in 950448f08. The coordinator now resolves ipFamily via a new GetIPFamilyByIface(netns, ifName) that reads the iface in the pod netns directly; GetIPFamilyByResultWithIface and the parse-then-fallback branches are gone. GetIPFamilyByResult (the pre-existing pure parser) is untouched.
Net diff this commit: +55 / -229 across 4 files. Unit tests for the new helper narrowed to input validation; behavioral coverage runs against real netlink in the integration suite (the gomonkey-based mocks were flaky in isolation).
There was a problem hiding this comment.
Thanks, the code looks good now.
Two fixes rolled together because they touch adjacent code: 1. Revert the IFA_F_TENTATIVE/IFA_F_DADFAILED/IsLoopback/IsUnspecified filters in getAdders. The TENTATIVE filter broke setupNeighborhood's stale-entry cleanup — c.currentAddress now drops the pod's v6 while DAD is in progress, so the pre-populated stale neighbor on the host is never matched and the entry isn't deleted. Observable as e2e MacvlanOverlayOne / "auto clean up the dirty rules" timing out on dual / ipv6 matrix slots. The standards-review reasoning that "tentative is not assigned per RFC 4862 §5.4" is technically correct, but the rest of the coordinator codebase intentionally uses tentative addresses for neighbor / route bookkeeping (the pod is assigned that IP even if DAD hasn't completed). Filtering at the getAdders layer broke that contract. 2. Drop SolicitRouterAndWaitForSLAACv6 + defaultSLAACSolicitTimeout + the active-path call site in CmdAdd per @cyclinder's review on spidernet-io#5619. Coordinator now only does the passive iface scan when PrevResult.IPs lacks v6. Comment spidernet-io#1 (drop the prevResultFamily re-parse and the familyDiscovery tracking) folded in. The active RS remains the right fix for setups hitting the macvlan→tuning accept_ra-flip-after-link-up timing race where the kernel drops the early periodic RAs and won't auto-resend RS. Revisitable as a separate CNI plugin or follow-up PR after this lands. Refs spidernet-io#5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
The previous "v4-only IPAM, no v6 on iface" and "v4 IPAM + only link-local v6" entries had identical inputs (prev=[v4], ifaceV4=[v4], ifaceV6=nil) and the same expected output (FAMILY_V4). The kernel auto-assigns fe80:: on every dummy iface at link-up regardless of what we add, so the "LL filtered" semantic was already covered by the other entry. Collapsed into a single entry with the explanatory description. Refs spidernet-io#5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
Codecov Report❌ Patch coverage is
❌ Your patch check has failed because the patch coverage (15.00%) is below the target coverage (60.00%). You can increase the patch coverage or adjust the target coverage.
Additional details and impacted files@@ Coverage Diff @@
## main #5619 +/- ##
==========================================
- Coverage 66.27% 59.45% -6.82%
==========================================
Files 61 68 +7
Lines 8018 7279 -739
==========================================
- Hits 5314 4328 -986
- Misses 2421 2668 +247
Partials 283 283
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
@cyclinder — review comments addressed (re-parse removed, active RS dropped per your preference). Replied inline on the two threads. The one failing CI check ( |
Yes, it's unrelated. |
|
@cyclinder — any further feedback on this PR? Or is this good to merge as the initial IPv6 family-detection fix? |
Per @cyclinder's review: scanning the iface in the pod netns is the universal way to determine ipFamily — works for any IPAM and for kernel SLAAC alike, simpler than the parse-PrevResult-then-fallback machinery this PR previously added. - Add GetIPFamilyByIface(netns, ifName) in pkg/networking/networking/ip.go. - Remove GetIPFamilyByResultWithIface and the fallback branches it had. - command_add.go calls the new helper directly; PrevResult parsing in CmdAdd is now just a shape-validation pass. - GetIPFamilyByResult (pre-existing pure parser) is untouched for backward compat. Unit tests for GetIPFamilyByIface narrowed to input validation (nil netns, empty ifName) since the behavioral coverage (v4 / v6 / dual / no-addrs / missing-iface) is exercised against real netlink in the integration suite; gomonkey-based behavioral mocking proved flaky. Refs spidernet-io#5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
|
Thanks @cyclinder! What else is left to get this merged? Do I need another review from @weizhoublue or @lou-lan? |
|
@kevinpark1217 Yes, it would be better to have the reviews from other maintainers. /cc @weizhoublue @lou-lan |
|
@kevinpark1217 Thanks for your contribution! |
* bump version to v1.2.0-rc1 (#5583) * robot updates the release version of the README file based on the release tag: v1.1.2 (#5592) Signed-off-by: weizhoublue <weizhou.lan@daocloud.io> Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com> * robot updates the release version of the README file based on the release tag: v1.0.6 (#5588) Signed-off-by: weizhoublue <weizhou.lan@daocloud.io> Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com> Co-authored-by: Cyclinder <qifeng.guo@daocloud.io> * fix: make Calico installation compatible with latest Calico and older Kubernetes versions (#5610) * Initial plan * fix: replace immutable Calico IPPool CIDR patch with delete+recreate In newer Calico versions (v3.30+), spec.cidr is immutable and cannot be changed via kubectl patch. The CI was failing with: The IPPool "default-ipv4-ippool" is invalid: spec.cidr: Invalid value: "string": CIDR cannot be changed; follow IP pool migration guide to avoid corruption. Replace the kubectl patch approach with a replace_ippool_cidr() helper function that: 1. Exports the current IPPool YAML 2. Strips server-managed fields 3. Updates spec.cidr to the desired value 4. Validates the new YAML with --dry-run=client before destructive ops 5. Deletes the existing pool 6. Recreates it with the updated YAML Agent-Logs-Url: https://github.com/spidernet-io/spiderpool/sessions/59cffb22-cbbc-4471-a713-16f925218051 Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com> * fix: strip x-kubernetes-validations from Calico CRDs for older K8s compatibility Calico v3.32.0 CRDs use x-kubernetes-validations with: - CEL functions like isCIDR() that require Kubernetes 1.29+ - Fields like 'reason'/'messageExpression' that require K8s 1.28+ These cause strict decoding errors and CEL evaluation failures when applying Calico CRDs on older K8s versions (e.g. K8s v1.27.1) used in the test matrix. Strip all x-kubernetes-validations from the downloaded Calico YAML before applying it. This only removes CRD input validation hints; it does not affect Calico's core networking functionality. Agent-Logs-Url: https://github.com/spidernet-io/spiderpool/sessions/d8d45c5c-363b-499a-b09f-0cf90ec8d056 Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com> * CI: Fix kdoctor connection issue (#5632) * Stabilize Nightly K8s Matrix E2E cleanup by making IPPool deletion idempotent (#5636) * Initial plan * test(e2e): ignore notfound when deleting ippool by name Agent-Logs-Url: https://github.com/spidernet-io/spiderpool/sessions/f9f1436f-70c0-4d93-8f46-ad7753b01d71 Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com> * chore: finalize nightly k8s matrix ci fix status Agent-Logs-Url: https://github.com/spidernet-io/spiderpool/sessions/f9f1436f-70c0-4d93-8f46-ad7753b01d71 Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com> * chore: remove accidental local build artifacts Agent-Logs-Url: https://github.com/spidernet-io/spiderpool/sessions/f9f1436f-70c0-4d93-8f46-ad7753b01d71 Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com> * Stabilize subnet e2e IPPool record assertions in nightly IPv4 run (#5637) * Initial plan * test(e2e): stabilize subnet ippool record checks with retries Agent-Logs-Url: https://github.com/spidernet-io/spiderpool/sessions/89afecd4-2b4f-400f-bcb7-8930fa84ff30 Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com> * chore: revert unintended binary build artifacts Agent-Logs-Url: https://github.com/spidernet-io/spiderpool/sessions/89afecd4-2b4f-400f-bcb7-8930fa84ff30 Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com> * Stabilize annotation E2E against transient namespace visibility race (#5642) * Initial plan * test(e2e): retry pod creation when namespace cache not ready --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> * Stabilize KubeVirt install in nightly e2e by making virt-operator rollout timeout configurable (#5641) * Initial plan * test: make kubevirt rollout timeout configurable for CI stability * chore: remove unintended build artifacts --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> * coordinator: detect SLAAC v6 on iface when PrevResult.IPs lacks v6 (#5619) * coordinator: detect SLAAC v6 via iface scan when PrevResult.IPs lacks v6 PrevResult.IPs is populated only by upstream IPAMs. v6 addresses obtained via kernel SLAAC (RA-driven autoconf) never appear there even though they are configured on the macvlan child, which leaves ipFamily stuck at FAMILY_V4 and silently skips every v6 path in the coordinator: host /128 routes, hijack-route gateway, and sysctl reconciliation. Add GetIPFamilyByResultWithIface that wraps the existing pure parser and falls back to scanning the pod iface (in args.Netns) for non-link-local v6 addresses. The coordinator opens the pod netns up front and uses the new helper, so ipFamily resolves to FAMILY_ALL for the common "DHCPv4 IPAM + kernel SLAAC v6" macvlan setup. CmdAdd logs at Info when the fallback fires so operators can see why the family was upgraded. The pure GetIPFamilyByResult is kept unchanged for callers without a netns. New unit tests cover the parsing paths and every branch of the fallback (iface scan mocked via gomonkey). Fixes #5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com> * coordinator: filter tentative/DAD-failed and reserved v6 addresses in getAdders RFC 4862 §5.4 ("Duplicate Address Detection") says a tentative address is not considered assigned to an interface, and §5.4.5 says a DAD-failed address MUST NOT be assigned. Without filtering these in getAdders, the SLAAC v6 fallback path could install host /128 routes and NUD_PERMANENT neighbor entries against an address that the kernel removes moments later — observable in a small window during CNI ADD on busy macvlan parents (vishvananda/netlink.AddrList does not flag-filter). While here, also drop loopback (::1) and unspecified (::) — RFC 4291 §2.5.2/§2.5.3 reserved, IANA Source=False/Destination=False. They're excluded by accident today because all current call sites filter "^lo$", but adding them to the in-place predicate matches Calico's host-iface filter and protects future refactors. Zero behavior change on the documented call paths. Verified against real Linux DAD on a veth pair (local integration test): tentative SLAAC v6 is correctly filtered until DAD completes (~1-2s), then picked up. Refs #5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com> * networking: add real-netns integration tests for SLAAC v6 fallback Mirrors the test pattern in cmd/spiderpool/cmd/command_test.go: standard ginkgo _test.go (no build tag), label "unittest" so it's picked up by make unittest-tests, real netns via testutils.NewNS, real interfaces via netlink.LinkAdd. Covers the same scenarios as the mocked unit tests plus one the mocks can't exercise: a SLAAC v6 added without IFA_F_NODAD on a veth pair — verifies the IFA_F_TENTATIVE filter ignores the address until kernel DAD completes, then picks it up (RFC 4862 §5.4). Lives in its own ginkgo suite (pkg/networking/networking/integration) rather than the same _test.go as the mocked unit tests. testutils.NewNS intentionally does not UnlockOSThread (containernetworking/plugins testutils/netns_linux.go:116-118 — Go ≥1.10 kills the locked thread when the goroutine exits), and the resulting OS-thread churn was making gomonkey machine-code patches in sibling specs flaky depending on ginkgo's spec randomization. Separate test binaries fully isolate the two test styles; verified stable across 10 consecutive seeded runs. Refs #5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com> * coordinator: actively solicit an RA when SLAAC v6 hasn't arrived yet The passive iface scan in #5618's first commit only helps setups where the SLAAC v6 is already on the iface at CNI-ADD time. For the common case where it isn't yet — multus chains where macvlan brings the iface up before tuning sets accept_ra=2, combined with the pod inheriting net.ipv4.ip_forward=1 (k3s default), so the kernel skips the link-up RS and ignores any periodic RA at accept_ra=1 (net/ipv6/addrconf.c ipv6_accept_ra: forwarding=1 + accept_ra<2 returns false) — the passive fix is a no-op. Add SolicitRouterAndWaitForSLAACv6: sends an ICMPv6 Router Solicitation to ff02::2 from inside the pod netns and polls AddrList until a usable non-link-local v6 appears (or timeout). Two-phase wait under one timeout budget: 1. Wait for a non-tentative link-local (RFC 4862 §5.4: kernel refuses to source from a tentative address — observable as EADDRNOTAVAIL on sendmsg) — typically <1.5s. 2. Send the RS (RFC 4861 §4.1, §6.3.5) and poll for the SLAAC GUA. CmdAdd now resolves ipFamily in three layers: PrevResult.IPs → passive iface scan → router-solicited active scan. The active path is only exercised when the prior two come up v4-only, so the latency cost is paid only when the fix is the user's last chance for v6 routing. Integration tests cover: empty-input error, missing-iface error, no- router timeout-without-error, and the early-return when a v6 is already present (regression case for setups that don't need the active path). Refs #5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com> * coordinator: gate the RS solicit on the kernel's accept_ra/forwarding state Address two style-review nits and prevent a v4-only regression. Gate: SolicitRouterAndWaitForSLAACv6 now reads /proc/sys/net/ipv6/conf/<iface>/{accept_ra,forwarding} and skips the RS unless the kernel's ipv6_accept_ra predicate (net/ipv6/addrconf.c) would process the elicited RA. Before this, v4-only setups paid the full ~3s solicit timeout on every CmdAdd despite the kernel ignoring the response. GoDoc on SolicitRouterAndWaitForSLAACv6 trimmed from a 37-line RFC-cited block to ~10 lines, matching the density of sibling exported helpers in pkg/networking/networking/packet.go. The longer-form rationale lives in the PR description and #5618. integration/doc.go expanded with the gomonkey + testutils.NewNS() OS-thread-leak rationale so reviewers see why this subpackage exists (no precedent for `<pkg>/integration` in the repo otherwise). Refs #5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com> * coordinator: clarify the kernel-drops-RA wording, note caller contract, test the gate Three small follow-ups from review: - Reword the SolicitRouterAndWaitForSLAACv6 docstring to say the kernel drops (not ignores) the early periodic RAs, and explicitly state the caller contract: accept_ra must be permissive on the iface before the function is invoked (typically via the upstream tuning plugin earlier in the NAD), otherwise the function exits without sending an RS. - Add an integration test that exercises the gate's block path (accept_ra=1 + forwarding=1 → returns before Phase 1's first poll). Wraps sysctl writes in a Skip() guard so the test runs in CI (sudo on a real Linux host) but cleanly skips on contributor laptops where /proc/sys is read-only. Refs #5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com> * coordinator: revert getAdders filter; drop active RS solicit per review Two fixes rolled together because they touch adjacent code: 1. Revert the IFA_F_TENTATIVE/IFA_F_DADFAILED/IsLoopback/IsUnspecified filters in getAdders. The TENTATIVE filter broke setupNeighborhood's stale-entry cleanup — c.currentAddress now drops the pod's v6 while DAD is in progress, so the pre-populated stale neighbor on the host is never matched and the entry isn't deleted. Observable as e2e MacvlanOverlayOne / "auto clean up the dirty rules" timing out on dual / ipv6 matrix slots. The standards-review reasoning that "tentative is not assigned per RFC 4862 §5.4" is technically correct, but the rest of the coordinator codebase intentionally uses tentative addresses for neighbor / route bookkeeping (the pod is assigned that IP even if DAD hasn't completed). Filtering at the getAdders layer broke that contract. 2. Drop SolicitRouterAndWaitForSLAACv6 + defaultSLAACSolicitTimeout + the active-path call site in CmdAdd per @cyclinder's review on #5619. Coordinator now only does the passive iface scan when PrevResult.IPs lacks v6. Comment #1 (drop the prevResultFamily re-parse and the familyDiscovery tracking) folded in. The active RS remains the right fix for setups hitting the macvlan→tuning accept_ra-flip-after-link-up timing race where the kernel drops the early periodic RAs and won't auto-resend RS. Revisitable as a separate CNI plugin or follow-up PR after this lands. Refs #5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com> * networking/integration: dedupe redundant DescribeTable entry The previous "v4-only IPAM, no v6 on iface" and "v4 IPAM + only link-local v6" entries had identical inputs (prev=[v4], ifaceV4=[v4], ifaceV6=nil) and the same expected output (FAMILY_V4). The kernel auto-assigns fe80:: on every dummy iface at link-up regardless of what we add, so the "LL filtered" semantic was already covered by the other entry. Collapsed into a single entry with the explanatory description. Refs #5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com> * coordinator: read iface for ipFamily, drop PrevResult parsing path Per @cyclinder's review: scanning the iface in the pod netns is the universal way to determine ipFamily — works for any IPAM and for kernel SLAAC alike, simpler than the parse-PrevResult-then-fallback machinery this PR previously added. - Add GetIPFamilyByIface(netns, ifName) in pkg/networking/networking/ip.go. - Remove GetIPFamilyByResultWithIface and the fallback branches it had. - command_add.go calls the new helper directly; PrevResult parsing in CmdAdd is now just a shape-validation pass. - GetIPFamilyByResult (pre-existing pure parser) is untouched for backward compat. Unit tests for GetIPFamilyByIface narrowed to input validation (nil netns, empty ifName) since the behavioral coverage (v4 / v6 / dual / no-addrs / missing-iface) is exercised against real netlink in the integration suite; gomonkey-based behavioral mocking proved flaky. Refs #5618 Signed-off-by: Kevin Park <kevin.park1217@gmail.com> --------- Signed-off-by: Kevin Park <kevin.park1217@gmail.com> * Add IaaS network provider integration support (#5573) * Add IaaS network provider integration support * add IaaS provider configuration parameters to values.yaml and README * mount IaaS TLS secret to controller and agent pods * add IaaS provider config to configmap * implement IaaS client with mTLS support for allocate/release IP operations * add IaaS config validation in controller and agent daemon startup * integrate IaaS client into IPAM workflow Signed-off-by: Cyclinder Kuo <qifeng.guo@daocloud.io> * Add vlan-cni plugin support to spiderpool-plugins image * add VLAN_VERSION build argument and environment variable * clone vlan-cni repository and build vlan binary * copy vlan binary to release image at /usr/plugins/vlan * add --install-vlan flag to entrypoint.sh for conditional installation * add vlan plugin installation logic with version logging * export VLAN_VERSION in GitHub Actions workflow outputs * set default VLAN_VERSION to 0.0.1 in version.sh Signed-off-by: Cyclinder Kuo <qifeng.guo@daocloud.io> * fix: move link setup from CmdAdd to DetectIPConflictAndGatewayReachable * move netlink link setup logic from command_add.go to ipam_detection.go * set link up inside DetectIPConflictAndGatewayReachable before detection * change early return to continue when IP version is nil during detection * remove netlink import from command_add.go Signed-off-by: Cyclinder Kuo <qifeng.guo@daocloud.io> --------- Signed-off-by: Cyclinder Kuo <qifeng.guo@daocloud.io> * bump version to v1.2.0 --------- Signed-off-by: weizhoublue <weizhou.lan@daocloud.io> Signed-off-by: Kevin Park <kevin.park1217@gmail.com> Signed-off-by: Cyclinder Kuo <qifeng.guo@daocloud.io> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com> Co-authored-by: Kevin Park <kevin.park1217@gmail.com>
Thanks for contributing!
Notice:
"release/none"
"release/bug"
"release/feature"
What issue(s) does this PR fix:
Refs #5618 (primary family-detection bug fixed; timing-race edge tracked separately)
Special notes for your reviewer:
GetIPFamilyByResultderivesipFamilyonly fromPrevResult.IPs. Kernel-SLAAC v6 never goes through CNI IPAM, so dual-stack macvlan setups that rely on RA-driven SLAAC for v6 resolve toFAMILY_V4and the coordinator silently skips every v6 path (setupHostRoutes,v6HijackRouteGw, the v6 sysctl). Full repro in #5618.Fix
Add
GetIPFamilyByResultWithIface(prevResult, netns, ifName)inpkg/networking/networking/ip.go. It wraps the existing pure parser; whenPrevResult.IPslacks v6 it scans the pod iface in the pod netns (viaIPAddressByName) for non-link-local v6 and upgrades the family accordingly.CmdAddopens the pod netns earlier and uses the new helper.PrevResult.IPsipFamilyFAMILY_ALLFAMILY_V4)FAMILY_V6GetIPFamilyByResultis kept unchanged so callers without a netns continue to work.Tests
pkg/networking/networking/ip_test.go): every branch ofGetIPFamilyByResultandGetIPFamilyByResultWithIface.pkg/networking/networking/integration/): the 6PrevResult × ifacescenarios from the table above against a real dummy iface in a fresh netns.Integration tests live in their own ginkgo suite because
testutils.NewNSleaks an OS thread by design (seecontainernetworking/plugins/pkg/testutils/netns_linux.go:116-118) and the resulting thread churn destabilizes gomonkey machine-code patches in adjacent specs when mixed in one binary. Separate test binaries fully isolate the two test styles; the standardmake unittest-tests(whichlint-golang.yamlruns on every PR) picks up both viaginkgo -r ./pkg ./cmd.go vet,check-go-fmt.sh,lock-check.sh, andgolangci-lint runall pass cleanly on the new code (two pre-existinggofumptwarnings onpkg/networking/networking/ip.go:104andpkg/networking/networking/packet.go:315are present onmain, untouched here).Release note: