Skip to content

coordinator: detect SLAAC v6 on iface when PrevResult.IPs lacks v6#5619

Merged
cyclinder merged 9 commits into
spidernet-io:mainfrom
kevinpark1217:fix/5618-coordinator-slaac-v6-detection
May 28, 2026
Merged

coordinator: detect SLAAC v6 on iface when PrevResult.IPs lacks v6#5619
cyclinder merged 9 commits into
spidernet-io:mainfrom
kevinpark1217:fix/5618-coordinator-slaac-v6-detection

Conversation

@kevinpark1217

@kevinpark1217 kevinpark1217 commented May 10, 2026

Copy link
Copy Markdown
Contributor

Thanks for contributing!

Notice:

What issue(s) does this PR fix:

Refs #5618 (primary family-detection bug fixed; timing-race edge tracked separately)

Special notes for your reviewer:

GetIPFamilyByResult derives ipFamily only from PrevResult.IPs. Kernel-SLAAC v6 never goes through CNI IPAM, so dual-stack macvlan setups that rely on RA-driven SLAAC for v6 resolve to FAMILY_V4 and the coordinator silently skips every v6 path (setupHostRoutes, v6HijackRouteGw, the v6 sysctl). Full repro in #5618.

Fix

Add GetIPFamilyByResultWithIface(prevResult, netns, ifName) in pkg/networking/networking/ip.go. It wraps the existing pure parser; when PrevResult.IPs lacks v6 it scans the pod iface in the pod netns (via IPAddressByName) for non-link-local v6 and upgrades the family accordingly. CmdAdd opens the pod netns earlier and uses the new helper.

PrevResult.IPs Iface scan finds v6? Returned ipFamily
has v6 (any combo) (not scanned) unchanged
v4 only yes upgraded to FAMILY_ALL
v4 only no / scan errors unchanged (FAMILY_V4)
empty yes FAMILY_V6
empty no / scan errors original parse error (scan error wrapped)

GetIPFamilyByResult is kept unchanged so callers without a netns continue to work.

Tests

$ bash tools/scripts/ginkgo.sh -r ./pkg/networking
Networking Suite                15/15 specs SUCCESS  (unit, gomonkey-mocked)
Networking Integration Suite     6/6  specs SUCCESS  (real netns via testutils.NewNS)
  • Unit (pkg/networking/networking/ip_test.go): every branch of GetIPFamilyByResult and GetIPFamilyByResultWithIface.
  • Integration (pkg/networking/networking/integration/): the 6 PrevResult × iface scenarios from the table above against a real dummy iface in a fresh netns.

Integration tests live in their own ginkgo suite because testutils.NewNS leaks an OS thread by design (see containernetworking/plugins/pkg/testutils/netns_linux.go:116-118) and the resulting thread churn destabilizes gomonkey machine-code patches in adjacent specs when mixed in one binary. Separate test binaries fully isolate the two test styles; the standard make unittest-tests (which lint-golang.yaml runs on every PR) picks up both via ginkgo -r ./pkg ./cmd.

go vet, check-go-fmt.sh, lock-check.sh, and golangci-lint run all pass cleanly on the new code (two pre-existing gofumpt warnings on pkg/networking/networking/ip.go:104 and pkg/networking/networking/packet.go:315 are present on main, untouched here).

Release note:

fix(coordinator): detect SLAAC IPv6 on the pod interface when PrevResult.IPs lacks v6, so v6 host routes / hijack gateways / sysctls are configured for setups that rely on kernel SLAAC (e.g. DHCPv4 + RA-driven SLAAC v6 on macvlan).

PrevResult.IPs is populated only by upstream IPAMs. v6 addresses obtained
via kernel SLAAC (RA-driven autoconf) never appear there even though they
are configured on the macvlan child, which leaves ipFamily stuck at
FAMILY_V4 and silently skips every v6 path in the coordinator: host /128
routes, hijack-route gateway, and sysctl reconciliation.

Add GetIPFamilyByResultWithIface that wraps the existing pure parser and
falls back to scanning the pod iface (in args.Netns) for non-link-local
v6 addresses. The coordinator opens the pod netns up front and uses the
new helper, so ipFamily resolves to FAMILY_ALL for the common
"DHCPv4 IPAM + kernel SLAAC v6" macvlan setup. CmdAdd logs at Info when
the fallback fires so operators can see why the family was upgraded.

The pure GetIPFamilyByResult is kept unchanged for callers without a
netns. New unit tests cover the parsing paths and every branch of the
fallback (iface scan mocked via gomonkey).

Fixes spidernet-io#5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
@kevinpark1217 kevinpark1217 force-pushed the fix/5618-coordinator-slaac-v6-detection branch from 1557a50 to c34c768 Compare May 10, 2026 18:38
… getAdders

RFC 4862 §5.4 ("Duplicate Address Detection") says a tentative address is
not considered assigned to an interface, and §5.4.5 says a DAD-failed
address MUST NOT be assigned. Without filtering these in getAdders, the
SLAAC v6 fallback path could install host /128 routes and NUD_PERMANENT
neighbor entries against an address that the kernel removes moments
later — observable in a small window during CNI ADD on busy macvlan
parents (vishvananda/netlink.AddrList does not flag-filter).

While here, also drop loopback (::1) and unspecified (::) — RFC 4291
§2.5.2/§2.5.3 reserved, IANA Source=False/Destination=False. They're
excluded by accident today because all current call sites filter "^lo$",
but adding them to the in-place predicate matches Calico's host-iface
filter and protects future refactors. Zero behavior change on the
documented call paths.

Verified against real Linux DAD on a veth pair (local integration test):
tentative SLAAC v6 is correctly filtered until DAD completes (~1-2s),
then picked up.

Refs spidernet-io#5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
Mirrors the test pattern in cmd/spiderpool/cmd/command_test.go: standard
ginkgo _test.go (no build tag), label "unittest" so it's picked up by
make unittest-tests, real netns via testutils.NewNS, real interfaces via
netlink.LinkAdd.

Covers the same scenarios as the mocked unit tests plus one the mocks
can't exercise: a SLAAC v6 added without IFA_F_NODAD on a veth pair —
verifies the IFA_F_TENTATIVE filter ignores the address until kernel DAD
completes, then picks it up (RFC 4862 §5.4).

Lives in its own ginkgo suite (pkg/networking/networking/integration)
rather than the same _test.go as the mocked unit tests. testutils.NewNS
intentionally does not UnlockOSThread (containernetworking/plugins
testutils/netns_linux.go:116-118 — Go ≥1.10 kills the locked thread when
the goroutine exits), and the resulting OS-thread churn was making
gomonkey machine-code patches in sibling specs flaky depending on
ginkgo's spec randomization. Separate test binaries fully isolate the
two test styles; verified stable across 10 consecutive seeded runs.

Refs spidernet-io#5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
The passive iface scan in spidernet-io#5618's first commit only helps setups where the
SLAAC v6 is already on the iface at CNI-ADD time. For the common case
where it isn't yet — multus chains where macvlan brings the iface up
before tuning sets accept_ra=2, combined with the pod inheriting
net.ipv4.ip_forward=1 (k3s default), so the kernel skips the link-up RS
and ignores any periodic RA at accept_ra=1 (net/ipv6/addrconf.c
ipv6_accept_ra: forwarding=1 + accept_ra<2 returns false) — the passive
fix is a no-op.

Add SolicitRouterAndWaitForSLAACv6: sends an ICMPv6 Router Solicitation
to ff02::2 from inside the pod netns and polls AddrList until a usable
non-link-local v6 appears (or timeout). Two-phase wait under one
timeout budget:

  1. Wait for a non-tentative link-local (RFC 4862 §5.4: kernel refuses
     to source from a tentative address — observable as EADDRNOTAVAIL on
     sendmsg) — typically <1.5s.
  2. Send the RS (RFC 4861 §4.1, §6.3.5) and poll for the SLAAC GUA.

CmdAdd now resolves ipFamily in three layers: PrevResult.IPs → passive
iface scan → router-solicited active scan. The active path is only
exercised when the prior two come up v4-only, so the latency cost is
paid only when the fix is the user's last chance for v6 routing.

Integration tests cover: empty-input error, missing-iface error, no-
router timeout-without-error, and the early-return when a v6 is already
present (regression case for setups that don't need the active path).

Refs spidernet-io#5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
… state

Address two style-review nits and prevent a v4-only regression.

Gate: SolicitRouterAndWaitForSLAACv6 now reads
/proc/sys/net/ipv6/conf/<iface>/{accept_ra,forwarding} and skips the RS
unless the kernel's ipv6_accept_ra predicate (net/ipv6/addrconf.c) would
process the elicited RA. Before this, v4-only setups paid the full ~3s
solicit timeout on every CmdAdd despite the kernel ignoring the response.

GoDoc on SolicitRouterAndWaitForSLAACv6 trimmed from a 37-line RFC-cited
block to ~10 lines, matching the density of sibling exported helpers in
pkg/networking/networking/packet.go. The longer-form rationale lives in
the PR description and spidernet-io#5618.

integration/doc.go expanded with the gomonkey + testutils.NewNS()
OS-thread-leak rationale so reviewers see why this subpackage exists
(no precedent for `<pkg>/integration` in the repo otherwise).

Refs spidernet-io#5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
…t, test the gate

Three small follow-ups from review:

- Reword the SolicitRouterAndWaitForSLAACv6 docstring to say the kernel
  drops (not ignores) the early periodic RAs, and explicitly state the
  caller contract: accept_ra must be permissive on the iface before the
  function is invoked (typically via the upstream tuning plugin earlier
  in the NAD), otherwise the function exits without sending an RS.

- Add an integration test that exercises the gate's block path
  (accept_ra=1 + forwarding=1 → returns before Phase 1's first poll).
  Wraps sysctl writes in a Skip() guard so the test runs in CI (sudo on
  a real Linux host) but cleanly skips on contributor laptops where
  /proc/sys is read-only.

Refs spidernet-io#5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
@kevinpark1217 kevinpark1217 force-pushed the fix/5618-coordinator-slaac-v6-detection branch from 6eeabd6 to 2383ae4 Compare May 11, 2026 02:58
Comment thread cmd/coordinator/cmd/command_add.go Outdated
Comment thread cmd/coordinator/cmd/command_add.go Outdated
logger.Error("failed to GetIPFamilyByResultWithIface", zap.Error(err))
return err
}
familyDiscovery := "PrevResult.IPs"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this? it's confusing. I think the good way is directly reading the iface's ips in the net namespace, we don't need the two ways.

@kevinpark1217 kevinpark1217 May 11, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair — dropped in dee8aaf8e. SolicitRouterAndWaitForSLAACv6, the accept_ra/forwarding gate, defaultSLAACSolicitTimeout, and the related integration tests are all removed. PR is now just the passive iface scan you described.

Brief context on why the active path was here in the first place, in case useful for a follow-up: there's a CNI ADD race for pods that inherit forwarding=1 (k3s default and most other CNIs). Macvlan brings the iface up while accept_ra=1; the kernel's ipv6_accept_ra predicate in net/ipv6/addrconf.c evaluates false when forwarding != 0 && accept_ra < 2, so RAs are silently dropped. Tuning later flips accept_ra=2, but the kernel does not auto-retransmit RS on that change — SLAAC waits for the next periodic RA, which on slow-RA networks can be tens of seconds to several minutes. Coordinator runs ms after tuning, before that next RA, so a passive iface scan finds no v6 for these setups. An explicit RS forces an immediate RA; the kernel then runs SLAAC and the GUA appears within ms.

Happy to revisit either as a separate CNI plugin chained between tuning and coordinator, or as an opt-in field on the coordinator's CNI config — whichever fits the design better. Let me know if you'd like a follow-up issue/PR.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that in this scenario, we must set accept_ra = 2? Is this a safe sysctl parameter? Would it cause other potential side effects? If it is considered safe, we can have spiderpool-agent set this value to 2 for each node upon startup. This is currently supported—refer to the Spiderpool documentation—and we can try adding this parameter.

Under these circumstances, we should no longer need to actively send RS (Router Solicitation) instructions, right? Of course, I don't mind including that logic, as we have similar mechanisms in the coordinator or IPAM (e.g., actively sending Gratuitous ARP packets to announce IP addresses). For now, we can hold off on introducing this logic if it isn't strictly necessary.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried this; the agent path doesn't actually close the race. Kernel-side reasoning:

Why host-side accept_ra=2 doesn't reach pods. spiderpool-agent runs hostNetwork: true (daemonset.yaml:51); SetSysctl writes /proc/sys/... with no netns.Do (pkg/networking/sysctl/sysctl.go:83). net.ipv6.conf.*.accept_ra is per-netns — each struct net mounts its own table via register_net_sysctl_sz(net, ...) in __addrconf_sysctl_register (net/ipv6/addrconf.c:7224). And when macvlan creates the child in the pod netns, ipv6_add_dev() initializes its cnf from the pod netns's devconf_dflt (addrconf.c:392), not the host's. Adding accept_ra=2 to DefaultSysctlConfig would only flip host-side ifaces.

Why a pod-side flip after link-up also wouldn't close it. accept_ra uses plain proc_dointvec (addrconf.c:6754) — no side-effect hook. ndisc_send_rs() is only invoked from addrconf_dad_completed (link-up DAD) or the rs_timer (only armed if RS was already sent). At link-up, with macvlan's default accept_ra=1 + pod's inherited forwarding=1, ipv6_accept_ra() (include/net/ipv6.h:537) returned false, so neither RS nor rs_timer fired. Tuning flipping accept_ra=2 afterward leaves the iface in steady state — no kernel path resurrects RS. That's the timing race in #5618.

accept_ra=2 itself is safe and idiomatic for K8s workers with forwarding=1 — Talos defaults to it; Moby#43466 advocates the same. Happy to add it to DefaultSysctlConfig as hardening, but I wouldn't frame it as the close for #5618.

For this PR: leave it at passive iface-scan only (already the state). The active RS can come back as an opt-in coordinator field in a follow-up if the race shows up in practice — precedent is the Gratuitous-ARP / Unsolicited-NA paths you mentioned.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the in-depth analysis. I’ve reviewed the kernel source code and conducted tests on my Ubuntu 22.04 machine: the accept_ra sysctl parameter is indeed not inherited by containers; even if the host-side sysctl is modified, it will not take effect inside the container. Therefore, the root cause is now clear.

From a design perspective, the Coordinator is primarily intended to work in tandem with the Spiderpool IPAM. Since the Spiderpool IPAM explicitly allocates an IP from an IPPool to a Pod, it is perfectly valid for the Coordinator to determine the IP family based on preResults. As long as any IPAM populates its allocation results into preResults, the Coordinator can remain compatible with it. I believe this is a standard and universal approach.

Regarding the timing of sending an RS (Router Solicitation): what exactly should trigger it? It seems difficult to distinguish whether an RS should be sent in all cases. In real-world scenarios, IPv6 might not even be enabled. There are various edge cases where this logic should not be executed, and I want to ensure that this implementation has no negative impact on the existing codebase.

Furthermore, I believe the best solution is for IPAM plugins to support returning results, as described in containernetworking/plugins issue #1016. Alternatively, this logic could be written as a custom plugin to be executed prior to the Coordinator.

What are your thoughts on this?

@kevinpark1217 kevinpark1217 May 14, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on the architecture. Upstream consensus is clear: squeed (CNI maintainer) endorsed a 4-step SLAAC-IPAM recipe in containernetworking/plugins#1016 (bring up iface → wait for SLAAC → disable SLAAC, manually add the address → return in CNI status) and offered to merge it in 2024; nobody's written it. KEP-563 (dual-stack) and the dual-stack docs require one v6 per Pod returned at CNI ADD, structurally incompatible with SLAAC's async/renumberable lifecycle. Every major CNI (Calico, Cilium, kube-ovn, flannel, AWS-VPC, PTP) sets accept_ra=0 to enforce this. Coordinator-side active RS isn't where this belongs.

This PR is a narrow correctness fix to family detection, not SLAAC support. The passive iface-scan only fires when PrevResult.IPs lacks v6 and the address is already on the iface by CNI-ADD time — third-party IPAMs that don't fully populate the result, or the subset where SLAAC happened to land before coordinator runs. No-op for the IPAM-driven case. The active RS was already dropped.

The timing race itself belongs in an IPAM plugin per containernetworking/plugins#1016 or in a custom pre-Coordinator plugin, as you noted.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinpark1217 Sorry for the delay. I think a better approach is to directly read the IP addresses from the interface to determine the IP address family, without relying on the PreResults. This improves code readability and reduces complexity. It may introduce some performance overhead (NetNS operations), but it is applicable to scenarios with different IPAMs.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — done in 950448f08. The coordinator now resolves ipFamily via a new GetIPFamilyByIface(netns, ifName) that reads the iface in the pod netns directly; GetIPFamilyByResultWithIface and the parse-then-fallback branches are gone. GetIPFamilyByResult (the pre-existing pure parser) is untouched.

Net diff this commit: +55 / -229 across 4 files. Unit tests for the new helper narrowed to input validation; behavioral coverage runs against real netlink in the integration suite (the gomonkey-based mocks were flaky in isolation).

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the code looks good now.

Two fixes rolled together because they touch adjacent code:

1. Revert the IFA_F_TENTATIVE/IFA_F_DADFAILED/IsLoopback/IsUnspecified
   filters in getAdders. The TENTATIVE filter broke setupNeighborhood's
   stale-entry cleanup — c.currentAddress now drops the pod's v6 while
   DAD is in progress, so the pre-populated stale neighbor on the host
   is never matched and the entry isn't deleted. Observable as e2e
   MacvlanOverlayOne / "auto clean up the dirty rules" timing out on
   dual / ipv6 matrix slots. The standards-review reasoning that
   "tentative is not assigned per RFC 4862 §5.4" is technically correct,
   but the rest of the coordinator codebase intentionally uses tentative
   addresses for neighbor / route bookkeeping (the pod is assigned that
   IP even if DAD hasn't completed). Filtering at the getAdders layer
   broke that contract.

2. Drop SolicitRouterAndWaitForSLAACv6 + defaultSLAACSolicitTimeout +
   the active-path call site in CmdAdd per @cyclinder's review on spidernet-io#5619.
   Coordinator now only does the passive iface scan when PrevResult.IPs
   lacks v6. Comment spidernet-io#1 (drop the prevResultFamily re-parse and the
   familyDiscovery tracking) folded in.

The active RS remains the right fix for setups hitting the
macvlan→tuning accept_ra-flip-after-link-up timing race where the kernel
drops the early periodic RAs and won't auto-resend RS. Revisitable as a
separate CNI plugin or follow-up PR after this lands.

Refs spidernet-io#5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
The previous "v4-only IPAM, no v6 on iface" and "v4 IPAM + only link-local
v6" entries had identical inputs (prev=[v4], ifaceV4=[v4], ifaceV6=nil)
and the same expected output (FAMILY_V4). The kernel auto-assigns fe80::
on every dummy iface at link-up regardless of what we add, so the "LL
filtered" semantic was already covered by the other entry. Collapsed
into a single entry with the explanatory description.

Refs spidernet-io#5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
@codecov

codecov Bot commented May 12, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 15.00000% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.45%. Comparing base (a9d46e6) to head (950448f).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
pkg/networking/networking/ip.go 15.00% 17 Missing ⚠️

❌ Your patch check has failed because the patch coverage (15.00%) is below the target coverage (60.00%). You can increase the patch coverage or adjust the target coverage.

❗ There is a different number of reports uploaded between BASE (a9d46e6) and HEAD (950448f). Click for more details.

HEAD has 16 uploads less than BASE
Flag BASE (a9d46e6) HEAD (950448f)
unittests 17 1
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5619      +/-   ##
==========================================
- Coverage   66.27%   59.45%   -6.82%     
==========================================
  Files          61       68       +7     
  Lines        8018     7279     -739     
==========================================
- Hits         5314     4328     -986     
- Misses       2421     2668     +247     
  Partials      283      283              
Flag Coverage Δ
unittests 59.45% <15.00%> (-6.82%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
pkg/networking/networking/ip.go 12.26% <15.00%> (ø)

... and 65 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@kevinpark1217

kevinpark1217 commented May 13, 2026

Copy link
Copy Markdown
Contributor Author

@cyclinder — review comments addressed (re-parse removed, active RS dropped per your preference). Replied inline on the two threads.

The one failing CI check (e2e_ipv4_latest / calico) looks unrelated — kdoctor's own admission webhook timed out; same code passed in all 15 other matrix slots. Could you re-run?

@cyclinder

Copy link
Copy Markdown
Collaborator

The one failing CI check (e2e_ipv4_latest / calico) looks unrelated — kdoctor's own admission webhook timed out; same code passed in all 15 other matrix slots. Could you re-run?

Yes, it's unrelated.

@kevinpark1217

Copy link
Copy Markdown
Contributor Author

@cyclinder — any further feedback on this PR? Or is this good to merge as the initial IPv6 family-detection fix?

Per @cyclinder's review: scanning the iface in the pod netns is the
universal way to determine ipFamily — works for any IPAM and for kernel
SLAAC alike, simpler than the parse-PrevResult-then-fallback machinery
this PR previously added.

  - Add GetIPFamilyByIface(netns, ifName) in pkg/networking/networking/ip.go.
  - Remove GetIPFamilyByResultWithIface and the fallback branches it had.
  - command_add.go calls the new helper directly; PrevResult parsing in
    CmdAdd is now just a shape-validation pass.
  - GetIPFamilyByResult (pre-existing pure parser) is untouched for
    backward compat.

Unit tests for GetIPFamilyByIface narrowed to input validation (nil
netns, empty ifName) since the behavioral coverage (v4 / v6 / dual /
no-addrs / missing-iface) is exercised against real netlink in the
integration suite; gomonkey-based behavioral mocking proved flaky.

Refs spidernet-io#5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
@kevinpark1217

Copy link
Copy Markdown
Contributor Author

Thanks @cyclinder! What else is left to get this merged? Do I need another review from @weizhoublue or @lou-lan?

@cyclinder

Copy link
Copy Markdown
Collaborator

@kevinpark1217 Yes, it would be better to have the reviews from other maintainers.

/cc @weizhoublue @lou-lan

@cyclinder

Copy link
Copy Markdown
Collaborator

@kevinpark1217 Thanks for your contribution!

@cyclinder cyclinder merged commit a952178 into spidernet-io:main May 28, 2026
44 of 47 checks passed
@kevinpark1217 kevinpark1217 deleted the fix/5618-coordinator-slaac-v6-detection branch May 28, 2026 20:00
cyclinder added a commit that referenced this pull request May 29, 2026
* bump version to v1.2.0-rc1 (#5583)

* robot updates the release version of the README file based on the release tag: v1.1.2 (#5592)

Signed-off-by: weizhoublue <weizhou.lan@daocloud.io>
Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com>

* robot updates the release version of the README file based on the release tag: v1.0.6 (#5588)

Signed-off-by: weizhoublue <weizhou.lan@daocloud.io>
Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com>
Co-authored-by: Cyclinder <qifeng.guo@daocloud.io>

* fix: make Calico installation compatible with latest Calico and older Kubernetes versions (#5610)

* Initial plan

* fix: replace immutable Calico IPPool CIDR patch with delete+recreate

In newer Calico versions (v3.30+), spec.cidr is immutable and cannot be
changed via kubectl patch. The CI was failing with:

  The IPPool "default-ipv4-ippool" is invalid: spec.cidr: Invalid value:
  "string": CIDR cannot be changed; follow IP pool migration guide to
  avoid corruption.

Replace the kubectl patch approach with a replace_ippool_cidr() helper
function that:
1. Exports the current IPPool YAML
2. Strips server-managed fields
3. Updates spec.cidr to the desired value
4. Validates the new YAML with --dry-run=client before destructive ops
5. Deletes the existing pool
6. Recreates it with the updated YAML

Agent-Logs-Url: https://github.com/spidernet-io/spiderpool/sessions/59cffb22-cbbc-4471-a713-16f925218051

Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com>

* fix: strip x-kubernetes-validations from Calico CRDs for older K8s compatibility

Calico v3.32.0 CRDs use x-kubernetes-validations with:
- CEL functions like isCIDR() that require Kubernetes 1.29+
- Fields like 'reason'/'messageExpression' that require K8s 1.28+

These cause strict decoding errors and CEL evaluation failures when applying
Calico CRDs on older K8s versions (e.g. K8s v1.27.1) used in the test matrix.

Strip all x-kubernetes-validations from the downloaded Calico YAML before
applying it. This only removes CRD input validation hints; it does not affect
Calico's core networking functionality.

Agent-Logs-Url: https://github.com/spidernet-io/spiderpool/sessions/d8d45c5c-363b-499a-b09f-0cf90ec8d056

Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com>

* CI: Fix kdoctor connection issue (#5632)

* Stabilize Nightly K8s Matrix E2E cleanup by making IPPool deletion idempotent (#5636)

* Initial plan

* test(e2e): ignore notfound when deleting ippool by name

Agent-Logs-Url: https://github.com/spidernet-io/spiderpool/sessions/f9f1436f-70c0-4d93-8f46-ad7753b01d71

Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com>

* chore: finalize nightly k8s matrix ci fix status

Agent-Logs-Url: https://github.com/spidernet-io/spiderpool/sessions/f9f1436f-70c0-4d93-8f46-ad7753b01d71

Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com>

* chore: remove accidental local build artifacts

Agent-Logs-Url: https://github.com/spidernet-io/spiderpool/sessions/f9f1436f-70c0-4d93-8f46-ad7753b01d71

Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com>

* Stabilize subnet e2e IPPool record assertions in nightly IPv4 run (#5637)

* Initial plan

* test(e2e): stabilize subnet ippool record checks with retries

Agent-Logs-Url: https://github.com/spidernet-io/spiderpool/sessions/89afecd4-2b4f-400f-bcb7-8930fa84ff30

Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com>

* chore: revert unintended binary build artifacts

Agent-Logs-Url: https://github.com/spidernet-io/spiderpool/sessions/89afecd4-2b4f-400f-bcb7-8930fa84ff30

Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: cyclinder <59680092+cyclinder@users.noreply.github.com>

* Stabilize annotation E2E against transient namespace visibility race (#5642)

* Initial plan

* test(e2e): retry pod creation when namespace cache not ready

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>

* Stabilize KubeVirt install in nightly e2e by making virt-operator rollout timeout configurable (#5641)

* Initial plan

* test: make kubevirt rollout timeout configurable for CI stability

* chore: remove unintended build artifacts

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>

* coordinator: detect SLAAC v6 on iface when PrevResult.IPs lacks v6 (#5619)

* coordinator: detect SLAAC v6 via iface scan when PrevResult.IPs lacks v6

PrevResult.IPs is populated only by upstream IPAMs. v6 addresses obtained
via kernel SLAAC (RA-driven autoconf) never appear there even though they
are configured on the macvlan child, which leaves ipFamily stuck at
FAMILY_V4 and silently skips every v6 path in the coordinator: host /128
routes, hijack-route gateway, and sysctl reconciliation.

Add GetIPFamilyByResultWithIface that wraps the existing pure parser and
falls back to scanning the pod iface (in args.Netns) for non-link-local
v6 addresses. The coordinator opens the pod netns up front and uses the
new helper, so ipFamily resolves to FAMILY_ALL for the common
"DHCPv4 IPAM + kernel SLAAC v6" macvlan setup. CmdAdd logs at Info when
the fallback fires so operators can see why the family was upgraded.

The pure GetIPFamilyByResult is kept unchanged for callers without a
netns. New unit tests cover the parsing paths and every branch of the
fallback (iface scan mocked via gomonkey).

Fixes #5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>

* coordinator: filter tentative/DAD-failed and reserved v6 addresses in getAdders

RFC 4862 §5.4 ("Duplicate Address Detection") says a tentative address is
not considered assigned to an interface, and §5.4.5 says a DAD-failed
address MUST NOT be assigned. Without filtering these in getAdders, the
SLAAC v6 fallback path could install host /128 routes and NUD_PERMANENT
neighbor entries against an address that the kernel removes moments
later — observable in a small window during CNI ADD on busy macvlan
parents (vishvananda/netlink.AddrList does not flag-filter).

While here, also drop loopback (::1) and unspecified (::) — RFC 4291
§2.5.2/§2.5.3 reserved, IANA Source=False/Destination=False. They're
excluded by accident today because all current call sites filter "^lo$",
but adding them to the in-place predicate matches Calico's host-iface
filter and protects future refactors. Zero behavior change on the
documented call paths.

Verified against real Linux DAD on a veth pair (local integration test):
tentative SLAAC v6 is correctly filtered until DAD completes (~1-2s),
then picked up.

Refs #5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>

* networking: add real-netns integration tests for SLAAC v6 fallback

Mirrors the test pattern in cmd/spiderpool/cmd/command_test.go: standard
ginkgo _test.go (no build tag), label "unittest" so it's picked up by
make unittest-tests, real netns via testutils.NewNS, real interfaces via
netlink.LinkAdd.

Covers the same scenarios as the mocked unit tests plus one the mocks
can't exercise: a SLAAC v6 added without IFA_F_NODAD on a veth pair —
verifies the IFA_F_TENTATIVE filter ignores the address until kernel DAD
completes, then picks it up (RFC 4862 §5.4).

Lives in its own ginkgo suite (pkg/networking/networking/integration)
rather than the same _test.go as the mocked unit tests. testutils.NewNS
intentionally does not UnlockOSThread (containernetworking/plugins
testutils/netns_linux.go:116-118 — Go ≥1.10 kills the locked thread when
the goroutine exits), and the resulting OS-thread churn was making
gomonkey machine-code patches in sibling specs flaky depending on
ginkgo's spec randomization. Separate test binaries fully isolate the
two test styles; verified stable across 10 consecutive seeded runs.

Refs #5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>

* coordinator: actively solicit an RA when SLAAC v6 hasn't arrived yet

The passive iface scan in #5618's first commit only helps setups where the
SLAAC v6 is already on the iface at CNI-ADD time. For the common case
where it isn't yet — multus chains where macvlan brings the iface up
before tuning sets accept_ra=2, combined with the pod inheriting
net.ipv4.ip_forward=1 (k3s default), so the kernel skips the link-up RS
and ignores any periodic RA at accept_ra=1 (net/ipv6/addrconf.c
ipv6_accept_ra: forwarding=1 + accept_ra<2 returns false) — the passive
fix is a no-op.

Add SolicitRouterAndWaitForSLAACv6: sends an ICMPv6 Router Solicitation
to ff02::2 from inside the pod netns and polls AddrList until a usable
non-link-local v6 appears (or timeout). Two-phase wait under one
timeout budget:

  1. Wait for a non-tentative link-local (RFC 4862 §5.4: kernel refuses
     to source from a tentative address — observable as EADDRNOTAVAIL on
     sendmsg) — typically <1.5s.
  2. Send the RS (RFC 4861 §4.1, §6.3.5) and poll for the SLAAC GUA.

CmdAdd now resolves ipFamily in three layers: PrevResult.IPs → passive
iface scan → router-solicited active scan. The active path is only
exercised when the prior two come up v4-only, so the latency cost is
paid only when the fix is the user's last chance for v6 routing.

Integration tests cover: empty-input error, missing-iface error, no-
router timeout-without-error, and the early-return when a v6 is already
present (regression case for setups that don't need the active path).

Refs #5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>

* coordinator: gate the RS solicit on the kernel's accept_ra/forwarding state

Address two style-review nits and prevent a v4-only regression.

Gate: SolicitRouterAndWaitForSLAACv6 now reads
/proc/sys/net/ipv6/conf/<iface>/{accept_ra,forwarding} and skips the RS
unless the kernel's ipv6_accept_ra predicate (net/ipv6/addrconf.c) would
process the elicited RA. Before this, v4-only setups paid the full ~3s
solicit timeout on every CmdAdd despite the kernel ignoring the response.

GoDoc on SolicitRouterAndWaitForSLAACv6 trimmed from a 37-line RFC-cited
block to ~10 lines, matching the density of sibling exported helpers in
pkg/networking/networking/packet.go. The longer-form rationale lives in
the PR description and #5618.

integration/doc.go expanded with the gomonkey + testutils.NewNS()
OS-thread-leak rationale so reviewers see why this subpackage exists
(no precedent for `<pkg>/integration` in the repo otherwise).

Refs #5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>

* coordinator: clarify the kernel-drops-RA wording, note caller contract, test the gate

Three small follow-ups from review:

- Reword the SolicitRouterAndWaitForSLAACv6 docstring to say the kernel
  drops (not ignores) the early periodic RAs, and explicitly state the
  caller contract: accept_ra must be permissive on the iface before the
  function is invoked (typically via the upstream tuning plugin earlier
  in the NAD), otherwise the function exits without sending an RS.

- Add an integration test that exercises the gate's block path
  (accept_ra=1 + forwarding=1 → returns before Phase 1's first poll).
  Wraps sysctl writes in a Skip() guard so the test runs in CI (sudo on
  a real Linux host) but cleanly skips on contributor laptops where
  /proc/sys is read-only.

Refs #5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>

* coordinator: revert getAdders filter; drop active RS solicit per review

Two fixes rolled together because they touch adjacent code:

1. Revert the IFA_F_TENTATIVE/IFA_F_DADFAILED/IsLoopback/IsUnspecified
   filters in getAdders. The TENTATIVE filter broke setupNeighborhood's
   stale-entry cleanup — c.currentAddress now drops the pod's v6 while
   DAD is in progress, so the pre-populated stale neighbor on the host
   is never matched and the entry isn't deleted. Observable as e2e
   MacvlanOverlayOne / "auto clean up the dirty rules" timing out on
   dual / ipv6 matrix slots. The standards-review reasoning that
   "tentative is not assigned per RFC 4862 §5.4" is technically correct,
   but the rest of the coordinator codebase intentionally uses tentative
   addresses for neighbor / route bookkeeping (the pod is assigned that
   IP even if DAD hasn't completed). Filtering at the getAdders layer
   broke that contract.

2. Drop SolicitRouterAndWaitForSLAACv6 + defaultSLAACSolicitTimeout +
   the active-path call site in CmdAdd per @cyclinder's review on #5619.
   Coordinator now only does the passive iface scan when PrevResult.IPs
   lacks v6. Comment #1 (drop the prevResultFamily re-parse and the
   familyDiscovery tracking) folded in.

The active RS remains the right fix for setups hitting the
macvlan→tuning accept_ra-flip-after-link-up timing race where the kernel
drops the early periodic RAs and won't auto-resend RS. Revisitable as a
separate CNI plugin or follow-up PR after this lands.

Refs #5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>

* networking/integration: dedupe redundant DescribeTable entry

The previous "v4-only IPAM, no v6 on iface" and "v4 IPAM + only link-local
v6" entries had identical inputs (prev=[v4], ifaceV4=[v4], ifaceV6=nil)
and the same expected output (FAMILY_V4). The kernel auto-assigns fe80::
on every dummy iface at link-up regardless of what we add, so the "LL
filtered" semantic was already covered by the other entry. Collapsed
into a single entry with the explanatory description.

Refs #5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>

* coordinator: read iface for ipFamily, drop PrevResult parsing path

Per @cyclinder's review: scanning the iface in the pod netns is the
universal way to determine ipFamily — works for any IPAM and for kernel
SLAAC alike, simpler than the parse-PrevResult-then-fallback machinery
this PR previously added.

  - Add GetIPFamilyByIface(netns, ifName) in pkg/networking/networking/ip.go.
  - Remove GetIPFamilyByResultWithIface and the fallback branches it had.
  - command_add.go calls the new helper directly; PrevResult parsing in
    CmdAdd is now just a shape-validation pass.
  - GetIPFamilyByResult (pre-existing pure parser) is untouched for
    backward compat.

Unit tests for GetIPFamilyByIface narrowed to input validation (nil
netns, empty ifName) since the behavioral coverage (v4 / v6 / dual /
no-addrs / missing-iface) is exercised against real netlink in the
integration suite; gomonkey-based behavioral mocking proved flaky.

Refs #5618

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>

---------

Signed-off-by: Kevin Park <kevin.park1217@gmail.com>

* Add IaaS network provider integration support (#5573)

* Add IaaS network provider integration support

* add IaaS provider configuration parameters to values.yaml and README
* mount IaaS TLS secret to controller and agent pods
* add IaaS provider config to configmap
* implement IaaS client with mTLS support for allocate/release IP operations
* add IaaS config validation in controller and agent daemon startup
* integrate IaaS client into IPAM workflow

Signed-off-by: Cyclinder Kuo <qifeng.guo@daocloud.io>

* Add vlan-cni plugin support to spiderpool-plugins image

* add VLAN_VERSION build argument and environment variable
* clone vlan-cni repository and build vlan binary
* copy vlan binary to release image at /usr/plugins/vlan
* add --install-vlan flag to entrypoint.sh for conditional installation
* add vlan plugin installation logic with version logging
* export VLAN_VERSION in GitHub Actions workflow outputs
* set default VLAN_VERSION to 0.0.1 in version.sh

Signed-off-by: Cyclinder Kuo <qifeng.guo@daocloud.io>

* fix: move link setup from CmdAdd to DetectIPConflictAndGatewayReachable

* move netlink link setup logic from command_add.go to ipam_detection.go
* set link up inside DetectIPConflictAndGatewayReachable before detection
* change early return to continue when IP version is nil during detection
* remove netlink import from command_add.go

Signed-off-by: Cyclinder Kuo <qifeng.guo@daocloud.io>

---------

Signed-off-by: Cyclinder Kuo <qifeng.guo@daocloud.io>

* bump version to v1.2.0

---------

Signed-off-by: weizhoublue <weizhou.lan@daocloud.io>
Signed-off-by: Kevin Park <kevin.park1217@gmail.com>
Signed-off-by: Cyclinder Kuo <qifeng.guo@daocloud.io>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Kevin Park <kevin.park1217@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release/feature-changed release note for changed feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants