Skip to content

Conversation

@rjarry
Copy link
Collaborator

@rjarry rjarry commented Nov 6, 2025

Implement Link Aggregation Control Protocol (802.3ad/802.1AX) for bond interfaces. LACP provides dynamic link aggregation with automatic member discovery, synchronization, and failover based on protocol negotiation rather than simple link state monitoring.

Add a new GR_BOND_MODE_LACP operational mode alongside the existing active-backup mode. LACP mode includes three load balancing algorithms for distributing egress traffic across active members:

  • rss: Reuse hardware RSS hash when available for zero-cost balancing
  • l2: Hash based on destination MAC address and VLAN tag
  • l3+l4: Hash based on IP addresses and TCP/UDP port numbers

The control plane implementation handles LACP protocol state machines including receiving and validating LACP PDUs, tracking partner state, managing synchronization and collecting/distributing flags, and handling fast/slow periodic timers and timeout detection. A periodic timer runs every second to send LACP PDUs and detect partner timeouts based on the negotiated timeout intervals.

NB: port numbers in LACP PDUs are 1-based rather than 0-based to ensure interoperability with switches that reject port ID zero as invalid.

Members transition to active state when LACP negotiation succeeds and both sides report synchronized state.

Active members are added to a 256-entry redirection table that maps hash values to member indices. The table is populated by distributing active member IDs in round-robin fashion across all entries, ensuring even load distribution while maintaining flow consistency.

For example, with 3 active members, entry 0 maps to member 0, entry 1 to member 1, entry 2 to member 2, entry 3 back to member 0, and so on. Traffic is directed by computing a hash from packet headers, taking modulo 256 to get a table index, and using the stored member ID at that position. This provides deterministic per-flow member selection with minimal disruption when members are added or removed.

The bond interface transitions to running state when at least one member becomes active.

Summary by CodeRabbit

  • New Features

    • Added LACP (802.3ad) bond mode with selectable balancing algorithms (rss, l2, l3+l4) and per‑packet load distribution.
    • Dataplane/control: full LACP handling for sending/receiving PDUs and active member management.
    • Packet tracing: detailed LACP PDU state shown in traces.
  • CLI

    • Configure LACP mode and balancing algorithm with validation (mode-specific options).
  • Tests

    • Added a smoke test validating LACP bond setup, member sync, and connectivity.

@coderabbitai
Copy link

coderabbitai bot commented Nov 6, 2025

📝 Walkthrough

Walkthrough

Adds 802.3ad LACP support across API, control, datapath, CLI, build, and tests. API: new bond mode GR_BOND_MODE_LACP, bond algorithm enum (GR_BOND_ALGO_RSS, GR_BOND_ALGO_L2, GR_BOND_ALGO_L3_L4), GR_BOND_SET_ALGO, and iface_info_bond.algo. Control: LACP module with timers, lacp_input_cb, per-member LACP state, bond_update_active_members made public, and reconfig handling for algorithm. Datapath: lacp_input/lacp_output nodes, LACP PDU types/formatting, RSS/hash TX selection, and trace_lacp_format. CLI: parse/display of LACP and algorithm. Meson updates and a smoke test added.

Pre-merge checks

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 5.71% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'bond: add lacp support' clearly and specifically summarizes the main change: adding LACP (Link Aggregation Control Protocol) support to bond interfaces.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between afef63a and 5d8ef79.

📒 Files selected for processing (16)
  • modules/infra/api/gr_infra.h (3 hunks)
  • modules/infra/cli/bond.c (6 hunks)
  • modules/infra/control/bond.c (8 hunks)
  • modules/infra/control/gr_bond.h (3 hunks)
  • modules/infra/control/lacp.c (1 hunks)
  • modules/infra/control/meson.build (1 hunks)
  • modules/infra/datapath/bond_output.c (3 hunks)
  • modules/infra/datapath/gr_eth.h (1 hunks)
  • modules/infra/datapath/gr_lacp.h (1 hunks)
  • modules/infra/datapath/gr_trace.h (2 hunks)
  • modules/infra/datapath/lacp_input.c (1 hunks)
  • modules/infra/datapath/lacp_output.c (1 hunks)
  • modules/infra/datapath/meson.build (1 hunks)
  • modules/infra/datapath/port_rx.c (2 hunks)
  • modules/infra/datapath/trace.c (3 hunks)
  • smoke/bond_lacp_test.sh (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{c,h}

⚙️ CodeRabbit configuration file

**/*.{c,h}: - gr_vec_*() functions cannot fail. No need to check their return value.

  • gr_vec_free(x) always sets x = NULL. There is no risk of double free.
  • ec_node_*() functions consume their ec_node arguments. No leaks on error.
  • rte_node->ctx is an uint8_t array of size 16, not a pointer.
  • Don't suggest to replace assert() with graceful error checking.
  • We compile with -std=gnu2x. Unnamed parameters in function definitions are valid.

Files:

  • modules/infra/datapath/gr_eth.h
  • modules/infra/datapath/lacp_output.c
  • modules/infra/datapath/bond_output.c
  • modules/infra/control/lacp.c
  • modules/infra/datapath/gr_trace.h
  • modules/infra/api/gr_infra.h
  • modules/infra/control/gr_bond.h
  • modules/infra/control/bond.c
  • modules/infra/datapath/trace.c
  • modules/infra/cli/bond.c
  • modules/infra/datapath/lacp_input.c
  • modules/infra/datapath/port_rx.c
  • modules/infra/datapath/gr_lacp.h
**/*.sh

⚙️ CodeRabbit configuration file

**/*.sh: - Don't bother about unquoted shell variables.

Files:

  • smoke/bond_lacp_test.sh
🧠 Learnings (6)
📚 Learning: 2025-09-05T07:06:51.554Z
Learnt from: rjarry
Repo: DPDK/grout PR: 294
File: modules/policy/cli/meson.build:4-8
Timestamp: 2025-09-05T07:06:51.554Z
Learning: In this project, cli_src is a global variable that gets populated by individual modules (like modules/policy/cli/meson.build) and is ultimately consumed by an executable() call in the top-level meson.build file. This creates a modular CLI build where each module contributes its CLI sources to the main CLI binary.

Applied to files:

  • modules/infra/control/meson.build
📚 Learning: 2025-11-05T14:28:28.749Z
Learnt from: rjarry
Repo: DPDK/grout PR: 372
File: modules/infra/datapath/loop_input.c:53-56
Timestamp: 2025-11-05T14:28:28.749Z
Learning: In modules/infra/datapath/loop_input.c, the l3_edges array is intentionally indexed using network byte order (rte_be16_t) for EtherType values. Both loopback_input_add_type() registration and loopback_input_process() lookup use network byte order consistently, so no byte order conversion is needed when accessing l3_edges.

Applied to files:

  • modules/infra/datapath/gr_eth.h
  • modules/infra/datapath/lacp_output.c
  • modules/infra/datapath/bond_output.c
  • modules/infra/datapath/meson.build
  • modules/infra/datapath/gr_trace.h
  • modules/infra/datapath/trace.c
  • modules/infra/datapath/lacp_input.c
  • modules/infra/datapath/port_rx.c
  • modules/infra/datapath/gr_lacp.h
📚 Learning: 2025-11-05T13:55:26.169Z
Learnt from: maxime-leroy
Repo: DPDK/grout PR: 372
File: smoke/cross_vrf_forward_test.sh:18-18
Timestamp: 2025-11-05T13:55:26.169Z
Learning: In the DPDK/grout codebase, VRF interfaces (named gr-vrf<id>) are automatically created when an interface is added to a non-existing VRF using port_add. The VRF creation is handled automatically by the event system in vrf_netlink.c, so no explicit VRF interface creation commands are needed in test scripts.

Applied to files:

  • modules/infra/datapath/gr_eth.h
  • modules/infra/datapath/bond_output.c
  • modules/infra/datapath/gr_trace.h
  • modules/infra/control/gr_bond.h
📚 Learning: 2025-11-03T13:28:19.470Z
Learnt from: rjarry
Repo: DPDK/grout PR: 379
File: modules/infra/datapath/port_tx.c:36-46
Timestamp: 2025-11-03T13:28:19.470Z
Learning: In DPDK graph node process callbacks, the return value is only used for statistics and does not affect packet flow or scheduling through the graph. Nodes can return 0 when they haven't processed packets (e.g., when dropping or redirecting due to port state).

Applied to files:

  • modules/infra/datapath/lacp_output.c
  • modules/infra/datapath/lacp_input.c
  • modules/infra/datapath/port_rx.c
📚 Learning: 2025-10-21T15:42:43.874Z
Learnt from: rjarry
Repo: DPDK/grout PR: 350
File: modules/ip/control/address.c:214-216
Timestamp: 2025-10-21T15:42:43.874Z
Learning: In C code compiled with `-std=gnu2x`, the gr_vec_foreach macro supports inline variable declarations (e.g., `gr_vec_foreach (struct nexthop *nh, vector)`). This is valid C2x syntax and does not require pre-declaring the loop variable.

Applied to files:

  • modules/infra/control/gr_bond.h
  • modules/infra/datapath/trace.c
📚 Learning: 2025-10-08T21:22:45.922Z
Learnt from: rjarry
Repo: DPDK/grout PR: 334
File: modules/infra/control/worker.c:278-281
Timestamp: 2025-10-08T21:22:45.922Z
Learning: In the codebase, `gr_vec_add` is a macro that does not return any value and cannot fail. Do not suggest checking its return value or adding error handling around it.

Applied to files:

  • modules/infra/control/gr_bond.h
🧬 Code graph analysis (11)
modules/infra/datapath/gr_eth.h (1)
modules/infra/datapath/lacp_output.c (1)
  • lacp_send_pdu (30-42)
modules/infra/datapath/lacp_output.c (6)
modules/infra/control/gr_iface.h (1)
  • iface (16-22)
api/gr_errno.h (1)
  • errno_set (9-12)
modules/infra/datapath/control_input.c (2)
  • post_to_stack (45-54)
  • gr_control_input_register_handler (36-43)
modules/infra/datapath/gr_mbuf.h (1)
  • gr_mbuf_is_traced (52-54)
modules/infra/datapath/trace.c (2)
  • gr_mbuf_trace_add (592-619)
  • trace_lacp_format (128-151)
modules/infra/control/gr_graph.h (1)
  • rte_node_enqueue_x1 (22-25)
smoke/bond_lacp_test.sh (1)
smoke/_init.sh (2)
  • fail (102-105)
  • netns_add (111-119)
modules/infra/control/lacp.c (3)
api/gr_clock.h (1)
  • gr_clock_us (17-20)
modules/infra/control/bond.c (1)
  • bond_update_active_members (245-333)
modules/infra/datapath/lacp_output.c (1)
  • lacp_send_pdu (30-42)
modules/infra/datapath/gr_trace.h (1)
modules/infra/datapath/trace.c (1)
  • trace_lacp_format (128-151)
modules/infra/control/gr_bond.h (1)
modules/infra/control/bond.c (1)
  • bond_update_active_members (245-333)
modules/infra/control/bond.c (2)
modules/infra/control/iface.c (2)
  • iface_add_eth_addr (310-323)
  • iface_del_eth_addr (325-338)
main/gr_vec.h (1)
  • gr_vec_len (47-51)
modules/infra/cli/bond.c (3)
modules/infra/api/gr_infra.h (1)
  • gr_bond_algo_name (145-155)
api/gr_errno.h (1)
  • errno_set (9-12)
cli/ecoli.c (1)
  • arg_str (99-109)
modules/infra/datapath/lacp_input.c (5)
modules/infra/control/lacp.c (1)
  • lacp_input_cb (29-97)
modules/infra/datapath/gr_mbuf.h (1)
  • gr_mbuf_is_traced (52-54)
modules/infra/datapath/trace.c (2)
  • gr_mbuf_trace_add (592-619)
  • trace_lacp_format (128-151)
modules/infra/control/gr_graph.h (1)
  • rte_node_enqueue_x1 (22-25)
modules/infra/datapath/eth_input.c (1)
  • gr_eth_input_add_type (27-33)
modules/infra/datapath/port_rx.c (1)
modules/infra/control/gr_iface.h (1)
  • iface (16-22)
modules/infra/datapath/gr_lacp.h (1)
modules/infra/control/lacp.c (1)
  • lacp_input_cb (29-97)
🪛 Shellcheck (0.11.0)
smoke/bond_lacp_test.sh

[warning] 23-23: Quote this to prevent word splitting.

(SC2046)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04, false)
  • GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04, -Dfrr=enabled, true)
  • GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04, false)
  • GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled, false)
  • GitHub Check: build-cross-aarch64
  • GitHub Check: deb
  • GitHub Check: rpm
🔇 Additional comments (7)
modules/infra/datapath/gr_lacp.h (1)

1-104: LGTM! Well-structured LACP protocol definitions.

The LACP structures, enums, and constants accurately match the 802.3ad specification. Network byte order types are used appropriately, and the PDU layout is correct (110 bytes including padding).

modules/infra/datapath/meson.build (1)

12-13: LGTM!

Build file correctly includes the new LACP datapath modules.

modules/infra/datapath/gr_trace.h (1)

8-8: LGTM!

Header correctly includes gr_lacp.h for the LACP structures and declares the trace formatter consistently with existing formatters.

Also applies to: 44-44

modules/infra/datapath/trace.c (2)

128-151: LGTM! Clean LACP state formatting.

The implementation correctly formats all LACP state flags with proper space separation and error handling consistent with other trace formatters.


557-563: LGTM!

LACP tracing integration follows the established pattern for other protocols and correctly parses and formats LACP PDUs.

modules/infra/control/bond.c (2)

273-333: LGTM: LACP member selection and redirection logic.

The LACP mode implementation correctly:

  • Uses 1-based port numbers for interoperability (line 281)
  • Initializes LACP state on first use (line 288)
  • Collects active members based on link state and LACP negotiation (lines 302-309)
  • Populates the redirection table round-robin across active members (lines 315-317)
  • Manages interface RUNNING state and events appropriately (lines 318-329)

347-348: LGTM: Algo field handling.

The load-balancing algorithm configuration is correctly handled:

  • Defaults to RSS when not specified (line 348)
  • Properly exposed in the API (line 413)
  • LACP member active status correctly reported (lines 423-425)

Also applies to: 413-413, 423-425

Add LACP protocol definitions and packet trace formatting to support
debugging and monitoring of link aggregation.

Define the LACP PDU wire format according to IEEE 802.3ad including the
slow protocol subtype, TLV-based actor/partner/collector information,
and terminator. The lacp_participant structure contains system priority,
MAC address, aggregation key, port priority, port number, and state
flags. All multi-byte fields use network byte order with proper packing
and alignment attributes to match the on-wire format.

LACP state flags encode the port's operational state in an 8-bit field:
ACTIVE indicates spontaneous PDU transmission, FAST selects 1-second
intervals versus 30-second slow mode, AGGREGATABLE marks the port as
capable of aggregation, SYNCHRONIZED indicates matching configuration
with the partner, COLLECTING and DISTRIBUTING show the mux machine
state, while DEFAULTED and EXPIRED track timeout conditions.

Extend packet tracing to decode LACP PDUs transmitted on the slow
protocols ethertype. The trace formatter displays actor state flags as
human-readable strings such as "active fast aggreg sync collect distrib"
rather than raw hex values, simplifying protocol troubleshooting.

Signed-off-by: Robin Jarry <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (8)
modules/infra/control/lacp.c (2)

76-85: Gate activation on partner sync

We still flip member->active and local SYNCHRONIZED/COLLECTING bits before confirming the partner advertises SYNCHRONIZED+COLLECTING, so traffic can be steered onto a link whose peer will drop it. Delay the local state update and member->active until partner sync/collect flags are set.

-	member->active = true;
-	member->local.state |= LACP_STATE_SYNCHRONIZED;
-	member->local.state &= ~(LACP_STATE_EXPIRED | LACP_STATE_DEFAULTED);
-	if (member->remote.state & LACP_STATE_SYNCHRONIZED) {
-		member->local.state |= LACP_STATE_COLLECTING;
-		if (member->remote.state & LACP_STATE_COLLECTING)
-			member->local.state |= LACP_STATE_DISTRIBUTING;
-		else
-			member->local.state &= ~LACP_STATE_DISTRIBUTING;
-	}
+	bool partner_sync = (member->remote.state & LACP_STATE_SYNCHRONIZED) != 0;
+	bool partner_collect = (member->remote.state & LACP_STATE_COLLECTING) != 0;
+
+	member->local.state &=
+		~(LACP_STATE_SYNCHRONIZED | LACP_STATE_COLLECTING | LACP_STATE_DISTRIBUTING
+			| LACP_STATE_EXPIRED | LACP_STATE_DEFAULTED);
+	if (partner_sync) {
+		member->local.state |= LACP_STATE_SYNCHRONIZED | LACP_STATE_COLLECTING;
+		if (partner_collect)
+			member->local.state |= LACP_STATE_DISTRIBUTING;
+	}
+
+	member->active = partner_sync && partner_collect;

171-178: Keep retry flag when lacp_send() fails

If the send path returns an error we clear need_to_transmit and schedule the next interval, so no more PDUs go out and the partner ages us out. Leave the retry flag set (and skip rescheduling) when lacp_send() fails so the next tick retries.

-		if (lacp_send(member) < 0)
-			LOG(ERR, "lacp_send: %s", strerror(errno));
-
-		member->need_to_transmit = false;
-		if (member->remote.state & LACP_STATE_FAST)
+		if (lacp_send(member) < 0) {
+			LOG(ERR, "lacp_send: %s", strerror(errno));
+			member->need_to_transmit = true;
+			continue;
+		}
+
+		member->need_to_transmit = false;
+		if (member->remote.state & LACP_STATE_FAST)
modules/infra/datapath/gr_eth.h (1)

41-42: Include the LACP PDU definition

struct lacp_pdu isn’t visible here, so this header won’t compile. Pull in gr_lacp.h (or add a forward decl) before the prototype.

 #include <gr_graph.h>
 #include <gr_iface.h>
 #include <gr_mbuf.h>
+#include <gr_lacp.h>
modules/infra/cli/bond.c (1)

137-145: Require LACP mode before accepting ALGO

We still reject balance on LACP bonds and allow it on active-backup, blocking the new mode entirely. Flip the guard so only LACP accepts an algorithm.

-		if (bond->mode != GR_BOND_MODE_ACTIVE_BACKUP) {
+		if (bond->mode != GR_BOND_MODE_LACP) {
 			errno = EPROTOTYPE;
 			goto err;
 		}
modules/infra/datapath/lacp_input.c (1)

100-103: Guard trace from short PDUs

Frames that fail the length check still drop into this block via goto next, so *lacp reads beyond the captured payload. Please gate the trace on the size condition before copying.

Apply this diff:

-		if (gr_mbuf_is_traced(mbuf)) {
+		if (gr_mbuf_is_traced(mbuf) && rte_pktmbuf_pkt_len(mbuf) >= sizeof(*lacp)) {
 			struct lacp_pdu *t = gr_mbuf_trace_add(mbuf, node, sizeof(*t));
 			*t = *lacp;
 		}
modules/infra/control/bond.c (3)

194-195: LACP_DST_MAC not configured when switching to LACP mode without member changes.

Per previous review: bond_init_new_members is only called when GR_BOND_SET_MEMBERS is set (line 365). If the bond mode changes to GR_BOND_MODE_LACP via GR_BOND_SET_MODE alone (lines 344-345), existing members won't have LACP_DST_MAC configured, breaking LACP negotiation.

Additionally, if iface_add_eth_addr fails partway through the loop, previously configured members retain LACP_DST_MAC with no cleanup, leaving the bond in an inconsistent state.

Based on past review comments.


198-198: Critical: Index mismatch corrupts existing member state.

Loop variable i iterates over new->n_members, but bond->members[i] still contains the old member layout. Zeroing bond->members[i] here can overwrite an existing member's state at the wrong index. The array is only synchronized with the new layout at lines 370-372.

Example: if old members are [port5, port7] and new members are [port7, port9], when i=1 (new member port9), you'd zero bond->members[1] which still contains port7's state.

Based on past review comments.


232-237: Inconsistent error handling for MAC removal.

Line 232 doesn't check errno != ENOENT before logging a warning, unlike lines 219 and 226. This will produce spurious warnings when LACP_DST_MAC wasn't configured on the member.

Apply this diff for consistency:

-	if (iface_del_eth_addr(member->id, &LACP_DST_MAC) < 0) {
+	if (iface_del_eth_addr(member->id, &LACP_DST_MAC) < 0 && errno != ENOENT) {
 		LOG(WARNING,
 		    "failed to unconfigure mac address on member %s: %s",
 		    member->name,
 		    strerror(errno));
 	}

Based on past review comments.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d8ef79 and 4e104e7.

⛔ Files ignored due to path filters (1)
  • docs/graph.svg is excluded by !**/*.svg
📒 Files selected for processing (16)
  • modules/infra/api/gr_infra.h (3 hunks)
  • modules/infra/cli/bond.c (6 hunks)
  • modules/infra/control/bond.c (8 hunks)
  • modules/infra/control/gr_bond.h (3 hunks)
  • modules/infra/control/lacp.c (1 hunks)
  • modules/infra/control/meson.build (1 hunks)
  • modules/infra/datapath/bond_output.c (3 hunks)
  • modules/infra/datapath/gr_eth.h (1 hunks)
  • modules/infra/datapath/gr_lacp.h (1 hunks)
  • modules/infra/datapath/gr_trace.h (2 hunks)
  • modules/infra/datapath/lacp_input.c (1 hunks)
  • modules/infra/datapath/lacp_output.c (1 hunks)
  • modules/infra/datapath/meson.build (1 hunks)
  • modules/infra/datapath/port_rx.c (2 hunks)
  • modules/infra/datapath/trace.c (3 hunks)
  • smoke/bond_lacp_test.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • modules/infra/control/meson.build
  • modules/infra/control/gr_bond.h
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{c,h}

⚙️ CodeRabbit configuration file

**/*.{c,h}: - gr_vec_*() functions cannot fail. No need to check their return value.

  • gr_vec_free(x) always sets x = NULL. There is no risk of double free.
  • ec_node_*() functions consume their ec_node arguments. No leaks on error.
  • rte_node->ctx is an uint8_t array of size 16, not a pointer.
  • Don't suggest to replace assert() with graceful error checking.
  • We compile with -std=gnu2x. Unnamed parameters in function definitions are valid.

Files:

  • modules/infra/datapath/gr_eth.h
  • modules/infra/datapath/gr_trace.h
  • modules/infra/datapath/port_rx.c
  • modules/infra/datapath/lacp_output.c
  • modules/infra/datapath/bond_output.c
  • modules/infra/api/gr_infra.h
  • modules/infra/control/lacp.c
  • modules/infra/control/bond.c
  • modules/infra/datapath/lacp_input.c
  • modules/infra/datapath/trace.c
  • modules/infra/cli/bond.c
  • modules/infra/datapath/gr_lacp.h
**/*.sh

⚙️ CodeRabbit configuration file

**/*.sh: - Don't bother about unquoted shell variables.

Files:

  • smoke/bond_lacp_test.sh
🧠 Learnings (4)
📚 Learning: 2025-11-05T14:28:28.749Z
Learnt from: rjarry
Repo: DPDK/grout PR: 372
File: modules/infra/datapath/loop_input.c:53-56
Timestamp: 2025-11-05T14:28:28.749Z
Learning: In modules/infra/datapath/loop_input.c, the l3_edges array is intentionally indexed using network byte order (rte_be16_t) for EtherType values. Both loopback_input_add_type() registration and loopback_input_process() lookup use network byte order consistently, so no byte order conversion is needed when accessing l3_edges.

Applied to files:

  • modules/infra/datapath/gr_eth.h
  • modules/infra/datapath/gr_trace.h
  • modules/infra/datapath/port_rx.c
  • modules/infra/datapath/lacp_output.c
  • modules/infra/datapath/bond_output.c
  • modules/infra/datapath/lacp_input.c
  • modules/infra/datapath/meson.build
  • modules/infra/datapath/trace.c
  • modules/infra/datapath/gr_lacp.h
📚 Learning: 2025-11-05T13:55:26.169Z
Learnt from: maxime-leroy
Repo: DPDK/grout PR: 372
File: smoke/cross_vrf_forward_test.sh:18-18
Timestamp: 2025-11-05T13:55:26.169Z
Learning: In the DPDK/grout codebase, VRF interfaces (named gr-vrf<id>) are automatically created when an interface is added to a non-existing VRF using port_add. The VRF creation is handled automatically by the event system in vrf_netlink.c, so no explicit VRF interface creation commands are needed in test scripts.

Applied to files:

  • modules/infra/datapath/gr_eth.h
  • modules/infra/datapath/gr_trace.h
  • modules/infra/datapath/bond_output.c
📚 Learning: 2025-11-03T13:28:19.470Z
Learnt from: rjarry
Repo: DPDK/grout PR: 379
File: modules/infra/datapath/port_tx.c:36-46
Timestamp: 2025-11-03T13:28:19.470Z
Learning: In DPDK graph node process callbacks, the return value is only used for statistics and does not affect packet flow or scheduling through the graph. Nodes can return 0 when they haven't processed packets (e.g., when dropping or redirecting due to port state).

Applied to files:

  • modules/infra/datapath/port_rx.c
  • modules/infra/datapath/lacp_output.c
  • modules/infra/datapath/lacp_input.c
📚 Learning: 2025-10-21T15:42:43.874Z
Learnt from: rjarry
Repo: DPDK/grout PR: 350
File: modules/ip/control/address.c:214-216
Timestamp: 2025-10-21T15:42:43.874Z
Learning: In C code compiled with `-std=gnu2x`, the gr_vec_foreach macro supports inline variable declarations (e.g., `gr_vec_foreach (struct nexthop *nh, vector)`). This is valid C2x syntax and does not require pre-declaring the loop variable.

Applied to files:

  • modules/infra/datapath/trace.c
🧬 Code graph analysis (10)
modules/infra/datapath/gr_eth.h (1)
modules/infra/datapath/lacp_output.c (1)
  • lacp_send_pdu (30-42)
modules/infra/datapath/gr_trace.h (1)
modules/infra/datapath/trace.c (1)
  • trace_lacp_format (128-151)
modules/infra/datapath/port_rx.c (1)
modules/infra/control/gr_iface.h (1)
  • iface (16-22)
modules/infra/datapath/lacp_output.c (6)
modules/infra/control/gr_iface.h (1)
  • iface (16-22)
api/gr_errno.h (1)
  • errno_set (9-12)
modules/infra/datapath/control_input.c (2)
  • post_to_stack (45-54)
  • gr_control_input_register_handler (36-43)
modules/infra/datapath/gr_mbuf.h (1)
  • gr_mbuf_is_traced (52-54)
modules/infra/datapath/trace.c (2)
  • gr_mbuf_trace_add (592-619)
  • trace_lacp_format (128-151)
modules/infra/control/gr_graph.h (1)
  • rte_node_enqueue_x1 (22-25)
modules/infra/control/lacp.c (3)
api/gr_clock.h (1)
  • gr_clock_us (17-20)
modules/infra/control/bond.c (1)
  • bond_update_active_members (245-333)
modules/infra/datapath/lacp_output.c (1)
  • lacp_send_pdu (30-42)
modules/infra/control/bond.c (3)
modules/infra/control/iface.c (2)
  • iface_add_eth_addr (310-323)
  • iface_del_eth_addr (325-338)
modules/infra/control/gr_iface.h (1)
  • iface (16-22)
main/gr_vec.h (1)
  • gr_vec_len (47-51)
modules/infra/datapath/lacp_input.c (5)
modules/infra/control/lacp.c (1)
  • lacp_input_cb (29-97)
modules/infra/datapath/gr_mbuf.h (1)
  • gr_mbuf_is_traced (52-54)
modules/infra/datapath/trace.c (2)
  • gr_mbuf_trace_add (592-619)
  • trace_lacp_format (128-151)
modules/infra/control/gr_graph.h (1)
  • rte_node_enqueue_x1 (22-25)
modules/infra/datapath/eth_input.c (1)
  • gr_eth_input_add_type (27-33)
smoke/bond_lacp_test.sh (1)
smoke/_init.sh (2)
  • fail (102-105)
  • netns_add (111-119)
modules/infra/cli/bond.c (3)
modules/infra/api/gr_infra.h (1)
  • gr_bond_algo_name (145-155)
api/gr_errno.h (1)
  • errno_set (9-12)
cli/ecoli.c (1)
  • arg_str (99-109)
modules/infra/datapath/gr_lacp.h (1)
modules/infra/control/lacp.c (1)
  • lacp_input_cb (29-97)
🪛 Shellcheck (0.11.0)
smoke/bond_lacp_test.sh

[warning] 23-23: Quote this to prevent word splitting.

(SC2046)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: build-and-tests (clang-15, none, debugoptimized, ubuntu-22.04, false)
  • GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04, false)
  • GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04, false)
  • GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled, false)
  • GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04, -Dfrr=enabled, true)
  • GitHub Check: build-cross-aarch64
  • GitHub Check: deb
  • GitHub Check: rpm
🔇 Additional comments (5)
modules/infra/datapath/port_rx.c (1)

42-76: LGTM! Correct routing of LACP control frames to physical interfaces.

The logic correctly distinguishes SLOW protocol frames (LACP) from regular traffic. LACP control frames are routed to the physical member interface (ctx->iface) for per-member processing, while data frames go to the bond interface. Byte order handling is consistent with network byte order used throughout the codebase.

modules/infra/control/bond.c (4)

245-333: LGTM: LACP mode and redirection table logic.

Function promoted to public for external calls. LACP mode correctly:

  • Uses 1-based port numbers (line 281) for interoperability
  • Initializes local LACP fields with bond configuration (lines 282-287)
  • Resets state only when last_rx == 0 (lines 288-299)
  • Adds active members meeting link/sync criteria (lines 302-309)

Redirection table (lines 315-317) distributes active member IDs round-robin across 256 entries, ensuring even distribution per PR objectives. RUNNING state management (lines 318-329) correctly pushes status events when active member count transitions.


347-348: LGTM: Algorithm reconfiguration.

Correctly handles GR_BOND_SET_ALGO with sensible default to GR_BOND_ALGO_RSS.


413-413: LGTM: Algorithm exposed in public API.

Correctly exposes bond->algo in the public API surface.


423-425: LGTM: LACP member active status.

For LACP mode, member active flag correctly reflects bond->members[i].active (set by LACP state machine), unlike active-backup which has a single active member.

Implement Link Aggregation Control Protocol (802.3ad/802.1AX) for bond
interfaces. LACP provides dynamic link aggregation with automatic member
discovery, synchronization, and failover based on protocol negotiation
rather than simple link state monitoring.

Add a new GR_BOND_MODE_LACP operational mode alongside the existing
active-backup mode. LACP mode includes three load balancing algorithms
for distributing egress traffic across active members:

- rss: Reuse hardware RSS hash when available for zero-cost balancing
- l2: Hash based on destination MAC address and VLAN tag
- l3+l4: Hash based on IP addresses and TCP/UDP port numbers

The control plane implementation handles LACP protocol state machines
including receiving and validating LACP PDUs, tracking partner state,
managing synchronization and collecting/distributing flags, and handling
fast/slow periodic timers and timeout detection. A periodic timer runs
every second to send LACP PDUs and detect partner timeouts based on the
negotiated timeout intervals.

NB: port numbers in LACP PDUs are 1-based rather than 0-based to ensure
interoperability with switches that reject port ID zero as invalid.

Members transition to active state when LACP negotiation succeeds and
both sides report synchronized state.

Active members are added to a 256-entry redirection table that maps hash
values to member indices. The table is populated by distributing active
member IDs in round-robin fashion across all entries, ensuring even load
distribution while maintaining flow consistency.

For example, with 3 active members, entry 0 maps to member 0, entry 1 to
member 1, entry 2 to member 2, entry 3 back to member 0, and so on.
Traffic is directed by computing a hash from packet headers, taking
modulo 256 to get a table index, and using the stored member ID at that
position. This provides deterministic per-flow member selection with
minimal disruption when members are added or removed.

The bond interface transitions to running state when at least one member
becomes active.

Signed-off-by: Robin Jarry <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (4)
modules/infra/datapath/lacp_input.c (1)

66-69: Guard trace against runt PDUs.

When validation fails at line 44 and edge is set to INVALID_PDU, execution continues to the trace block. Dereferencing *lacp on a runt packet reads past the mbuf, causing out-of-bounds access.

Apply this diff:

-		if (gr_mbuf_is_traced(mbuf)) {
+		if (gr_mbuf_is_traced(mbuf) && edge == TO_CONTROL) {
 			struct lacp_pdu *t = gr_mbuf_trace_add(mbuf, node, sizeof(*t));
 			*t = *lacp;
 		}
modules/infra/datapath/bond_output.c (1)

43-166: Fix tuple storage to prevent undefined behavior.

tuple.u32 is a single 4-byte word, but rte_softrss_be(&tuple.u32, len / sizeof(uint32_t), ...) is called with len up to 40 bytes (IPv6 case), reading far past the object boundary. Additionally, the union is uninitialized, leaving padding bytes with garbage values that cause non-deterministic hashing.

Apply this diff:

 	union {
-		uint32_t u32;
+		uint32_t u32[sizeof(struct rte_ipv6_tuple) / sizeof(uint32_t)];
 		struct {
 			struct rte_ether_addr mac;
 			rte_be16_t vlan_id;
 		} l2;
 		struct rte_ipv4_tuple v4;
 		struct rte_ipv6_tuple v6;
-	} tuple;
+	} tuple = {0};
...
-	hash = rte_softrss_be(&tuple.u32, len / sizeof(uint32_t), rss_key);
+	hash = rte_softrss_be(tuple.u32, len / sizeof(uint32_t), rss_key);
modules/infra/control/bond.c (2)

194-195: LACP_DST_MAC not configured when switching modes.

bond_init_new_members (which adds LACP_DST_MAC) is only called when GR_BOND_SET_MEMBERS is set (line 365). If a bond's mode changes from ACTIVE_BACKUP to LACP via GR_BOND_SET_MODE alone (lines 344-345), existing members won't receive LACP_DST_MAC, breaking LACP negotiation.

Fix: When mode changes to GR_BOND_MODE_LACP, add LACP_DST_MAC to all existing members; when leaving LACP mode, remove it.

 	if (set_attrs & GR_BOND_SET_MODE)
 		bond->mode = api->mode;
+		if (api->mode == GR_BOND_MODE_LACP) {
+			for (uint8_t i = 0; i < bond->n_members; i++) {
+				const struct iface *member = bond->members[i].iface;
+				if (iface_add_eth_addr(member->id, &LACP_DST_MAC) < 0)
+					return errno_log(errno, "iface_add_eth_addr(lacp)");
+			}
+		}

231-236: Inconsistent error handling for LACP_DST_MAC removal.

Lines 218 and 225 suppress warnings when errno == ENOENT, but this call doesn't. This produces spurious warnings when LACP_DST_MAC wasn't configured.

-	if (iface_del_eth_addr(member->id, &LACP_DST_MAC) < 0) {
+	if (iface_del_eth_addr(member->id, &LACP_DST_MAC) < 0 && errno != ENOENT) {
 		LOG(WARNING,
 		    "failed to unconfigure mac address on member %s: %s",
 		    member->name,
 		    strerror(errno));
 	}
🧹 Nitpick comments (1)
modules/infra/datapath/port_rx.c (1)

42-76: LGTM: Correct LACP packet routing.

The logic correctly routes LACP packets (SLOW protocol) to the physical interface (ctx->iface) while regular packets go to the logical interface (iface). This is necessary because LACP operates per-member. The unlikely() hint on line 72 is appropriate since LACP frames are rare compared to data traffic.

Optionally, consider adding a brief comment before line 72 explaining why SLOW protocol packets receive special treatment:

+		// LACP packets must be processed on the physical interface
 		if (unlikely(eth->ether_type == RTE_BE16(RTE_ETHER_TYPE_SLOW)))
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7054935 and af71aee.

⛔ Files ignored due to path filters (1)
  • docs/graph.svg is excluded by !**/*.svg
📒 Files selected for processing (14)
  • modules/infra/api/gr_infra.h (3 hunks)
  • modules/infra/cli/bond.c (6 hunks)
  • modules/infra/control/bond.c (8 hunks)
  • modules/infra/control/gr_bond.h (3 hunks)
  • modules/infra/control/lacp.c (1 hunks)
  • modules/infra/control/meson.build (1 hunks)
  • modules/infra/datapath/bond_output.c (3 hunks)
  • modules/infra/datapath/gr_eth.h (1 hunks)
  • modules/infra/datapath/gr_lacp.h (1 hunks)
  • modules/infra/datapath/lacp_input.c (1 hunks)
  • modules/infra/datapath/lacp_output.c (1 hunks)
  • modules/infra/datapath/meson.build (1 hunks)
  • modules/infra/datapath/port_rx.c (2 hunks)
  • smoke/bond_lacp_test.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • modules/infra/control/meson.build
  • modules/infra/cli/bond.c
  • modules/infra/datapath/gr_eth.h
  • modules/infra/control/lacp.c
  • modules/infra/datapath/meson.build
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{c,h}

⚙️ CodeRabbit configuration file

**/*.{c,h}: - gr_vec_*() functions cannot fail. No need to check their return value.

  • gr_vec_free(x) always sets x = NULL. There is no risk of double free.
  • ec_node_*() functions consume their ec_node arguments. No leaks on error.
  • rte_node->ctx is an uint8_t array of size 16, not a pointer.
  • Don't suggest to replace assert() with graceful error checking.
  • We compile with -std=gnu2x. Unnamed parameters in function definitions are valid.

Files:

  • modules/infra/control/gr_bond.h
  • modules/infra/datapath/lacp_input.c
  • modules/infra/api/gr_infra.h
  • modules/infra/datapath/bond_output.c
  • modules/infra/datapath/port_rx.c
  • modules/infra/datapath/gr_lacp.h
  • modules/infra/datapath/lacp_output.c
  • modules/infra/control/bond.c
**/*.sh

⚙️ CodeRabbit configuration file

**/*.sh: - Don't bother about unquoted shell variables.

Files:

  • smoke/bond_lacp_test.sh
🧠 Learnings (7)
📚 Learning: 2025-11-05T13:55:26.169Z
Learnt from: maxime-leroy
Repo: DPDK/grout PR: 372
File: smoke/cross_vrf_forward_test.sh:18-18
Timestamp: 2025-11-05T13:55:26.169Z
Learning: In the DPDK/grout codebase, VRF interfaces (named gr-vrf<id>) are automatically created when an interface is added to a non-existing VRF using port_add. The VRF creation is handled automatically by the event system in vrf_netlink.c, so no explicit VRF interface creation commands are needed in test scripts.

Applied to files:

  • modules/infra/control/gr_bond.h
  • modules/infra/datapath/bond_output.c
📚 Learning: 2025-10-21T15:42:43.874Z
Learnt from: rjarry
Repo: DPDK/grout PR: 350
File: modules/ip/control/address.c:214-216
Timestamp: 2025-10-21T15:42:43.874Z
Learning: In C code compiled with `-std=gnu2x`, the gr_vec_foreach macro supports inline variable declarations (e.g., `gr_vec_foreach (struct nexthop *nh, vector)`). This is valid C2x syntax and does not require pre-declaring the loop variable.

Applied to files:

  • modules/infra/control/gr_bond.h
📚 Learning: 2025-10-08T21:22:45.922Z
Learnt from: rjarry
Repo: DPDK/grout PR: 334
File: modules/infra/control/worker.c:278-281
Timestamp: 2025-10-08T21:22:45.922Z
Learning: In the codebase, `gr_vec_add` is a macro that does not return any value and cannot fail. Do not suggest checking its return value or adding error handling around it.

Applied to files:

  • modules/infra/control/gr_bond.h
📚 Learning: 2025-11-05T14:28:28.749Z
Learnt from: rjarry
Repo: DPDK/grout PR: 372
File: modules/infra/datapath/loop_input.c:53-56
Timestamp: 2025-11-05T14:28:28.749Z
Learning: In modules/infra/datapath/loop_input.c, the l3_edges array is intentionally indexed using network byte order (rte_be16_t) for EtherType values. Both loopback_input_add_type() registration and loopback_input_process() lookup use network byte order consistently, so no byte order conversion is needed when accessing l3_edges.

Applied to files:

  • modules/infra/datapath/lacp_input.c
  • modules/infra/datapath/bond_output.c
  • modules/infra/datapath/port_rx.c
  • modules/infra/datapath/gr_lacp.h
  • modules/infra/datapath/lacp_output.c
📚 Learning: 2025-11-03T13:28:19.470Z
Learnt from: rjarry
Repo: DPDK/grout PR: 379
File: modules/infra/datapath/port_tx.c:36-46
Timestamp: 2025-11-03T13:28:19.470Z
Learning: In DPDK graph node process callbacks, the return value is only used for statistics and does not affect packet flow or scheduling through the graph. Nodes can return 0 when they haven't processed packets (e.g., when dropping or redirecting due to port state).

Applied to files:

  • modules/infra/datapath/lacp_input.c
  • modules/infra/datapath/port_rx.c
  • modules/infra/datapath/lacp_output.c
📚 Learning: 2025-09-09T09:22:31.596Z
Learnt from: maxime-leroy
Repo: DPDK/grout PR: 309
File: modules/srv6/datapath/srv6_local.c:97-101
Timestamp: 2025-09-09T09:22:31.596Z
Learning: In SRv6 IPv6 extension header parsing, when is_ipv6_ext[proto] is true, rte_ipv6_get_next_ext() will not return a negative value, making error checking unnecessary. An assert can be used as a defensive measure for future DPDK compatibility.

Applied to files:

  • modules/infra/datapath/bond_output.c
📚 Learning: 2025-09-09T06:02:47.703Z
Learnt from: maxime-leroy
Repo: DPDK/grout PR: 309
File: modules/srv6/datapath/srv6_local.c:88-122
Timestamp: 2025-09-09T06:02:47.703Z
Learning: The DPDK function rte_ipv6_get_next_ext() only reads 2 bytes from the extension header, so a 2-byte precheck is sufficient before calling it.

Applied to files:

  • modules/infra/datapath/bond_output.c
🧬 Code graph analysis (7)
modules/infra/control/gr_bond.h (1)
modules/infra/control/bond.c (1)
  • bond_update_active_members (245-333)
smoke/bond_lacp_test.sh (1)
smoke/_init.sh (2)
  • fail (102-105)
  • netns_add (111-119)
modules/infra/datapath/lacp_input.c (5)
modules/infra/control/lacp.c (1)
  • lacp_input_cb (29-99)
modules/infra/datapath/gr_mbuf.h (1)
  • gr_mbuf_is_traced (52-54)
modules/infra/datapath/trace.c (2)
  • gr_mbuf_trace_add (592-619)
  • trace_lacp_format (128-151)
modules/infra/control/gr_graph.h (1)
  • rte_node_enqueue_x1 (22-25)
modules/infra/datapath/eth_input.c (1)
  • gr_eth_input_add_type (27-33)
modules/infra/datapath/port_rx.c (1)
modules/infra/control/gr_iface.h (1)
  • iface (16-22)
modules/infra/datapath/gr_lacp.h (1)
modules/infra/control/lacp.c (1)
  • lacp_input_cb (29-99)
modules/infra/datapath/lacp_output.c (6)
modules/infra/control/gr_iface.h (1)
  • iface (16-22)
api/gr_errno.h (1)
  • errno_set (9-12)
modules/infra/datapath/control_input.c (2)
  • post_to_stack (45-54)
  • gr_control_input_register_handler (36-43)
modules/infra/datapath/gr_mbuf.h (1)
  • gr_mbuf_is_traced (52-54)
modules/infra/datapath/trace.c (2)
  • gr_mbuf_trace_add (592-619)
  • trace_lacp_format (128-151)
modules/infra/control/gr_graph.h (1)
  • rte_node_enqueue_x1 (22-25)
modules/infra/control/bond.c (2)
modules/infra/control/iface.c (2)
  • iface_add_eth_addr (310-323)
  • iface_del_eth_addr (325-338)
main/gr_vec.h (1)
  • gr_vec_len (47-51)
🪛 Shellcheck (0.11.0)
smoke/bond_lacp_test.sh

[warning] 23-23: Quote this to prevent word splitting.

(SC2046)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04, false)
  • GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04, -Dfrr=enabled, true)
  • GitHub Check: build-and-tests (clang-15, none, debugoptimized, ubuntu-22.04, false)
  • GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled, false)
  • GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04, false)
  • GitHub Check: build-cross-aarch64
  • GitHub Check: rpm
  • GitHub Check: deb
🔇 Additional comments (7)
modules/infra/api/gr_infra.h (1)

125-155: LGTM! Clean API extension.

The LACP mode and algorithm enums are well-structured with appropriate string conversion functions. Starting enum values at 1 avoids accidental zero-initialization bugs.

modules/infra/datapath/gr_lacp.h (1)

65-104: LGTM! Protocol structures correctly defined.

The LACP PDU and participant structures properly use packed/aligned attributes and include necessary padding for 802.3ad compliance. The standard LACP multicast MAC is correctly defined.

modules/infra/control/gr_bond.h (1)

15-41: LGTM! Well-structured LACP integration.

The extended bond_member structure includes all necessary LACP state tracking fields. The 256-entry redirection table provides good hash distribution granularity, and exposing bond_update_active_members enables the LACP control module to trigger member list updates.

modules/infra/datapath/lacp_output.c (1)

30-76: LGTM! Robust error handling.

The function properly handles allocation failures, mbuf append errors, and cleanup. The LACP PDU is correctly routed with appropriate destination MAC and EtherType.

smoke/bond_lacp_test.sh (1)

5-46: LGTM! Comprehensive LACP test.

The test properly validates LACP negotiation by waiting for member synchronization before checking connectivity. The 4-second timeout is appropriate for LACP fast mode negotiation.

Note: Static analysis warning about unquoted $(dirname $0) can be safely ignored per coding guidelines.

modules/infra/control/bond.c (2)

245-333: LGTM! Solid LACP member management.

The refactored bond_update_active_members correctly handles both modes: maintains single active member for ACTIVE_BACKUP, and for LACP initializes per-member LACP fields, determines active members based on link state and LACP negotiation, populates the redirection table, and properly manages RUNNING state transitions with appropriate events.


347-348: LGTM! Algorithm handling is correct.

The algo field is properly handled during reconfiguration (defaulting to RSS), exported via bond_to_api, and correctly reported per-member active state for LACP mode.

Also applies to: 413-413, 423-425

Copy link
Collaborator

@aharivel aharivel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall very good series. Some remarks from Claude.

case GR_BOND_ALGO_L2:
eth = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
tuple.l2.mac = eth->dst_addr;
if (eth->ether_type == RTE_BE16(RTE_ETHER_TYPE_VLAN)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No validation that the packet contains a complete VLAN header before accessing vlan->vlan_tci.
The PAYLOAD macro simply casts (eth + 1) without bounds checking.
Maybe something like:

if (eth->ether_type == RTE_BE16(RTE_ETHER_TYPE_VLAN)) {                                                                      
      if (unlikely(m->data_len < sizeof(*eth) + sizeof(*vlan))) {
          tuple.l2.vlan_id = 0;                                
      } else {                                          
          vlan = PAYLOAD(eth);                                                                                                 
          tuple.l2.vlan_id = vlan->vlan_tci;
      } 

case GR_BOND_ALGO_L3_L4:
eth = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
tuple.l2.mac = eth->dst_addr;
if (eth->ether_type == RTE_BE16(RTE_ETHER_TYPE_VLAN)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as Issue than above - no bounds validation before accessing VLAN header

}
switch (eth_type) {
case RTE_BE16(RTE_ETHER_TYPE_IPV4): {
l3.ip4 = rte_pktmbuf_mtod_offset(m, struct rte_ipv4_hdr *, l3_offset);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No validation that the packet has at least l3_offset + sizeof(struct rte_ipv4_hdr) bytes before dereferencing IPv4 header fields.
Maybe:

if (unlikely(m->data_len < l3_offset + sizeof(*l3.ip4))) {                                                                   
     len = sizeof(tuple.l2);                                                                                                                                                                                                                                  
     break;                                                   
 }             
 l3.ip4 = rte_pktmbuf_mtod_offset(m, struct rte_ipv4_hdr *, l3_offset); 

l3.ip4 = rte_pktmbuf_mtod_offset(m, struct rte_ipv4_hdr *, l3_offset);
tuple.v4.src_addr = l3.ip4->src_addr;
tuple.v4.dst_addr = l3.ip4->dst_addr;
switch (l3.ip4->next_proto_id) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem:

  1. Fragmented IP packets won't have L4 headers in non-first fragments - accessing them causes undefined behavior
  2. No bounds check that L4 header is within packet data

Maybe:

  // Check for fragmentation
  if (l3.ip4->fragment_offset & RTE_BE16(RTE_IPV4_HDR_MF_FLAG | RTE_IPV4_HDR_OFFSET_MASK)) {
      // Fragmented packet - only hash on L3
      tuple.v4.sport = 0;
      tuple.v4.dport = 0;
      len = sizeof(tuple.v4);
      break;
  }

  switch (l3.ip4->next_proto_id) {
  case IPPROTO_UDP:
      if (unlikely(m->data_len < l3_offset + rte_ipv4_hdr_len(l3.ip4) + sizeof(*l4.udp))) {
          tuple.v4.sport = 0;
          tuple.v4.dport = 0;
      } else {
          l4.udp = rte_pktmbuf_mtod_offset(...);
          tuple.v4.sport = l4.udp->src_port;
          tuple.v4.dport = l4.udp->dst_port;
      }

len = sizeof(tuple.v4);
break;
}
case RTE_BE16(RTE_ETHER_TYPE_IPV6): {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem:

  1. IPv6 extension headers are not handled - l3.ip6->proto might be a routing header or fragment header, not the final protocol
  2. No bounds checks before accessing IPv6 and L4 headers
  3. IPv6 fragments (when present in fragment extension header) can't have L4 in non-first fragments

Maybe:

  case RTE_BE16(RTE_ETHER_TYPE_IPV6): {
      if (unlikely(m->data_len < l3_offset + sizeof(*l3.ip6))) {
          len = sizeof(tuple.l2);
          break;
      }
      l3.ip6 = rte_pktmbuf_mtod_offset(m, const struct rte_ipv6_hdr *, l3_offset);
      tuple.v6.src_addr = l3.ip6->src_addr;
      tuple.v6.dst_addr = l3.ip6->dst_addr;

      // TODO: Handle IPv6 extension headers properly
      // For now, only hash on L3 if extension headers present
      uint8_t next_proto = l3.ip6->proto;
      if (next_proto != IPPROTO_UDP && next_proto != IPPROTO_TCP) {
          tuple.v6.sport = 0;
          tuple.v6.dport = 0;
          len = sizeof(tuple.v6);
          break;
      }

hash = m->hash.rss;
goto out;
}
// fallthrough
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C23 and modern compilers prefer explicit fallthrough attributes for clarity and static analysis.

__attribute__((fallthrough));

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants