SR-OS ECMP: device ecmp 1 (single path) vs Batfish all-equal-cost IGP paths #200

@dhalperi

Description

SR-OS ECMP semantics: device ecmp 1 (single best path) vs Batfish (all equal-cost IGP paths)

SR OS installs a single best path per prefix by default (ecmp 1), so its
route table reports one next-hop even when several equal-cost IGP paths exist.
Batfish has no per-IGP ECMP limit and always installs every equal-cost path
(PsThenLoadBalance: "Batfish models all ECMP paths"; OSPF/IS-IS have no
vendor-independent (VI) ECMP knob). For an equidistant prefix, Batfish
therefore holds the device's chosen next-hop plus extra equal-cost legs, and
test_main_rib_routes flags the surplus legs as Batfish-only routes.
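
A minimal sketch of how the surplus legs show up, assuming routes are compared as (prefix, next-hop) pairs. The `Route` type, the `batfish_only_routes` helper, and the next-hop addresses are hypothetical illustrations, not the actual test_main_rib_routes code or lab data.

```python
from __future__ import annotations

from typing import NamedTuple


class Route(NamedTuple):
    """Hypothetical simplified main-RIB entry: one row per next-hop leg."""
    prefix: str
    next_hop: str


def batfish_only_routes(device: set[Route], batfish: set[Route]) -> set[Route]:
    """Return the legs Batfish installs that the device does not."""
    return batfish - device


# Device with ecmp 1: one next-hop for the prefix (addresses are made up).
device_rib = {Route("10.10.10.20/32", "192.0.2.1")}

# Batfish: every equal-cost leg is installed.
batfish_rib = {
    Route("10.10.10.20/32", "192.0.2.1"),
    Route("10.10.10.20/32", "192.0.2.5"),
}

surplus = batfish_only_routes(device_rib, batfish_rib)
```

Here `surplus` contains only the second leg, which is what a strict comparison reports as a Batfish-only route.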

Seen in the sros_services lab: 10.10.10.20/32 (and symmetric prefixes) on
p1/p2/pe2/pe4 — the device installs one OSPF next-hop, Batfish installs
2–4 equal-cost legs. These nodes' main-RIB tests are sickbay'd to this issue.

Why not a validator-side workaround

An earlier attempt forgave the surplus legs in SrosValidator whenever the
device installed a single next-hop. That globally weakens the cost matcher for
every SR-OS lab and masks a real failure mode — "device has 1 path, Batfish
computes the right one plus wrong extras" (e.g. a metric miscomputation that
creates a spurious tie) would pass. Reverted; the mismatch is sickbay'd per-lab
instead so the matcher stays strict everywhere.
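
To illustrate the masking problem, here is a sketch under stated assumptions: `strict_match` and `forgiving_match` are hypothetical stand-ins for the two matcher behaviours, not the real SrosValidator code, and the addresses are invented.

```python
def strict_match(device_hops: set, batfish_hops: set) -> bool:
    """Strict comparison: both sides must hold exactly the same next-hops."""
    return device_hops == batfish_hops


def forgiving_match(device_hops: set, batfish_hops: set) -> bool:
    """Hypothetical workaround: if the device has one next-hop, accept any
    Batfish set that contains it; otherwise fall back to strict equality."""
    if len(device_hops) == 1:
        return device_hops <= batfish_hops
    return device_hops == batfish_hops


device = {"192.0.2.1"}

# Case A: Batfish adds a genuine equal-cost leg (ECMP over-approximation).
genuine_ecmp = {"192.0.2.1", "192.0.2.5"}

# Case B: Batfish adds a wrong leg, e.g. from a miscomputed metric that
# creates a spurious tie.
spurious_tie = {"192.0.2.1", "198.51.100.9"}

results = {
    "strict": (strict_match(device, genuine_ecmp), strict_match(device, spurious_tie)),
    "forgiving": (forgiving_match(device, genuine_ecmp), forgiving_match(device, spurious_tie)),
}
```

The forgiving matcher returns the same answer for both cases because it never inspects the extra legs, so it cannot tell a harmless over-approximation from a real modelling bug.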

Options to actually close this

  1. Model SR-OS ecmp in Batfish. Batfish ECMP is effectively binary (1 vs
    infinite). If Batfish had a VI knob that limited an IGP to a single best
    path (with a deterministic tiebreak), SR-OS ecmp 1 could convert to it and
    device and Batfish would agree. Needs Batfish-side support (companion issue
    filed in batfish/batfish: IGP ECMP limit).
  2. Make ECMP labs deterministic. Per the lab-design guidance
    (infra/README.md "Determinism"), give the underlay a single genuine best
    path (asymmetric IGP metrics) so device and Batfish agree at one path with no
    tolerance needed.
  3. Validate the device next-hop is a subset of Batfish's legs as an explicit,
    opt-in comparison mode (not the default strict matcher), if we decide ECMP
    over-approximation is acceptable for some labs.
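
For option 2, the effect of asymmetric metrics can be sketched with a small shortest-path computation. This is an illustration only: `ecmp_first_hops`, the four-node topology, and the metric values are all hypothetical, and the code assumes strictly positive link metrics.

```python
import heapq


def ecmp_first_hops(graph, src, dst):
    """Dijkstra that tracks the first hops of all equal-cost shortest paths.

    graph maps node -> {neighbor: metric}; metrics must be positive.
    Returns (cost, set_of_first_hops), or (None, set()) if dst is unreachable.
    """
    dist = {src: 0}
    hops = {src: set()}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, metric in graph.get(u, {}).items():
            nd = d + metric
            # A neighbor of src is its own first hop; otherwise inherit from u.
            first = {v} if u == src else hops[u]
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                hops[v] = set(first)
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v]:
                hops[v] |= first  # equal-cost tie: keep both legs
    return dist.get(dst), hops.get(dst, set())


# Symmetric underlay: two equal-cost paths, so an all-ECMP model keeps both.
symmetric = {"src": {"a": 10, "b": 10}, "a": {"dst": 10}, "b": {"dst": 10}}

# Asymmetric underlay: one link costs more, leaving a single genuine best path.
asymmetric = {"src": {"a": 10, "b": 11}, "a": {"dst": 10}, "b": {"dst": 10}}

sym_result = ecmp_first_hops(symmetric, "src", "dst")
asym_result = ecmp_first_hops(asymmetric, "src", "dst")
```

With the symmetric metrics the destination has two first hops; with the asymmetric metrics only one remains, so a single-path device and an all-ECMP model would install the same next-hop without any matcher tolerance.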
