kernel: backport RDMA/core route-entry loopback detection#17511

Open
AZaugg wants to merge 1 commit into microsoft:3.0 from AZaugg:kernel-backport-rdma-core-loopback

AZaugg commented May 27, 2026

Merge Checklist

All boxes should be checked before merging the PR (just tick any boxes which don't apply to this PR)

  • The toolchain has been rebuilt successfully (or no changes were made to it)
  • The toolchain/worker package manifests are up-to-date
  • Any updated packages successfully build (or no packages were changed)
  • Packages depending on static components modified in this PR (Golang, *-static subpackages, etc.) have had their Release tag incremented.
  • Package tests (%check section) have been verified with RUN_CHECK=y for existing SPEC files, or added to new SPEC files
  • All package sources are available
  • cgmanifest files are up-to-date and sorted (./cgmanifest.json, ./toolkit/scripts/toolchain/cgmanifest.json, .github/workflows/cgmanifest.json)
  • LICENSE-MAP files are up-to-date (./LICENSES-AND-NOTICES/SPECS/data/licenses.json, ./LICENSES-AND-NOTICES/SPECS/LICENSES-MAP.md, ./LICENSES-AND-NOTICES/SPECS/LICENSE-EXCEPTIONS.PHOTON)
  • All source files have up-to-date hashes in the *.signatures.json files
  • sudo make go-tidy-all and sudo make go-test-coverage pass
  • Documentation has been updated to match any changes to the build system
  • Ready to merge

Summary

Backport upstream commit c31e4038c97f ("RDMA/core: Use route entry flag to decide on loopback traffic") to the 6.6.139.1 AZL3 kernel.

In multi-NIC RoCE setups where the source and destination IPs live on netdevs enslaved to a VRF, addr_resolve() picks the VRF as the next-hop device. Because the VRF is not IFF_LOOPBACK, the existing loopback detection misses the local route and rdma_set_src_addr_rcu() leaves the RDMA destination MAC pointing at the VRF netdevice, causing ib_write_bw to time out. The upstream fix replaces the IFF_LOOPBACK check with a route-type (RTN_LOCAL / RTF_LOCAL) check so loopback is detected correctly with or without VRF.
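The difference between the two detection strategies can be illustrated with a small userspace model. This is only a sketch, not the kernel code: `struct route_sketch`, both helper functions, and the `SKETCH_*` constants are hypothetical names introduced here, and the model covers only the IPv4-style route-type comparison (the IPv6 side of the upstream fix checks a route flag instead).

```c
#include <stdbool.h>

/* Illustrative constants for this sketch only; they stand in for the
 * kernel's IFF_LOOPBACK device flag and RTN_LOCAL route type. */
#define SKETCH_IFF_LOOPBACK 0x8u
#define SKETCH_RTN_LOCAL    2u

/* Hypothetical model of a resolved route: the flags of the next-hop
 * netdev and the type of the route entry itself. */
struct route_sketch {
    unsigned int dev_flags; /* flags of the device chosen as next hop */
    unsigned int rt_type;   /* type of the resolved route entry */
};

/* Old-style check: only looks at the next-hop device. A VRF device is
 * not a loopback device, so a local route through a VRF is missed. */
static bool is_loopback_by_dev_flag(const struct route_sketch *r)
{
    return (r->dev_flags & SKETCH_IFF_LOOPBACK) != 0;
}

/* New-style check: asks whether the route itself is local, which does
 * not depend on which device was picked as the next hop. */
static bool is_loopback_by_route_type(const struct route_sketch *r)
{
    return r->rt_type == SKETCH_RTN_LOCAL;
}
```

In this model, a local route whose next-hop device is a VRF has `dev_flags` without the loopback bit but `rt_type` equal to `SKETCH_RTN_LOCAL`, so only the route-type check classifies it as loopback.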

Adjustments for 6.6:

  • dst_rtable() is not present in v6.6 yet; replaced with a direct (const struct rtable *) cast (struct dst_entry is the first field of struct rtable, so it is equivalent to the upstream container_of).
  • Verified that the patch applies cleanly to both v6.6.121 and v6.6.139 (drivers/infiniband/core/addr.c is byte-identical between them).
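The equivalence claimed for the cast can be checked in isolation with stand-in types. The sketch below does not use the kernel's definitions: `struct dst_entry_sketch`, `struct rtable_sketch`, and both helpers are hypothetical, and the only property modelled is that the embedded entry is the first member of the containing struct.

```c
#include <assert.h> /* static_assert */
#include <stddef.h> /* offsetof */

/* Hypothetical stand-ins, NOT the kernel's struct dst_entry / rtable. */
struct dst_entry_sketch {
    int placeholder;
};

struct rtable_sketch {
    struct dst_entry_sketch dst; /* must remain the first member */
    unsigned char rt_type;
};

/* The direct cast is only valid while the embedded member sits at
 * offset 0; fail the build if the layout ever changes. */
static_assert(offsetof(struct rtable_sketch, dst) == 0,
              "dst must be the first member for the cast to be valid");

/* Backport-style conversion: a plain pointer cast. */
static const struct rtable_sketch *
rtable_via_cast(const struct dst_entry_sketch *dst)
{
    return (const struct rtable_sketch *)dst;
}

/* container_of-style conversion: subtract the member offset. With the
 * member at offset 0 this yields the same address as the cast. */
static const struct rtable_sketch *
rtable_via_container_of(const struct dst_entry_sketch *dst)
{
    return (const struct rtable_sketch *)
        ((const char *)dst - offsetof(struct rtable_sketch, dst));
}
```

Both helpers return the address of the enclosing struct when given a pointer to its `dst` member. The compile-time assertion records the layout assumption that a bare cast otherwise leaves implicit.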

Spec/manifest changes (Release 1 -> 2, matching the BBR3 / IKCONFIG_PROC companion-bump pattern):

  • SPECS/kernel/kernel.spec: add Patch1, bump Release
  • SPECS/kernel/kernel-uki.spec: bump Release to match kernel
  • SPECS/kernel-64k/kernel-64k.spec: bump Release to match kernel
  • SPECS/kernel-headers/kernel-headers.spec: bump Release to match kernel
  • SPECS-EXTENDED/kernel-ipe/kernel-ipe.spec: bump Release to match kernel
  • SPECS-SIGNED/kernel-signed/kernel-signed.spec: bump Release to match kernel
  • SPECS-SIGNED/kernel-64k-signed/kernel-64k-signed.spec: bump Release to match kernel
  • SPECS-SIGNED/kernel-uki-signed/kernel-uki-signed.spec: bump Release to match kernel
  • toolkit/resources/manifests/package/{pkggen_core,toolchain}_{x86_64,aarch64}.txt: bump kernel-headers / kernel-cross-headers to -2

Change Log

  • Change
  • Change
  • Change

Does this affect the toolchain?

YES

Associated issues

  • #xxxx

Links to CVEs

Test Methodology

  • Pipeline build id: xxxx

AZaugg requested review from a team as code owners May 27, 2026 21:13

microsoft-github-policy-service (bot) added the Packaging, specs-extended (PR to fix SPECS-EXTENDED), and 3.0 (Issues and PRs for Azure Linux 3.0) labels May 27, 2026