feat(mockibsysfs): ibping support via synthetic umad emulation #366

Closed
giuliocalzo wants to merge 3 commits into NVIDIA:main from giuliocalzo:feat/nvml-mock-ibping

@giuliocalzo

Contributor

Summary

  • Add umad_mock.c to libibmocksys.so so ibping works on nvml-mock nodes without real ib_umad hardware (vendor OpenIB ping MAD class 0x32 only).
  • Sysfs mocking (ibstat, ibstatus, iblinkinfo) is unchanged; synthetic umad fds handle ioctl / read / write / poll.
  • Same-LID pings are answered in-process; cross-HCA / server-client traffic uses a global file bus at $MOCK_IB_ROOT/umad-bus/{in,out}/ with TID-matched replies.
  • Align default MOCK_IB_ROOT with the chart (/var/lib/nvml-mock/ib), implement REGISTER_AGENT2, and fix mixed-fd poll() delegation to libc.
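
The TID-matched file bus described above could be sketched roughly as follows. This is only an illustration, not the code in umad_mock.c: the function names, the hex file naming, and the `.mad` suffix are hypothetical, and the sketch works on a raw MAD buffer (without the `struct ib_user_mad` prefix that umad reads and writes carry). The one thing taken from the InfiniBand MAD format is that the 64-bit TransactionID sits at byte offset 8 of the common header, in network byte order.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Offset and size of TransactionID in the common MAD header. */
#define MAD_TID_OFFSET 8
#define MAD_TID_LEN    8

/* Read the big-endian transaction ID from a raw MAD buffer. */
uint64_t mad_tid(const uint8_t *mad)
{
    uint64_t tid = 0;
    for (int i = 0; i < MAD_TID_LEN; i++)
        tid = (tid << 8) | mad[MAD_TID_OFFSET + i];
    return tid;
}

/* A reply belongs to a request when both carry the same TID bytes. */
int tid_matches(const uint8_t *req, const uint8_t *reply)
{
    return memcmp(req + MAD_TID_OFFSET, reply + MAD_TID_OFFSET,
                  MAD_TID_LEN) == 0;
}

/* Bus root: MOCK_IB_ROOT if set, else the default named in the PR. */
const char *bus_root(void)
{
    const char *r = getenv("MOCK_IB_ROOT");
    return (r && *r) ? r : "/var/lib/nvml-mock/ib";
}

/*
 * Build "<root>/umad-bus/<dir>/<tid>.mad", where dir is "in" or "out".
 * The file name scheme is hypothetical. Returns 0 on success and -1 if
 * the path does not fit in the buffer.
 */
int bus_path(char *out, size_t len, const char *root, const char *dir,
             uint64_t tid)
{
    int n = snprintf(out, len, "%s/umad-bus/%s/%016" PRIx64 ".mad",
                     root, dir, tid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Under these assumptions, a request whose TID bytes are `01 00 00 00 00 00 00 2a` would be looked up as `<root>/umad-bus/in/010000000000002a.mad`. Keying the file name on the TID means a waiting client can find its reply without scanning the payload of every file on the bus.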

Test plan

  • make -C pkg/network/mockibsysfs and go test ./pkg/network/mockibsysfs/...
  • go test -tags=integration ./pkg/network/mockibsysfs/render/... (Linux + infiniband-diags)
  • Manual: docker build -f deployments/nvml-mock/Dockerfile — self-ping and cross-HCA (mlx5_0 → LID 2, server on mlx5_1)
  • CI: mockib-integration job
  • CI: nvml-mock-e2e validate-ibping.sh on IB-enabled profiles

Notes

  • Scope: ibping / vendor ping MADs only — not full libibumad, RDMA, or real InfiniBand.
  • Bump libibmocksys.so to 1.2.0 (rebuild nvml-mock image for Kind/e2e).
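
The scope restriction in the notes above amounts to a class check on each MAD. A minimal sketch, assuming a raw MAD buffer and a hypothetical helper name (`is_ping_mad` is not taken from the PR): the MgmtClass field is the second byte of the 24-byte common MAD header, and the PR names 0x32 as the vendor OpenIB ping class.

```c
#include <stddef.h>
#include <stdint.h>

#define MAD_HEADER_LEN    24    /* size of the common MAD header */
#define MAD_CLASS_OFFSET  1     /* MgmtClass byte within that header */
#define OPENIB_PING_CLASS 0x32  /* vendor ping class named in the PR */

/*
 * Return 1 if the buffer holds at least a full MAD header whose
 * management class is the OpenIB ping class, 0 otherwise.
 */
int is_ping_mad(const uint8_t *mad, size_t len)
{
    return len >= MAD_HEADER_LEN &&
           mad[MAD_CLASS_OFFSET] == OPENIB_PING_CLASS;
}
```

A mock that only emulates ibping could use a check like this to reject or ignore any other management class rather than pretending to answer it.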

@giuliocalzo force-pushed the feat/nvml-mock-ibping branch from 0913c6f to b476af0 on May 26, 2026 07:41
Extend libibmocksys.so with umad ioctl/read/write/poll so ibping works
without ib_umad hardware. Same-LID pings loop in-process; cross-HCA
traffic uses a global umad-bus keyed by MAD TID. Adds e2e validation,
Linux integration test, and CI mockib-integration job.

Signed-off-by: Giulio Calzolari <gcalzolari@nvidia.com>
@giuliocalzo force-pushed the feat/nvml-mock-ibping branch from b476af0 to ec0790c on May 26, 2026 07:41
Signed-off-by: Giulio Calzolari <gcalzolari@nvidia.com>
Resolve CHANGELOG and Helm README conflicts with upstream gb300, PCIe
sysfs, and library padding entries.

Signed-off-by: Giulio Calzolari <gcalzolari@nvidia.com>