KAR-10: High-Performance Pod-to-Pod Communication #44

k8s-ci-robot merged 3 commits into kubernetes-sigs:main
Conversation
Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>
> - Deploy a pod to a node with high-performance network hardware and confirm that the pod's network namespace contains the expected additional network interface(s).
> - Query the cluster for published network resource characteristics and validate that they accurately describe the available high-performance network capabilities.
> - Deploy a workload requesting a specific network capability and verify it is scheduled on an appropriate node.
> - Deploy two pods with access to high-performance network interfaces and verify successful pod-to-pod data transfer over those interfaces.
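For illustration, one way a test could attach and verify a secondary interface is a Multus-style NetworkAttachmentDefinition. This is a hedged sketch: the names (`fast-net`, `eth1`, `net1`), the macvlan config, and the assumption that Multus CNI is installed are all illustrative, not part of the KAR:

```yaml
# Hypothetical NetworkAttachmentDefinition; assumes Multus CNI is installed.
# The macvlan config, master device, and names are illustrative only.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: fast-net                  # hypothetical name
spec:
  config: '{
    "cniVersion": "0.3.1",
    "type": "macvlan",
    "master": "eth1",
    "ipam": { "type": "host-local", "subnet": "192.168.100.0/24" }
  }'
---
apiVersion: v1
kind: Pod
metadata:
  name: net-test
  annotations:
    k8s.v1.cni.cncf.io/networks: fast-net   # requests the secondary interface
spec:
  containers:
  - name: check
    image: busybox
    # The test can then assert that a second interface (conventionally net1)
    # exists in the pod's network namespace.
    command: ["sh", "-c", "ip link show net1 && sleep 3600"]
```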
Just for clarification: do we want to consider a successful transfer as acceptance, or do we want to see certain throughput measures too? IMHO it's fine to just test that it works and trust vendors on actual performance. Setting thresholds for when we would consider something high-performance feels like too much, since we would need to consider typical values for current hardware implementations and differentiate same-node vs node-to-node solutions.
> IMHO it's fine to just test that it works and trust vendors on actual performance
+1
> Deploy two pods with access to high-performance network interfaces and verify successful pod-to-pod data transfer over those interfaces.
This only talks about testing the transfer works, no mention of testing performance. I think we are good here?
> If high performance pod-to-pod communication is needed, then provide well-defined mechanisms for these specialized network resources to be managed and exposed such that their characteristics should be discoverable to enable informed scheduling or workload configuration and to enable pods to attach to multiple network interfaces.
>
> Forward-looking: Once the network resource supports DRA, then the platform should use the DRA mechanism.
We have DRANET today, so it's not just forward looking.
Also, as @aojea highlighted in #10 (comment), the upstream networking community has already decided to "use DRA for anything multi network so we can standardize the ecosystem using common APIs". We should make standardizing on DRA the primary recommendation now, rather than an eventual goal.
This "Forward-looking" section mirrors other KARs that describe DRA as a forward-looking mechanism. That framing came from user feedback: many vendors are still catching up and may not have a supported DRA implementation yet. e.g. https://github.com/kubernetes-sigs/wg-ai-conformance/blob/f8773d3f2ffed4aa23442df8413f76c642412e8b/kars/0003-gpu-sharing/README.md?plain=1#L7
SHOULD itself signals the direction, so unlike in MUSTs we don't need to say "forward-looking" here if it's available today. We will only graduate it to MUST after it has met the criteria, so vendors will still have time to catch up.
Both https://github.com/kubernetes-sigs/wg-ai-conformance/blob/f8773d3f2ffed4aa23442df8413f76c642412e8b/kars/0003-gpu-sharing/README.md?plain=1#L7 and https://github.com/kubernetes-sigs/wg-ai-conformance/blob/f8773d3f2ffed4aa23442df8413f76c642412e8b/kars/0004-virtualized-accelerators/README.md?plain=1#L7 are SHOULDs. Similar to the concerns raised in those, not all secondary network interfaces have been integrated with DRANET yet, right?
It is true that DRANET is agnostic to network interfaces, so all network interfaces can work with DRANET.
But there are InfiniBand interfaces that are presented as Linux devices (not as network interfaces) and are also used for RDMA; those are not supported yet by DRANET.
There was a PR to implement it (google/dranet#151), but it didn't merge because it could not be tested. If I get access to the hardware I can add support very quickly.
Thanks @aojea for the clarification! This is helpful. It confirms that DRANET is broadly applicable today, with the specific exception of InfiniBand interfaces exposed as Linux devices rather than network interfaces, which is being worked on. Given this, I've updated this KAR to make DRA the primary recommendation and remove the "forward-looking" framing.
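To make the DRA-first recommendation concrete, a workload request might look roughly like the following. This is a hedged sketch: the API version reflects the DRA structured-parameters API as of recent Kubernetes releases, and the names (`rdma-nic-template`, `dranet.example.com`) are hypothetical, not DRANET's actual DeviceClass:

```yaml
# Hedged sketch of requesting a network device via DRA.
# API group/version and the DeviceClass name are assumptions.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  name: rdma-nic-template          # hypothetical name
spec:
  spec:
    devices:
      requests:
      - name: nic
        deviceClassName: dranet.example.com   # hypothetical class name
---
apiVersion: v1
kind: Pod
metadata:
  name: dra-net-pod
spec:
  resourceClaims:
  - name: nic
    resourceClaimTemplateName: rdma-nic-template
  containers:
  - name: app
    image: busybox
    resources:
      claims:
      - name: nic                  # binds the claim to this container
```

The point of the sketch is only the shape of the request: the scheduler places the pod where a matching device exists, rather than the pod relying on node labels or annotations.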
> Automated tests should verify the outcomes above:
>
> - Deploy a pod to a node with high-performance network hardware and confirm that the pod's network namespace contains the expected additional network interface(s).
I'm wondering if we want to lean on the "high performance" aspect or the "additional interface" aspect of these connections. "High performance" is tricky because 10Gbps might be high performance in an edge cluster, but would be unacceptably slow in a training cluster. I think the key functionality we want to describe is the idea that some clusters can offer more bandwidth / lower latency / less jitter etc to workloads that request it. The way that is exposed to the pod is not by configuring the primary interface, but by configuring a secondary interface in the pod.
I agree that the goal is higher performance today, but the objective difference is that there's a second interface which has different properties from the "pod network". (At least IIUC).
As to whether we want to "bake in" the complexity of requiring a second interface... I don't know.
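As a sketch of how a test could lean on the "additional interface" aspect rather than any performance threshold, a conformance helper might parse the pod's interface list and assert that something exists beyond the default pod network. The helper name, the sample output, and the assumption that `lo`/`eth0` are the defaults are all illustrative:

```python
# Hypothetical helper a conformance test might use: given the output of
# `ip -o link` captured inside a pod, return interfaces beyond the defaults.
# The default-interface set and sample data are illustrative assumptions.

DEFAULT_IFACES = {"lo", "eth0"}  # loopback + primary pod network

def secondary_interfaces(ip_link_output: str) -> list[str]:
    ifaces = []
    for line in ip_link_output.splitlines():
        # `ip -o link` lines look like: "2: eth0@if12: <BROADCAST,...> ..."
        parts = line.split(": ", 2)
        if len(parts) < 2:
            continue
        name = parts[1].split("@")[0]  # strip the "@ifN" peer suffix
        if name not in DEFAULT_IFACES:
            ifaces.append(name)
    return ifaces

sample = (
    "1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 ...\n"
    "2: eth0@if12: <BROADCAST,MULTICAST,UP> mtu 1500 ...\n"
    "3: net1@if13: <BROADCAST,MULTICAST,UP> mtu 9000 ...\n"
)
print(secondary_interfaces(sample))  # → ['net1']
```

A test built this way passes or fails on the objective property (a second interface with different properties is present) and stays silent on what counts as "high performance".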
I was thinking in a similar direction with the comment above. IMO defining "high performance" will be quite impractical, as it boils down to "it depends".
Somewhat related: we also discussed whether we should lean mainly on the "use DRA for this" aspect, as Janet mentioned above, which would help us avoid "baking it in", right?
> "High performance" is tricky

+1

> defining "high performance" will be quite impractical

+1
Updated a few places: s/high performance/secondary network interface. PTAL
Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>
> Validate the following observable outcomes:
>
> 1. **Multiple network interfaces are available to pods:** A pod scheduled on a node with multiple network interfaces has access to secondary network interfaces beyond the default pod network.
> 2. **Network resource characteristics are discoverable:** The characteristics of available secondary network interfaces (e.g., interface type, bandwidth, RDMA capability) are published and queryable within the cluster, enabling workloads and schedulers to make informed decisions.
In addition to bandwidth and capabilities, one of the critical parts of optimal performance is being able to expose attributes like the PCI bus and NUMA node. I think those are important attributes that should be exposed.
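In DRA terms, topology attributes like these could be published by the driver alongside the device. A hedged sketch of what that might look like in a ResourceSlice follows; the attribute names (`numaNode`, `pciBusID`, `rdma`), driver name, and values are hypothetical, not DRANET's actual schema:

```yaml
# Hedged sketch of a DRA driver publishing topology attributes.
# Attribute names and the driver name are assumptions.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceSlice
metadata:
  name: node-a-nics                # hypothetical name
spec:
  nodeName: node-a
  driver: dranet.example.com       # hypothetical driver name
  pool:
    name: node-a
    generation: 1
    resourceSliceCount: 1
  devices:
  - name: rdma0
    basic:
      attributes:
        numaNode:
          int: 1                   # NUMA locality for alignment with GPUs/CPUs
        pciBusID:
          string: "0000:3b:00.0"   # PCI topology for same-root-complex placement
        rdma:
          bool: true
```

Publishing NUMA and PCI topology this way would let schedulers or device selectors align the NIC with co-located accelerators, which is where much of the real performance difference comes from.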
Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>
+1 to /approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: dims, janetkuo, ritazh
/unhold
💘