Commit 9181333

Merge pull request #5697 from Azure/add-dranet-blog
Add dranet blog
2 parents 8a094df + 9d779af commit 9181333

9 files changed

Lines changed: 381 additions & 2 deletions

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+---
+config:
+  theme: base
+  themeVariables:
+    primaryColor: "#9f62eb"
+---
+xychart-beta
+  title "NCCL all_reduce_perf — Avg Bus Bandwidth (GB/s)"
+  x-axis ["1nic-unaligned (cross-NUMA)", "1nic-aligned (same NUMA)", "2nic-aligned (same NUMA)"]
+  y-axis "Avg busbw (GB/s)" 0 --> 120
+  bar [25, 56, 112]
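
Review note: the relative gains implied by the chart can be checked with a few lines of Python. This is a minimal sketch that only reuses the three values from the `bar` series and the labels from the `x-axis` above; the `speedup` helper and the variable names are illustrative and not part of the commit.

```python
# Avg bus bandwidth in GB/s, copied from the chart's `bar` series.
busbw_gbps = {
    "1nic-unaligned (cross-NUMA)": 25,
    "1nic-aligned (same NUMA)": 56,
    "2nic-aligned (same NUMA)": 112,
}


def speedup(config: str, baseline: str) -> float:
    """Return how many times faster `config` is than `baseline`."""
    return busbw_gbps[config] / busbw_gbps[baseline]


baseline = "1nic-unaligned (cross-NUMA)"
for name in busbw_gbps:
    # Print each configuration's bandwidth relative to the unaligned case.
    print(f"{name}: {speedup(name, baseline):.2f}x")
```

With these inputs, NUMA alignment alone accounts for roughly a 2.2x gain on one NIC, and adding a second aligned NIC doubles that again.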
Lines changed: 1 addition & 0 deletions
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+flowchart TB
+    subgraph User["Workload Author"]
+        RCT["ResourceClaimTemplate<br/>(CEL selectors)"]
+        PodSpec["Pod Spec<br/>(resourceClaims reference)"]
+    end
+
+    subgraph CP["Kubernetes Control Plane"]
+        API["API Server<br/>(DRA API group)"]
+        Sched["Scheduler<br/>(Topology-aware)"]
+        RS_GPU["ResourceSlice<br/>(gpu.nvidia.com)<br/>pciBusID, NUMA, pcieRoot"]
+        RS_NIC["ResourceSlice<br/>(dra.net)<br/>rdmaDevice, NUMA, pciAddress"]
+    end
+
+    subgraph Node["Kubernetes Node (Azure ND GB300-v6)"]
+        NVDRV["NVIDIA GPU DRA Driver<br/>(DaemonSet)"]
+        DRANETDRV["DRANET DRA Driver<br/>(DaemonSet)"]
+    end
+
+    %% User submits workload
+    PodSpec -->|"Submit pod with<br/>resource claims"| API
+    RCT -->|"Define GPU+NIC<br/>alignment constraints"| API
+
+    %% Drivers publish device topology
+    NVDRV -->|"Discover GPUs &<br/>publish topology"| RS_GPU
+    DRANETDRV -->|"Discover NICs &<br/>publish topology"| RS_NIC
+
+    %% Scheduler uses slices to allocate
+    RS_GPU --> Sched
+    RS_NIC --> Sched
+    Sched -->|"Evaluate CEL selectors"| API
+    API -->|"Bind pod to node<br/>with allocated devices"| Node
+
+    %% Styling
+    style User fill:#fef7e0,stroke:#fbbc04
+    style CP fill:#e8f0fe,stroke:#4285f4
+    style Node fill:#f3e8fd,stroke:#9f62eb
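
Review note: the alignment constraint in this diagram (pair a GPU with a NIC that reports the same NUMA node) can be modelled as a simple filter. The sketch below is a toy model only. In the flow the diagram describes, the constraint lives in CEL selectors on the ResourceClaimTemplate and is evaluated by the scheduler. The `Device` class, the `aligned_pairs` function, and the sample inventory are hypothetical names introduced for illustration. Only the two driver names are taken from the diagram.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for one device entry published in a ResourceSlice."""
    name: str
    driver: str
    numa_node: int


def aligned_pairs(gpus, nics):
    """Return (gpu name, nic name) pairs whose devices share a NUMA node."""
    return [
        (gpu.name, nic.name)
        for gpu, nic in product(gpus, nics)
        if gpu.numa_node == nic.numa_node
    ]


# Made-up inventory: one GPU and one NIC on each of two NUMA nodes.
gpus = [Device("gpu0", "gpu.nvidia.com", 0), Device("gpu1", "gpu.nvidia.com", 1)]
nics = [Device("nic0", "dra.net", 0), Device("nic1", "dra.net", 1)]

print(aligned_pairs(gpus, nics))
```

A cross-NUMA pairing such as `gpu0` with `nic1` is filtered out, which corresponds to the "unaligned" case in the bandwidth chart.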

website/blog/2026-04-01-dranet-rdma-optimization-for-ai-on-aks/control-plane-diagram.svg

Lines changed: 1 addition & 0 deletions
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+flowchart TB
+    Kubelet["Kubelet"]
+    CRI["containerd"]
+    NRI["NRI Plugin<br/>(DRANET)"]
+
+    subgraph NUMA0["NUMA Node 0"]
+        GPU0["GPU 0<br/>NVIDIA GB300"]
+        GPU1["GPU 1<br/>NVIDIA GB300"]
+        NIC0["NIC 0<br/>NVIDIA ConnectX-8"]
+        NIC1["NIC 1<br/>NVIDIA ConnectX-8"]
+    end
+
+    subgraph NUMA1["NUMA Node 1"]
+        GPU2["GPU 2<br/>NVIDIA GB300"]
+        GPU3["GPU 3<br/>NVIDIA GB300"]
+        NIC2["NIC 2<br/>NVIDIA ConnectX-8"]
+        NIC3["NIC 3<br/>NVIDIA ConnectX-8"]
+    end
+
+    subgraph Pod["Scheduled Pod"]
+        Container["Container<br/>/dev/infiniband/uverbs*"]
+    end
+
+    %% Runtime flow
+    Kubelet -->|"1. Receive device allocation<br/>result from API Server"| CRI
+    CRI -->|"2. Execute OCI CreateContainer<br/>hook"| NRI
+    NRI -->|"3. Inject allocated<br/>/dev/infiniband/* devices"| Pod
+
+    %% NUMA-aligned GDR paths
+    GPU0 <-.->|"PCIe · GDR ✓"| NIC0
+    GPU1 <-.->|"PCIe · GDR ✓"| NIC1
+    GPU2 <-.->|"PCIe · GDR ✓"| NIC2
+    GPU3 <-.->|"PCIe · GDR ✓"| NIC3
+
+    %% Cross-NUMA penalty
+    GPU0 <-.->|"QPI/UPI · No GDR ✗"| NIC3
+
+    %% Pod uses aligned devices
+    Container -.->|"4. NCCL uses<br/>GPU * + mlx5_*"| GPU0
+
+    %% Styling
+    style NUMA0 fill:#e6f4ea,stroke:#34a853
+    style NUMA1 fill:#fce8e6,stroke:#ea4335
+    style Pod fill:#fef7e0,stroke:#fbbc04
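
Review note: one way to sanity-check the NUMA placement drawn here is to read the NUMA node that Linux reports for each RDMA device. The sketch below assumes the standard sysfs layout, where `/sys/class/infiniband/<device>/device/numa_node` holds the node index (or `-1` when the platform reports none). The function name `rdma_numa_nodes` is hypothetical and nothing here is taken from the DRANET code.

```python
from __future__ import annotations

from pathlib import Path


def rdma_numa_nodes(sysfs_root: str = "/sys/class/infiniband") -> dict[str, int]:
    """Map each RDMA device name (e.g. mlx5_0) to its reported NUMA node."""
    root = Path(sysfs_root)
    nodes: dict[str, int] = {}
    if not root.is_dir():
        # No RDMA devices, or not a Linux host: return an empty mapping.
        return nodes
    for dev in sorted(root.iterdir()):
        numa_file = dev / "device" / "numa_node"
        try:
            nodes[dev.name] = int(numa_file.read_text().strip())
        except (OSError, ValueError):
            # Skip entries without a readable, numeric numa_node file.
            continue
    return nodes


for name, node in rdma_numa_nodes().items():
    print(f"{name}: NUMA node {node}")
```

Run inside the scheduled pod or on the node, the output can be compared against the GPU placement to confirm that the devices NCCL sees sit on the same NUMA node.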

website/blog/2026-04-01-dranet-rdma-optimization-for-ai-on-aks/data-plane-diagram.svg

Lines changed: 1 addition & 0 deletions