Releases: kubernetes-sigs/dranet
v1.3.0
Changes by Kind
Feature
- Add Azure-specific rules and routes to be configured for the secondary NIC. (#156, @tamilmani1989)
- DRANET no longer injects child VFs attached to an excluded uplink into pods as network interfaces. (#176, @anson627)
Bug or Regression
- Address several critical edge cases in default interface detection (#153, @aojea)
- Fixed DRANET failing to publish any ResourceSlice on nodes where an interface had enough global-scope IP addresses for the joined dra.net/ipv4 or dra.net/ipv6 attribute to exceed DRA's 64-byte string attribute limit (commonly triggered by kube-ipvs0 on clusters running kube-proxy in IPVS mode). The oversized IP attribute is now omitted on the offending device; all other attributes and devices in the slice are unaffected. (#198, @dkennetzoracle)
- Fixed issue where RDMA device is not restored to the host network namespace in rdma netns=exclusive mode (#180, @kanlkan)
- PCI network devices whose kernel driver has been unbound or replaced with a userspace driver (vfio-pci, uio_pci_generic, igb_uio, pci-stub) are no longer published in the ResourceSlice. Previously such devices were published but every NodePrepareResources call for them failed, trapping pods in FailedPrepareDynamicResources. (#193, @wevans-ant)
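The two publishing fixes above (#198 and #193) can be illustrated with a minimal Go sketch. It is not DRANET's actual code: the function names ipAttribute and publishable, the comma separator, and the sample addresses and driver names in main are assumptions made for illustration. Only the 64-byte limit and the list of userspace drivers come from the notes above.

```go
package main

import (
	"fmt"
	"strings"
)

// maxAttrLen is the 64-byte string attribute limit named in the release note.
const maxAttrLen = 64

// ipAttribute joins addresses into a single attribute value and reports
// whether the result fits the limit. When it does not fit, the caller is
// expected to omit only this attribute and keep the rest of the device.
// The comma separator is an assumption.
func ipAttribute(addrs []string) (string, bool) {
	joined := strings.Join(addrs, ",")
	if len(joined) > maxAttrLen {
		return "", false
	}
	return joined, true
}

// userspaceDrivers lists the drivers named in the release note.
var userspaceDrivers = map[string]bool{
	"vfio-pci":        true,
	"uio_pci_generic": true,
	"igb_uio":         true,
	"pci-stub":        true,
}

// publishable reports whether a PCI network device should be published.
// An empty driver name stands for "no kernel driver bound" (an assumption
// about how the unbound state is represented).
func publishable(driver string) bool {
	return driver != "" && !userspaceDrivers[driver]
}

func main() {
	few := []string{"10.0.0.1", "10.0.0.2"}
	many := make([]string, 10)
	for i := range many {
		many[i] = fmt.Sprintf("10.96.0.%d", i+10)
	}
	for _, addrs := range [][]string{few, many} {
		if v, ok := ipAttribute(addrs); ok {
			fmt.Println("publish attribute:", v)
		} else {
			fmt.Println("omit attribute for", len(addrs), "addresses")
		}
	}
	for _, d := range []string{"mlx5_core", "vfio-pci", ""} {
		fmt.Printf("driver %q publishable: %v\n", d, publishable(d))
	}
}
```

Dropping only the oversized attribute, rather than rejecting the whole slice, keeps every other device on the node allocatable, which matches the behavior described in #198.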
All changes
- fix: Remove 'dranet' from registry name to avoid duplication by @gauravkghildiyal in #154
- fix: Install npm during netlify build by @gauravkghildiyal in #155
- accurately detect default gateways by @aojea in #153
- helm: default image tag to Chart.AppVersion by @fmuyassarov in #160
- Add OKE GB200 examples: DRA NIC allocation, MNNVL, placement-group by @dkennetzoracle in #157
- Remove obsolete github pages action by @gauravkghildiyal in #171
- docs: fix dead link in with .md extension by @ngcxy in #173
- prevent uplink child virtual function injected into pods by @anson627 in #176
- Azure: generate per-device network config from IMDS metadata by @tamilmani1989 in #156
- Bump golang version to 1.26 by @ngcxy in #184
- Fix postsubmit failure caused by Bats tests for metric server by @ngcxy in #189
- add GPU EFA example for AWS EKS by @anson627 in #182
- Add AKS AMD GPU example and consolidate Nvidia GPU example by @anson627 in #188
- Update release process doc and improve release image tagging by @gauravkghildiyal in #185
- Add tamilmani1989 as reviewer by @tamilmani1989 in #192
- Bug Fix: RDMA device is not restored to the host network namespace in rdma netns=exclusive mode by @kanlkan in #180
- Improve look for the docs and add dark mode by @gauravkghildiyal in #195
- fix: Skip PCI network devices whose kernel driver is unbound by @wevans-ant in #193
- Fix typos: priviledge, directoy, lenght in docs and Dockerfile by @SAY-5 in #197
- Resourceslice attr length overflow by @dkennetzoracle in #198
- Consolidate pod-level state into podConfigStore by @purvavj in #191
- chore: drop unnecessary helm installation during workflow by @fmuyassarov in #201
- ci: declare contents: read on bats, helm-lint, periodics, test by @arpitjain099 in #200
- Add PyTorch training example by @anson627 in #202
- Add NIXL kv cache transfer example by @anson627 in #203
New Contributors
- @ngcxy made their first contribution in #173
- @wevans-ant made their first contribution in #193
- @SAY-5 made their first contribution in #197
- @purvavj made their first contribution in #191
- @arpitjain099 made their first contribution in #200
Full Changelog: v1.2.0...v1.3.0
v1.2.0
Features
- 🚀 OKE (Oracle Cloud) Support: OKE (Oracle Cloud) provider for dranet (#116, #142, @dkennetzoracle)
- 🚀 AWS EFA Support: AWS cloud provider support to enable EFA (Elastic Fabric Adapter) device discovery and configuration for AWS EC2 instances with NVIDIA GPU and AWS Neuron accelerators that support EFA devices. (#139, @nakshah87)
- A cloud-provider-hint command line flag can be used to allow the administrator to specify what cloud they are running on. If specified it will skip automatic discovery, allowing for faster startup. (#118, @michaelasp)
- Add RBAC permissions for the resourceclaims/driver subresource with associated-node:patch and associated-node:update verbs, scoped to the "dra.net" driver name. Required for Kubernetes v1.36+ where the DRAResourceClaimGranularStatusAuthorization feature gate is beta and enabled by default. (#126, @praveen0raj)
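The cloud-provider-hint behavior from #118 can be pictured with a short Go sketch. This is not DRANET's implementation: parseHint, resolveProvider, the discover callback, and the value "azure" are hypothetical, and only the flag name and the "skip automatic discovery when a hint is given" behavior come from the note above.

```go
package main

import (
	"flag"
	"fmt"
)

// parseHint reads the cloud-provider-hint flag from args.
// An empty value means no hint was given.
func parseHint(args []string) (string, error) {
	fs := flag.NewFlagSet("dranet-sketch", flag.ContinueOnError)
	hint := fs.String("cloud-provider-hint", "", "cloud the node runs on; empty means auto-detect")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *hint, nil
}

// resolveProvider returns the hinted provider when one is set and only
// falls back to the (potentially slow) discovery function otherwise.
func resolveProvider(hint string, discover func() string) string {
	if hint != "" {
		return hint
	}
	return discover()
}

func main() {
	// "azure" is a placeholder value, not a confirmed accepted setting.
	hint, err := parseHint([]string{"--cloud-provider-hint=azure"})
	if err != nil {
		fmt.Println("flag error:", err)
		return
	}
	provider := resolveProvider(hint, func() string {
		// Stand-in for automatic discovery, e.g. probing metadata endpoints.
		return "auto-detected"
	})
	fmt.Println("provider:", provider)
}
```

Because discovery is passed in as a function, it is never invoked when a hint is present, which is where the faster startup described in the note would come from.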
Other Changes
- Add system-level dependencies installation step by @aojea in #114
- Changes to provide build and distro images to Dockerfile as arguments by @nakshah87 in #112
- Add OKE (Oracle Cloud) provider for dranet by @dkennetzoracle in #116
- chore: remove unused volume etc by @fmuyassarov in #119
- deployments: add Helm chart for DRANET by @fmuyassarov in #117
- chore: Add PR template and document release notes generation by @gauravkghildiyal in #127
- Provide the ability to hint the cloud provider that is expected to lower startup time. by @michaelasp in #118
- Add resourceclaims/driver RBAC for DRA granular status authorization by @praveen0raj in #126
- chore: expose daemon flags as optional chart values by @fmuyassarov in #124
- Add anson627 as reviewer by @anson627 in #129
- chore: add helm chart packaging & pushing to release process by @fmuyassarov in #130
- feat: bbolt persistent pod device configs by @rbtr in #115
- Changes to support EFA enabled AWS instances on dranet by @nakshah87 in #139
- docs: describe the release process by @fmuyassarov in #131
- Makefile: skip helm push when not on a git tag by @fmuyassarov in #141
- feat: Update oke topology to match customer tenancy data, add example by @dkennetzoracle in #142
- feat: Update helm chart to support affinity, nodeSelector and AWS hints by @erezzarum in #143
- fix: Use github actions for website deployment by @gauravkghildiyal in #150
- feat: Setup boilerplate netlify.toml for website by @gauravkghildiyal in #151
New Contributors
- @nakshah87 made their first contribution in #112
- @dkennetzoracle made their first contribution in #116
- @fmuyassarov made their first contribution in #119
- @praveen0raj made their first contribution in #126
- @erezzarum made their first contribution in #143
Full Changelog: v1.1.0...v1.2.0
v1.1.0
Highlights
- This release marks a significant milestone in our goal to expand hyperscaler support and drive adoption with multiple cloud providers. We are excited to announce support for Azure, including IB-only RDMA device support and an AKS GB300 configuration example.
- DRANET now supports VRF and symmetric routing for multihomed Pods. This allows for robust isolation of routing domains in scenarios where a single process utilizes multiple network interfaces.
What's Changed
- add kubernetes ci by @aojea in #10
- add check to fix duplicate IPv6 routes by @aman0408 in #18
- fix: Add buildx support by @tamilmani1989 in #21
- Fix incorrect makefile rule for release by @michaelasp in #25
- Update DRANET to go 1.25 by @MikeZappa87 in #31
- Fix 'kind-image' make target by @MikeZappa87 in #32
- Add mikezappa87 to reviewers in OWNERS file by @aojea in #33
- Dependency update improvements by @stmcginnis in #35
- fix: Add sysfs fallback for RDMA detection on InfiniBand interfaces by @anson627 in #9
- Limit test GitHub Action to main branch by @stmcginnis in #49
- Handle GKE client WithCredentialsFile deprecation by @stmcginnis in #48
- docs: migrate repository links from google to kubernetes-sigs org #46 by @mahmut-Abi in #47
- Add support for CRI-O by @kanlkan in #59
- refactor code to be cloud provider agnostic by @aojea in #75
- fix: update module paths to sigs.k8s.io/dranet by @rbtr in #87
- feat: Push multi-arch images by @gauravkghildiyal in #90
- Add IB-only RDMA device support and AKS GB300 example by @anson627 in #77
- Fallback to get rdma device from sysfs by @tamilmani1989 in #101
- add isSriovVf attribute by @kanlkan in #100
- add vrf support and explain pbr functionality by @aojea in #74
- Add Azure placement group support by @anson627 in #102
- feat: implement robust graceful shutdown logic as a mitigation for NRI plugins not executing during restart by @gauravkghildiyal in #91
- refactor: Introduce PodConfig to capture pod level settings by @gauravkghildiyal in #109
- Add --move-ib-interfaces flag to control IPoIB interface handling by @tamilmani1989 in #98
New Contributors
- @tamilmani1989 made their first contribution in #21
- @MikeZappa87 made their first contribution in #31
- @stmcginnis made their first contribution in #35
- @anson627 made their first contribution in #9
- @mahmut-Abi made their first contribution in #47
- @kanlkan made their first contribution in #59
- @rbtr made their first contribution in #87
Full Changelog: v1.0.1...v1.1.0