Support L3 connectivity on vlan sub-interfaces #229
rrajendran17 merged 1 commit into harvester:master
Conversation
Pull request overview
This pull request adds support for L3 connectivity on VLAN sub-interfaces by introducing a new HostNetworkConfig resource and related validation logic.
Changes:
- Introduces new HostNetworkConfig validator for managing L3 connectivity on VLAN sub-interfaces
- Adds validation checks to prevent deletion/updates when overlay VMs are using the cluster network
- Updates existing validators (VlanConfig, NAD) to include new HostNetworkConfig and VirtualMachine cache dependencies
- Adds utility functions for managing VLAN sub-interfaces and NAD getters
- Extensive generated code updates for client/controller integration
Reviewed changes
Copilot reviewed 25 out of 102 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| pkg/webhook/hostnetworkconfig/validator.go | New validator for HostNetworkConfig resource with create/update/delete validation |
| pkg/webhook/vlanconfig/validator.go | Added overlay VM checks and new cache dependencies |
| pkg/webhook/nad/validator.go | Added overlay VM checks and new cache dependencies |
| pkg/utils/nad.go | Added GetVlanID() and ListAllNads() utility functions |
| pkg/utils/bridge.go | Added functions for VLAN device management |
| pkg/network/iface/vlan.go | New VLAN sub-interface management functions |
| pkg/network/vlan/vlan.go | Added GetBridgelink() method |
| pkg/utils/fakeclients/hostnetworkconfig.go | Fake client for testing HostNetworkConfig |
| pkg/generated/** | Generated client and controller code for HostNetworkConfig |
Comments suppressed due to low confidence (1)
pkg/webhook/hostnetworkconfig/validator.go:1
- The logic appears inverted. The check should verify whether overlay VMs exist when the hostnetworkconfig has Underlay enabled, but the condition on line 362 checks if Underlay is true and then checks for overlay NADs. This seems backwards: if underlay is true, we should be checking for overlay dependencies. The condition should likely be !hostnetworkconfig.Spec.Underlay, or the comment on line 242 should be updated to clarify the intended logic.
return nil
}
return c.restClient
}
pkg/generated/clientset/versioned/typed/v1/client.go - This file was generated as _client.go, due to which the definitions were not recognized outside this file, so I had to rename it from _client.go to client.go.
Ideally, for all other existing generated files, pkg/generated/clientset/versioned/typed//v1/<component-name>_client.go is the format, but for some reason the latest path generated for Node with recent k8s.io is pkg/generated/clientset/versioned/typed/v1, due to which the client file started as _client.go (empty component name). Not sure if this is a bug in k8s.io client-go.
This pull request is now in conflict. Could you fix it @rrajendran17? 🙏
w13915984028 left a comment
Some questions, review is still ongoing, thanks.
This pull request is now in conflict. Could you fix it @rrajendran17? 🙏
Thanks @w13915984028 for the review. I have updated the PR addressing review comments.
Another topic which could be tracked in a new issue/PR: currently, if the user plans to change the MTU, all VMs must be stopped, and the new MTU is then propagated to the VMs when they start again. The L3 sub-interface & controller need to consider: (1) it inherits the current MTU from its parent cluster-network; (2) it detects this change and updates the new MTU on the managed sub-interfaces; (3) when the MTU is changed, should L3 workloads be stopped or not? esp. the underlay case, when e.g. thanks.
Thanks for bringing this up.
// check if any overlay vm exists for the cluster network used by the nad as underlay
func (v *Validator) checkOverlayVMsUsingClusterNetwork(nad *cniv1.NetworkAttachmentDefinition, vlanID int) error {
	clusterNetwork := utils.GetNadLabel(nad, utils.KeyClusterNetworkLabel)
From the context below, it looks like this input nad (being deleted) must be an IsVlanAccessMode NAD first? If that is true, please check it (or check vid == 0) first and return early; the check does not care about overlay-type NADs.
func (nc *NetConf) GetVlanID() int {
if nc.IsVlanAccessMode() {
return nc.Vlan
}
return 0
}
pkg/webhook/nad/validator.go (outdated)
if hostnetworkconfig.Spec.Underlay && int(hostnetworkconfig.Spec.VlanID) == vlanID {
	if err := v.checkifVMExistsForOverlayNADs(); err != nil {
		return fmt.Errorf("hostnetworkconfig %s uses overlay nads on cluster network %s, %w", hostnetworkconfig.Name, clusterNetwork, err)
hostnetworkconfig %s is underlay and is being used by overlay nad ...
}
//no more than one underlay exists for overlay networks. If user has to change underlay setting, old underlay has to be disabled first.
if hostnetworkconfig.Spec.Underlay && newhnc.Spec.Underlay {
It seems we need to add the below at L251, to avoid checking against itself in the update case:
if hostnetworkconfig.Name == newhnc.Name {
continue
}
//check if vlanconfig contains all the nodes in the cluster
for _, node := range nodes {
	if !matchedNodes.Contains(node.Name) {
skip if node.DeletionTimestamp != nil ?
This pull request is now in conflict. Could you fix it @rrajendran17? 🙏
@ibrokethecloud I have addressed the following comments from our last discussion. Please have a look. Thanks.
1. Use status as a subresource
Regarding 3, charts changes: harvester/charts#456
Note: I will create a separate task to handle node label updates, which cause removal of VLAN sub-interfaces on a node that is being used as underlay. We need node webhook validations to restrict this, and it must be handled in the harvester repo.
w13915984028 left a comment
Some last questions, thanks.
if intfExists {
	return h.removeHostNetworkInterface(hnc, true)
} else {
	return hnc, nil
removeHostNetworkInterface can fail on the last step, removeHostNetworkPerNodeStatus; when the reconciler runs again, vlan.GetVlan(hnc.Spec.ClusterNetwork) returns non-existing, so it has no chance to call removeHostNetworkPerNodeStatus again:
removeHostNetworkInterface:
v, err := vlan.GetVlan(hnc.Spec.ClusterNetwork)
...
h.removeHostNetworkPerNodeStatus
Consider optimizing it as:
if !matchNodeSet {
if intfExists {
return h.removeHostNetworkInterface(hnc, true)
}
// always ensure the status is cleaned
err := h.removeHostNetworkPerNodeStatus(hnc)
if err != nil {
return nil, err
}
return hnc, nil
}
| return "", fmt.Errorf("no matching IP found for node %s", nodeName) | ||
| } | ||
| func (h *Handler) removeHostNetworkPerNodeStatus(hnc *networkv1.HostNetworkConfig) error { | ||
| hnCopy := hnc.DeepCopy() |
Add the below before L286:
if hnc.Status.NodeStatus == nil || hnc.Status.NodeStatus[nodeName] == nil {
return nil
}
It will make removeHostNetworkPerNodeStatus idempotent, so it can be freely called from OnChange.
//update nodestatus when interface deleted due to node selector changes.
if onChange {
	return nil, h.removeHostNetworkPerNodeStatus(hnc)
Should "return nil, h.removeHostNetworkPerNodeStatus(hnc)" be:
err := h.removeHostNetworkPerNodeStatus(hnc)
if err != nil {
return nil, err
}
return hnc, nil
}
//vlan interface chosen as underlay must have vlanconfig spanning all nodes
if err := v.checkVCSpansAllNodes(newhnc.Spec.ClusterNetwork); err != nil {
Recalled one scenario: besides mgmt, all other cluster-networks can skip the witness node (the witness node normally only has the mgmt-related NIC).
cnClient ctlnetworkv1.ClusterNetworkClient
cnCache ctlnetworkv1.ClusterNetworkCache
cnController ctlnetworkv1.ClusterNetworkController
hostNetworkConfigClient ctlnetworkv1.HostNetworkConfigClient
client does not seem to be used anywhere
}
//reconcile hostnetworkconfig to stop DHCP lease managers associated with the removed uplink
if err := h.reconcileHostNetwork(vs.Status.ClusterNetwork); err != nil {
The only aim of reconcileHostNetwork seems to be to trigger a requeue of HostNetworkConfig when the vlan config changes.
We can move all this directly to the hostnetworkconfig controller in the agent and leverage the relatedresource handler from wrangler:
https://github.com/harvester/harvester/blob/master/pkg/controller/master/addon/addon.go#L61
This will ensure all changes for hostnetworkconfig exist in that controller only.
The problem is, we should make sure that the vlanconfig has removed the cluster-network uplink successfully before calling hostnetworkconfig to clean up the associated resources during removal; and during update/add of a vlanconfig, we should make sure that the cluster-network uplink is available before calling the hostnetworkconfig controller to create sub-interfaces on top of it.
Would it be better if we enqueue the hostnetworkconfig controller from vlanconfig in that case?
}
// node selector matches and host network interface already exists, skip processing
if intfExists {
Is it possible to have a scenario where the link is set up but the HostNetworkConfig status update fails? The object will be requeued, the interface will be found, and the controller will exit without actually updating the HostNetworkConfig status.
Yes, that scenario is possible. I will add code to update the hostnetworkconfig status even if the interface exists.
w13915984028 left a comment
LGTM, thanks for your big efforts.
This pull request is now in conflict. Could you fix it @rrajendran17? 🙏
oldhnc := oldObj.(*networkv1.HostNetworkConfig)
newhnc := newObj.(*networkv1.HostNetworkConfig)
// ignore the update if the resource is being deleted
nit: not needed as the Delete handler will get the deletion requests
pkg/webhook/vlanconfig/validator.go (outdated)
vmiCache ctlkubevirtv1.VirtualMachineInstanceCache,
cnCache ctlnetworkv1.ClusterNetworkCache) *Validator {
cnCache ctlnetworkv1.ClusterNetworkCache,
hncCache ctlnetworkv1.HostNetworkConfigCache,
hncCache and vmCache seem unused
ibrokethecloud left a comment
lgtm. thanks for the PR.
Signed-off-by: Renuka Devi Rajendran <renuka.rajendran@suse.com>
Problem:
harvester/harvester#8101
harvester/harvester#7834
Solution:
HEP : harvester/harvester#9335
https://github.com/rrajendran17/harvester-harvester/blob/HEP-ipconfig/enhancements/20251015-support-ipconfig-on-clusternetworks.md
https://github.com/rrajendran17/harvester-harvester/blob/HEP-ipconfig/enhancements/20251020-support-user-underlay-selection.md
Related Issue:
harvester/harvester#8101
harvester/harvester#7834
Test plan:
1. https://github.com/rrajendran17/harvester-harvester/blob/HEP-ipconfig/enhancements/20251015-support-ipconfig-on-clusternetworks.md#test-plan
2. https://github.com/rrajendran17/harvester-harvester/blob/HEP-ipconfig/enhancements/20251020-support-user-underlay-selection.md#test-plan