Add support for TCP_UDP to NLB TargetGroups and Listeners #2275
base: main
Conversation
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Welcome @amorey!
Hi @amorey. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: amorey. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
@amorey Thanks for taking this on! I'm going to try to test it out 🎉 In the meantime, any chance you could sign the CLA? 🙏🏼
@Yasumoto Just signed the CLA. Let me know if you have any feedback!
Hm, it looks like the NLB was created successfully, but I think I'm running into aws/containers-roadmap#512. I'm getting this error (slightly edited for brevity) when listening on the port.
Which I think is from a gated validation check. I'm running on EKS, so not entirely sure if I can tweak the right knobs to turn that on, so suggestions appreciated! 🙇🏼
[edit] And to be clear, this does look like it worked correctly! Curious how the
Yep, looks like it's very new, and disabled by default in upstream 1.20/1.21. Just to confirm, I'm trying to see what my EKS
Ahh sorry, I forgot to mention that you have to enable the feature gate. What's the next step?
The failure log came from an update. I assume you added the port during the update operation? I'm wondering if the AWS Load Balancer Controller somehow saw the updated structure before the update passed validation, acted on it, and did not see the rollback, so did not remove/revert the NLB. If the feature gate is disabled, we should never see such Services.

A similar validation path is used, for example, if you remove all the ports from a Service, so it might be useful to confirm that if you do that in an update, the AWS LB Controller sees that change, in which case it's a lower-level issue which should be pursued separately.

Apart from that weirdness, I'd say this PR needs to be merged before EKS enables that feature gate (if they agree to do so while it's Alpha), as the feature gate's job is to reject such Services before the supporting infrastructure is ready. It won't go Beta (i.e. enabled by default) until a later release, and without this PR, I don't see any code here (or in the Cloud Provider AWS version) that would reject a mixed-protocol LB; instead I think it would end up generating a bad request for an LB with both TCP and UDP listeners on the same port, which would be "improper" non-support.

That suggests the other thing that should be done before that feature gate goes Beta is adding code to handle and reject such Services to the Cloud Provider AWS implementation of LB support, ideally directing people to use this Controller instead for that and other neat features.
Previously, aws-load-balancer-controller ignored extra overlapping ServicePorts defined in the Kubernetes Service spec if the external port numbers were the same even if the protocols were different (e.g. TCP:53, UDP:53). This behavior prevented users from exposing services that support TCP and UDP on the same external load balancer port number.

This patch solves the problem by detecting when a user defines multiple ServicePorts for the same external load balancer port number but using TCP and UDP protocols separately. In such situations, a TCP_UDP TargetGroup and Listener are created and SecurityGroup rules are updated accordingly. If more than two ServicePorts are defined, only the first two mergeable ServicePorts are used. Otherwise, the first ServicePort is used.
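As a rough, self-contained illustration of the merge rule described above (simplified stand-in types and a hypothetical `mergePorts` helper, not the PR's actual code):

```go
package main

import "fmt"

// ServicePort is a simplified stand-in for corev1.ServicePort so this
// sketch is self-contained; field names mirror the real type.
type ServicePort struct {
	Protocol string // "TCP", "UDP", "SCTP", or the merged "TCP_UDP"
	Port     int32  // external load balancer port
	NodePort int32
}

// mergePorts sketches the described behaviour: two ServicePorts that share
// an external port and NodePort but use TCP and UDP respectively collapse
// into one TCP_UDP entry; further overlapping ServicePorts are ignored.
func mergePorts(ports []ServicePort) []ServicePort {
	mergeable := map[string]bool{"TCP": true, "UDP": true}
	var out []ServicePort
	index := map[int32]int{} // external port -> position in out
	for _, p := range ports {
		i, dup := index[p.Port]
		switch {
		case !dup:
			index[p.Port] = len(out)
			out = append(out, p)
		case mergeable[p.Protocol] && mergeable[out[i].Protocol] &&
			p.Protocol != out[i].Protocol && p.NodePort == out[i].NodePort:
			out[i].Protocol = "TCP_UDP" // only the first two mergeable ports are used
		}
		// otherwise: extra overlapping ServicePorts are dropped
	}
	return out
}

func main() {
	merged := mergePorts([]ServicePort{
		{Protocol: "TCP", Port: 53, NodePort: 30053},
		{Protocol: "UDP", Port: 53, NodePort: 30053},
		{Protocol: "TCP", Port: 443, NodePort: 30443},
	})
	for _, p := range merged {
		fmt.Printf("%s:%d\n", p.Protocol, p.Port)
	}
}
```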
```go
if _, ok := mergeableProtocols[port.Protocol]; !ok {
	continue
}
if port.NodePort == port0.NodePort && port.Protocol != port0.Protocol {
```
I think that if this is an nlb-ip LB, `NodePort` is the wrong test, as the code in model_build_target_group.go will be using the `TargetPort` as its destination, e.g. https://github.com/kubernetes-sigs/aws-load-balancer-controller/blob/main/pkg/ingress/model_build_target_group.go#L59-L62. The code that processes the annotations and determines the target type is happening much later (the call to `buildTargetGroup` via `buildListener`, immediately after this method is called), so addressing this might require pulling the port-merging to later, or the annotation-reading earlier.
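The concern above could be sketched as follows; `mergeKey` and its arguments are hypothetical names, illustrating only that the port to compare depends on the target type, which isn't known until the annotations are parsed:

```go
package main

import "fmt"

// mergeKey is a hypothetical illustration: for ip-target LBs the pod's
// targetPort is the real destination, so it should be the merge key; for
// instance-target LBs, the NodePort is.
func mergeKey(targetType string, nodePort, targetPort int32) int32 {
	if targetType == "ip" {
		return targetPort // ip targets route straight to the pod's targetPort
	}
	return nodePort // instance targets route via the NodePort
}

func main() {
	fmt.Println(mergeKey("ip", 30053, 5353), mergeKey("instance", 30053, 5353))
}
```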
IP-target mode is going to introduce a difficult edge case:
- The target port can be named.
- The port names are unique in the service, i.e. a TCP and UDP port will have different names, even if they end up at the same port number on the target Pod.
- Different pods can have different numeric values for the same target port.

Hence, it's possible to produce the situation where some of the pods have their TCP and UDP listeners on the same port, and some of the pods have their TCP and UDP listeners on different ports.

At this level, a target port equality test would never pass (Edit: I had the logic backwards earlier), as it'd be seeing the names; numeric lookup for named ports is done later (`(m *defaultNetworkingManager) computeNumericalPorts` for bindings, and implicitly by the Endpoints, consumed by `(m *defaultResourceManager) registerPodEndpoints`, I think).

So I don't see a way to identify that any pair of named target ports in ip-target mode can be merged without some further hint from the user that, e.g., `dns-udp` and `dns-tcp` targetPorts will be the same port when resolved on all Pods in the Service; the same service port might be a good hint, but maybe not reliable?
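A hedged sketch of the per-pod check this would require; `Pod`, `canMergeNamedPorts`, and the port names are all illustrative stand-ins, not controller APIs:

```go
package main

import "fmt"

// Pod is a minimal stand-in: a map from container port name to number.
type Pod struct {
	Name  string
	Ports map[string]int32
}

// canMergeNamedPorts sketches the missing check described above: two named
// targetPorts (e.g. "dns-tcp" and "dns-udp") may only be merged into one
// TCP_UDP target group if they resolve to the same number on every pod
// backing the Service.
func canMergeNamedPorts(pods []Pod, tcpName, udpName string) bool {
	for _, pod := range pods {
		tcp, okT := pod.Ports[tcpName]
		udp, okU := pod.Ports[udpName]
		if !okT || !okU || tcp != udp {
			return false
		}
	}
	return len(pods) > 0
}

func main() {
	pods := []Pod{
		{Name: "dns-0", Ports: map[string]int32{"dns-tcp": 5353, "dns-udp": 5353}},
		{Name: "dns-1", Ports: map[string]int32{"dns-tcp": 5353, "dns-udp": 1053}},
	}
	// dns-1 resolves the two names to different numbers, so the merge is unsafe.
	fmt.Println(canMergeNamedPorts(pods, "dns-tcp", "dns-udp"))
}
```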
It should be possible for the logic in `(m *defaultResourceManager) registerPodEndpoints`, and the `defaultEndpointResolver` APIs it calls, to validate that when given a numeric port, for TCP_UDP two ports exist on the service, and for non-TCP_UDP, the right one is seen (which means adding the protocol to the ServiceRef in TargetGroupBinding, as right now it's not specified whether the numeric port is TCP or UDP).

When given a named port for TCP_UDP (if a way can be found to reliably match/merge them), it really should be a pair of names (as the TCP and UDP ports will have different names) and validate that both exist on the service. That suggests that ServiceRef in TargetGroupBinding would need to be able to list multiple ports.

And then for TCP_UDP, exclude any resulting endpoints where the resulting port for TCP and UDP are different. Right now that won't be checked: for a numeric port, it'll take the first port name that has the right port number (irrespective of TCP or UDP), and for a named port, only one name is currently preserved through the pipeline.
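A minimal sketch of the endpoint-level exclusion suggested above, with illustrative stand-in types (not the controller's actual `registerPodEndpoints` logic):

```go
package main

import "fmt"

// PodEndpoint is a simplified stand-in for a resolved pod endpoint.
type PodEndpoint struct {
	IP      string
	TCPPort int32
	UDPPort int32
}

// filterTCPUDP sketches the suggested validation: for a TCP_UDP target
// group, drop any endpoint whose TCP and UDP target ports resolved to
// different numbers, since one target group port can't represent both.
func filterTCPUDP(eps []PodEndpoint) []PodEndpoint {
	var out []PodEndpoint
	for _, ep := range eps {
		if ep.TCPPort == ep.UDPPort {
			out = append(out, ep)
		}
	}
	return out
}

func main() {
	eps := filterTCPUDP([]PodEndpoint{
		{IP: "10.0.0.1", TCPPort: 5353, UDPPort: 5353},
		{IP: "10.0.0.2", TCPPort: 5353, UDPPort: 1053},
	})
	// only the pod whose two ports resolve to the same number survives
	fmt.Println(len(eps))
}
```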
The simplest way forward, of course, is to initially limit TCP_UDP support to instance-target NLBs pending further design work. That still suffers from my first comment here: we don't know at this point that it's an instance-target NLB.
Just to throw another small spanner in the works: `spec.allocateLoadBalancerNodePorts` may allow the user to make this value (`port.NodePort` and `port0.NodePort`) be 0. That only makes sense for ip-target NLBs though, as instance-target NLBs rely on a NodePort. I'm not sure if the AWS Load Balancer Controller is going to be able to override that setting for instance-target NLBs, but a NodePort equality test should probably never consider two `NodePort == 0` ports as equal, to avoid making an invalid situation worse.
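A tiny sketch of the suggested guard (the helper name is hypothetical):

```go
package main

import "fmt"

// nodePortsMergeable sketches the guard suggested above: with
// spec.allocateLoadBalancerNodePorts=false both NodePorts can be 0, and two
// unset (0) NodePorts must never be treated as equal when deciding to merge.
func nodePortsMergeable(a, b int32) bool {
	return a != 0 && a == b
}

func main() {
	fmt.Println(nodePortsMergeable(30053, 30053), nodePortsMergeable(0, 0))
}
```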
```go
	continue
}
if port.NodePort == port0.NodePort && port.Protocol != port0.Protocol {
	port0.Protocol = corev1.Protocol("TCP_UDP")
```
I can't add a comment there, but will this require flipping the test `tgProtocol != elbv2model.ProtocolUDP` in `buildListenerSpec` for TLS support to be `tgProtocol == elbv2model.ProtocolTCP`? (Not sure why the existing test would allow SCTP to be turned into TLS, but perhaps `!= UDP && != TCP_UDP` is more consistent with current behaviour.) Otherwise a merged TCP_UDP service would be transformed destructively (silently losing the UDP port) into `elbv2model.ProtocolTLS` if provided certificate ARNs and it matched the SSL ports.

In fact, it might be better to throw an error if that case happens, as the merge is between a TCP service that should be converted to a TLS service, and a UDP service that cannot be so converted, and I don't see anything in the docs that suggests a TLS_UDP protocol option, or a TLS_DTLS protocol option, which would be even nicer.
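A sketch of the flipped test being proposed, with the error option included; the function name and signature are illustrative, not `buildListenerSpec`'s actual code:

```go
package main

import "fmt"

// listenerProtocol sketches the proposed guard: only upgrade to TLS when
// certificates match the port and the target group protocol is neither UDP
// nor TCP_UDP, so a merged TCP_UDP listener never silently loses its UDP half.
func listenerProtocol(tgProtocol string, hasCertsForPort bool) (string, error) {
	if !hasCertsForPort {
		return tgProtocol, nil
	}
	switch tgProtocol {
	case "UDP":
		return tgProtocol, nil // UDP can't carry TLS; leave it untouched
	case "TCP_UDP":
		// arguably an error: the TCP half wants TLS, the UDP half can't have it
		return "", fmt.Errorf("cannot apply TLS to a merged TCP_UDP listener")
	default:
		return "TLS", nil
	}
}

func main() {
	p, _ := listenerProtocol("TCP", true)
	fmt.Println(p)
}
```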
TLS and UDP are on different levels. So that seems like a non-problem.
```go
Ports: []elbv2api.NetworkingPort{
	{
		Protocol: &networkingProtocolTCP,
		Port:     &port80,
	},
	{
		Protocol: &networkingProtocolUDP,
		Port:     &port80,
	},
},
```
Looking at this example, I notice that `NetworkingProtocolTCP_UDP` has been added to the TargetGroupBindings type-list, but we haven't actually supported it; instead we immediately break it up into a pair of ports, TCP and UDP, when creating a TargetGroupBinding resource for the LB in `buildTargetGroupBindingNetworking`.

This might catch out users who're directly creating their own TargetGroupBinding resources, as the spec points back to that type, although the docs don't actually list the possible values (but probably should, as `TargetType` does for example).

It would be more consistent to either not add `NetworkingProtocolTCP_UDP` (as it's only used briefly inside `buildTargetGroupBindingNetworking`), or fully support it in `computePermissionsForPeerPort` in pkg/targetgroupbinding/networking_manager.go, which (on brief glance) appears to be the only place that cares about this ports list.

The former is certainly easier, and requiring such users, if using a TCP_UDP target group, to specify port lists for both TCP and UDP separately doesn't seem like a great hardship. That would also simplify `buildTargetGroupBindingNetworking` by merging the two switch statements.
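A sketch of the break-up behaviour described above, using a simplified stand-in for `elbv2api.NetworkingPort` (illustrative, not the controller's code):

```go
package main

import "fmt"

// NetworkingPort is a simplified stand-in for elbv2api.NetworkingPort.
type NetworkingPort struct {
	Protocol string
	Port     int32
}

// expandTCPUDP sketches what buildTargetGroupBindingNetworking is described
// as doing: a TCP_UDP port is broken up into a TCP entry and a UDP entry
// before it reaches the TargetGroupBinding, so TCP_UDP itself never needs
// to be handled downstream.
func expandTCPUDP(ports []NetworkingPort) []NetworkingPort {
	var out []NetworkingPort
	for _, p := range ports {
		if p.Protocol == "TCP_UDP" {
			out = append(out, NetworkingPort{"TCP", p.Port}, NetworkingPort{"UDP", p.Port})
			continue
		}
		out = append(out, p)
	}
	return out
}

func main() {
	fmt.Println(expandTCPUDP([]NetworkingPort{{"TCP_UDP", 80}}))
}
```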
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
Some public attention to
+1 for this feature. Some info for our use case:
Before this feature lands, has anyone found an easy workaround, please?
As a workaround, you might be able to use an externally-managed TCP_UDP LB and bind it to a non-LB Service; see https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/use_cases/self_managed_lb/ and https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/targetgroupbinding/targetgroupbinding/. The same suggestion was made in #2759 (comment) by someone who knows more about the AWS LB Controller, so it should work. However, based on this PR's implementation, you might need to explicitly create two
I think in terms of this PR, having
Any progress?
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
Any progress?
@amorey Any progress on this feature? I believe many people are waiting for this feature to be released.
@tylern91 Sorry, I haven't looked at this in a while. It would have taken a while to implement the reviewer's suggestions, so I kept using my custom version, and since then it's fallen behind master. If someone else wants to pick up the baton I'd be happy to help. Otherwise, I'm not sure I'll have the time to work on this for a while.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
@oliviassss - is this something you could help us with? Not sure why this issue appears to be abandoned. Being able to support HTTP3 is probably a good reason to prioritise getting this ticket done.
Any progress on this? Is help needed?
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
Issue
#1608 (comment)
Checklist
- Documentation added/updated as required (the README.md, or the docs directory)

BONUS POINTS checklist: complete for good vibes and maybe prizes?! 🤯