
KEP-4631: LoadBalancer Service Status Improvements, initial proposal #4632

Open
wants to merge 1 commit into
base: master

Conversation

danwinship
Contributor

  • One-line PR description: Initial proposal for KEP-4631
  • Other comments:
While updating the e2e load balancer tests after the final removal of
in-tree cloud providers, we have run into three problems:

  1. The tests have hard-coded timeouts (that sometimes differ per
     cloud provider) for deciding how long to wait for the cloud
     provider to update the service. It would make much more sense for
     the cloud provider to just provide information about its status
     on the Service object, so the tests could just monitor that.

  2. The tests recognize that not all cloud providers can implement
     all load balancer features, but in the past this was handled by
     hard-coding the information into the individual tests. (e.g.,
     `e2eskipper.SkipUnlessProviderIs("gce", "gke", "aws")`) These
     skip rules no longer work in the providerless tree, and this
     approach doesn't scale anyway. OTOH, we don't want to have to
     provide a separate `Feature:` tag for each load balancer
     subfeature, or have each cloud provider have to maintain their
     own set of `-ginkgo.skip` rules. It would be better if the e2e
     tests themselves could just figure out, somehow, whether they
     were running under a cloud provider that intends to implement the
     feature they are testing, or a cloud provider that doesn't.

  3. In some cases, because the existing tests were only run on
     certain clouds, it is not clear what the expected semantics are
     on other clouds. For example, since `IPMode: Proxy` load
     balancers can't preserve the client source IP in the way that
     `ExternalTrafficPolicy: Local` expects, should they refuse to
     provision a load balancer at all, or should they provision a load
     balancer that fails to preserve the source IP?

This KEP proposes new additions to `service.Status.LoadBalancer` and
`service.Status.Conditions` to allow cloud providers to better
communicate the status of load balancer support and provisioning, and
new guidelines on how cloud providers should handle load balancers for
services that they cannot fully support.

/assign @aojea @thockin

@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 13, 2024
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory labels May 13, 2024
@k8s-ci-robot k8s-ci-robot requested review from aojea and MikeZappa87 May 13, 2024 12:25
@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels May 13, 2024
@danwinship
Contributor Author

/sig cloud-provider
/sig testing

@k8s-ci-robot k8s-ci-robot added sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels May 13, 2024
@danwinship danwinship changed the title KEP-4631: LoadBalancer Service Static Improvements, initial proposal KEP-4631: LoadBalancer Service Status Improvements, initial proposal May 13, 2024
approach doesn't scale anyway. OTOH, we don't want to have to
provide a separate `Feature:` tag for each load balancer
subfeature, or have each cloud provider have to maintain their
own set of `-ginkgo.skip` rules. It would be better if the e2e
Member

I don't like "smart tests" that have different execution flows depending on what... The current e2e tests for load balancers are not good tests, as they try to test a lot of different things in one execution... the test should assert on a feature and a behavior; we may need to break the existing tests down to see if we still need to do this


3. In some cases, because the existing tests were only run on
certain clouds, it is not clear what the expected semantics are
on other clouds. For example, since `IPMode: Proxy` load
Member

@thockin this was an interesting finding during the implementation of this mode in cloud-provider-kind, we need to try to flesh out more details before going to GA

Contributor Author

The problem isn't with the IPMode KEP; the distinction between "proxy mode" and "VIP mode" already existed before that; it's just that before IPMode made it explicit, it was controlled implicitly by whether the LB set Hostname or IP.

(Which is to say, even if we dropped IPMode, the problem would still exist. For example, with the default cloud-provider-aws load balancers.)

Member

To this specific feature: IMO, eTP=Local means what it says. It's not an awesomely designed feature because it presumes too much, but it says "if you are forwarding external traffic, only choose a local endpoint". Proxy-ish LB impls can't really retain the client IP, regardless of eTP, that doesn't change the meaning.

Looking at API docs for it, I think we can clarify:

     // externalTrafficPolicy describes how nodes distribute service traffic they
     // receive on one of the Service's "externally-facing" addresses (NodePorts,
     // ExternalIPs, and LoadBalancer IPs). If set to "Local", the proxy will configure
     // the service in a way that assumes that external load balancers will take care
     // of balancing the service traffic between nodes, and so each node will deliver
     // traffic only to the node-local endpoints of the service, without masquerading                                                                                                
-    // the client source IP. (Traffic mistakenly sent to a node with no endpoints will
+    // the source IP. (Traffic mistakenly sent to a node with no endpoints will
     // be dropped.) The default value, "Cluster", uses the standard behavior of
     // routing to all endpoints evenly (possibly modified by topology and other
     // features). Note that traffic sent to an External IP or LoadBalancer IP from
     // within the cluster will always get "Cluster" semantics, but clients sending to
     // a NodePort from within the cluster may need to take traffic policy into account
     // when picking a node.

For a Proxy-ish LB the source IP is the proxy itself. eTP=Local should still preserve that. Whether it is useful or not is a question for end users.

Comment on lines +139 to +140
- Allow cloud providers to indicate that they are working on
provisioning load balancer infrastructure, so that
users/operators/tests can distinguish the case of "it is taking a
while for the cloud to provision the load balancer" from "the cloud
has failed to provision a load balancer" and "there is no cloud
provider so load balancers don't work".
Member

❤️

provision a particular `LoadBalancer` service, and why.

- Allow cloud providers to indicate when they have provided an
"imperfect" load balancer that the user may or may not consider to
Member

kind of degraded mode?

cc: @bowei

so this feels "un-Kubernetes-like", though at the same time, it's not
like the current Kubernetes networking configuration situation is
really great, and there has been some discussion of trying to provide
more explicit and well-defined cluster networking configuration in the
Member

yeah, this is in the "train has left the station" section :)

Would it be better to add a `Conditions` field to
`v1.LoadBalancerIngress` so that we can specify conditions per element
of `.Status.LoadBalancer.Ingress`? IOW, should it be possible for a
load balancer to express that it has multiple IPs in different states?
Member

I think we should only consider final states, and not partial ones



- The Service has `AllocateLoadBalancerNodePorts: false`, but the
cloud only supports NodePort-based load balancing.

- The Service is `ExternalTrafficPolicy: Local` but the cloud cannot
Member

I think this is not really about IP preservation but has everything to do with not implementing hCNP or some equivalent mechanism. Minor wording change requested

Contributor Author

eTP=Local means what it says

lol, I thought that in the whole "what things should be traffic policy vs what things should be topology" debate (around PreferLocal) we had agreed that eTP=Local doesn't mean what it says, because the actual intent of the feature is "preserve client source IP", not "route traffic in a particular way"; the routing is purely a side effect of making it possible to implement "preserve client source IP". (IOW an implementation of eTP:Local that preserves client IP while doing routing in an unexpected way would be compliant, but an implementation that routes "correctly" but loses client IP is not.)

I think this is not really about IP preservation but has everything to do with not implementing hCNP or some equivalent mechanism.

What I was saying in that example was, if client IP preservation is considered a mandatory-to-implement aspect of eTP:Local, then there's no point in proxy-ish LBs implementing eTP:Local at all, and thus no reason for them to implement HCNP. (Whereas for VIP-ish LBs, there is really no good argument for not implementing HCNP.)

(If we don't think that client IP preservation is required for eTP:Local, then we should just assume all LBs will implement HCNP, and they're just buggy if they don't.)

supports single-stack load balancers, so it would only be able to
serve clients of one IP family.

- The Service is `ExternalTrafficPolicy: Local` but the cloud cannot
Member

I think this one is not a warning - IPMode already indicates this, right?

Contributor Author

I think actually no. Azure has this trick where the entire cloud network is aware of the load balancer NAT state, so the LB can DNAT packets to a NodePort without masquerading them, and then the reply packet will get un-DNAT-ed correctly even though logically speaking it doesn't pass through the LB.

update, even if the value of `LoadBalancerProvisioning` remains
`False`.

#### Terminating Condition
Member

I'm not convinced this is useful - what does a user do with this information?

@danwinship danwinship force-pushed the loadbalancerstatus branch from d5813da to a02030a Compare May 23, 2024 14:18
@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 23, 2024
@danwinship danwinship force-pushed the loadbalancerstatus branch from a02030a to 795cdd5 Compare May 24, 2024 14:14
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels May 24, 2024
@danwinship
Contributor Author

Updates:

  • Updated for review comments, started filling in the rest of the template ("Test Plan", "Graduation", "Version Skew Strategy", PRR, etc)
  • Removed UNRESOLVED auto-skipping: Antonio agrees it's reasonable to have "tri-state" tests (pass/fail/skip) as long as the semantics are explicit. In some cases this will require rewriting the tests to put the "objectionable" bits at the start, so we can always skip immediately after the initial LB provisioning if the cloud doesn't support the feature.
  • The version skew section made me think about the behavior when a cloud provider is updated and finds itself in a cluster with pre-existing "bad" load balancers, and whether we should maybe add an explicit "unsupported" or "broken" condition.
  • Added a summary section to the top of the "Expected Behavior When the Cloud Provider Doesn't Know That It Can't Implement a Load Balancer" section, which I think is the big open question before this becomes implementable.
  • Added detailed notes to the e2e Test Plan section clarifying what skips will be needed for which tests. Also, while writing out the Test Plan section, I realized that to avoid regressing coverage, we're basically going to have to fork the LB tests, so we can have one hacky GCE-and-kind-only set, and one KEP-4631 set, which will initially be Alpha, but which will eventually replace the hacky ones.

@danwinship danwinship marked this pull request as ready for review June 6, 2024 19:33
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 6, 2024
@k8s-ci-robot k8s-ci-robot requested a review from thockin June 6, 2024 19:33
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 4, 2024
@thockin thockin removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 4, 2024
@soltysh
Contributor

soltysh commented Oct 1, 2024

Based on this comment from Tim, I'm assuming the work will happen for 1.32. Please don't forget to fill in the PRR, and its file (see https://github.com/kubernetes/enhancements/blob/master/keps/prod-readiness/template/nnnn.yaml) for this KEP. Feel free to assign me for that approval.

@danwinship
Contributor Author

OK, big update, mostly resolving the "when the cloud provider can't implement a feature" question. I went with service.spec.requiredLoadBalancerFeatures, which I think works best overall.

@danwinship
Contributor Author

resolving the "when the cloud provider can't implement a feature" question

(I considered dropping that and trying to push through just the LoadBalancerProvisioning and LoadBalancerServing conditions, and saving the "feature discovery" part for later, but it's convenient for clients to be able to assume that if the cloud provider sets the provisioning/serving conditions, then that means it also supports RequiredLoadBalancerFeatures. If we did just the conditions now, then we'd have to add another condition (or something) later to allow the cloud provider to signal its support for feature discovery. Though we could still do that...)


- Helpers for the new API implemented in `k8s.io/cloud-provider`

- cloud-provider-kind and cloud-provider-gcp set conditions
Member

I don't think there are people today to work on cloud-provider-gcp and review the changes, so this is not likely going to happen, and cloud-provider-kind cannot be the only thing implementing it

Contributor Author

oh... hm, that's sort of annoying if true since it means we can't really test this fully within k8s CI. We can change this to say AWS or Azure instead, but who will actually end up testing it in that case?

Member

this is the problem with Service LoadBalancer: the last changes added to status, like IPMode and the relaxed validation on multiple TCP and UDP ports (it seems only Azure), were never implemented and are not likely going to be implemented, so I don't know if this is worth the effort... in theory Gateway API is the thing that should be replacing this Service type

@soltysh
Contributor

soltysh commented Oct 8, 2024

A reminder that the PRR file is required for this PR
/hold
to make sure we don't merge that w/o it

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 8, 2024
@danwinship
Contributor Author

Yeah, I haven't done the PRR part of the KEP yet because we hadn't actually decided on the scope of the work yet. (The enhancements freeze snuck up on me, and I'm assuming I've missed it for 1.32.)

@soltysh
Contributor

soltysh commented Oct 9, 2024

Yeah, I haven't done the PRR part of the KEP yet because we hadn't actually decided on the scope of the work yet. (The enhancements freeze snuck up on me, and I'm assuming I've missed it for 1.32.)

There's still time, today and tomorrow. The freeze is 19:00 PDT Thursday 10th October 2024 😉

Contributor

@soltysh soltysh left a comment

Mostly the alpha PRR looks good, there are a few minor bits in tests (units + enablement/disablement) and that alpha criteria comment from Antonio.

- [ ] (R) Production readiness review approved
- [ ] "Implementation History" section is up-to-date for milestone
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
Contributor

Nit: make sure to check the appropriate boxes.

provisioned, because of a problem with the underlying cloud
infrastructure (explained further in `Message`).

#### The `LoadBalancerProvisioning` Condition
Contributor

Does it make sense to have both LoadBalancerProvisioning and LoadBalancerServing with Status=False, Reason=Provisioning; why isn't one of them sufficient? It seems this specific condition is more expressive, so I'd suggest dropping the latter. Having both might be confusing to users. Or, asking differently, what's the difference between the two?

Contributor Author

@danwinship danwinship Oct 9, 2024

The difference between the two is that Provisioning tells you "is the cloud provider currently trying to bring the cloud infrastructure in line with the Service definition?" and Serving tells you "is there a working cloud load balancer?"

Note that all four combinations of values are meaningful.

Also, having a separate Provisioning condition lets you kubectl wait --for condition=LoadBalancerProvisioning=False (ie, wait until the cloud provider has either succeeded in provisioning a load balancer or else declared failure), which you couldn't do if the provisioning was a "step" of the Serving condition. (Well, you'd have to use --for jsonpath='...' with some expression to find a LoadBalancerServing condition (of any Value) whose Reason isn't Provisioning.)

All that said, Tim had the same objection so either I'm explaining this badly, or else no one else thinks the distinction is useful...
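The four combinations referred to above can be written out explicitly. This is a sketch: the condition names are the KEP's proposed ones, but the helper function and the wording of each state are an illustrative paraphrase of this thread, not KEP text.

```go
package main

import "fmt"

// describe maps the two condition values to the state a client could
// infer. All four combinations are meaningful; the wording is an
// illustrative paraphrase, not the KEP's own.
func describe(provisioning, serving bool) string {
	switch {
	case provisioning && !serving:
		return "cloud is working on it; no working LB yet"
	case provisioning && serving:
		return "LB is working; an update is being applied"
	case !provisioning && serving:
		return "LB is working and up to date"
	default:
		return "no working LB, and the cloud is not working on one"
	}
}

func main() {
	// The state that `kubectl wait --for condition=LoadBalancerProvisioning=False`
	// waits to leave:
	fmt.Println(describe(true, false))
}
```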

Contributor

I've seen that text, and now it makes more sense, thank you. What confused me was that in this section you've used Provisioning and Serving which I assumed are the reason, not the whole condition. What would make it explicitly clear is to use the full name of the condition in that text.

Member

I am fine with them, after reading that response, but I still want clarity on Serving (see above)


There will be a few unit tests for validation/defaulting of the new
API, but this feature can really only be tested via e2e tests, since
it depends on external components.
Contributor

Can you list the packages with their coverage, following the template.


###### Are there any tests for feature enablement/disablement?

TODO (we should make sure the cloud provider updates pre-existing
Contributor

This answer is required for alpha.

Pick one more of these and delete the rest.
-->

- [ ] Metrics
Contributor

Not required for alpha, but it's nice to think through those.

###### Will enabling / using this feature result in increasing size or count of the existing API objects?

It will add 2 or 3 new conditions to Services of `Type: LoadBalancer`, and
one new optional spec field.
Contributor

Nit: can you estimate the size?

- the first Kubernetes release where an initial version of the KEP was available
- the version of Kubernetes where the KEP graduated to general availability
- when the KEP was retired or superseded
-->
Contributor

This should be filled in.

Why should this KEP _not_ be implemented?
-->

## Alternatives
Contributor

Would be nice to have this.

Member

@thockin thockin left a comment

I note that this does not have a PRR file.

Sending first-half notes now, will finish reading in a bit

balancer is handling connections, and `False` if it is not. (There are
no defined semantics for an `Unknown` value of this condition.)

(This means that at construction time, the `LoadBalancerServing`
Member

Pedantically, the .Status.LoadBalancer field means "I have an IP allocated". It doesn't mean that traffic is flowing or that the IP is reachable yet.

For example, it may take seconds or even minutes for infrastructure to actually start receiving and routing traffic.

Does "LoadBalancerServing" mean that we have provisioned the LB or that someone has confirmed that the traffic is flowing (from where? what about LB source ranges?)

Does this line up with Gateway's conditions?

Contributor Author

Somewhere else I claim that the e2e tests will expect to be able to connect immediately upon LoadBalancerServing becoming True (though we could change that).

I think my intuition was "Serving means that the cloud provider believes that the LB is accepting/redirecting connections". It definitely doesn't imply that the cloud provider has actually tried making a connection. (In addition to source ranges, there's the problem that maybe the cluster end of the service is currently not-ready, which shouldn't affect the LB condition.)

If there is internal status information that tells the cloud provider that the LB has been allocated but is not yet actually serving, then the cloud provider should not mark the service as LoadBalancerServing until the internal status information says it's serving.

If there's an allocated-but-not-yet-serving state, but the cloud provider doesn't have any way of figuring that out other than by trying to connect to the LB, then I guess it needs to call that state "serving"...

Contributor Author

Does this line up with Gateway's conditions?

No... Hm... I must have been looking at an old version of the Gateway API before because I couldn't find any relevant conditions. But it looks like Gateway currently has:

  • Accepted: "The resource is valid enough to produce some configuration in the underlying data plane", which includes "Features being used by the resource are supported". But also "sufficient capacity exists on underlying infrastructure for the Gateway to be provisioned"
  • Ready: (Extended conformance) "traffic is ready to flow through the data plane immediately, not at some eventual point in the future"
  • Programmed: "configuration has been sent to the data plane, and should be ready soon."

The fact that Accepted implies that you won't get an error from the cloud while provisioning the LB implies to me that you can't set Accepted to True until the cloud has returned at least a provisional success status?

In which case it's not clear to me what the difference between Accepted and Programmed is...

way, the cloud provider must update at least the `LastTransitionTime`
of the `LoadBalancerServing` condition. (For example, if the user
changes the service's `LoadBalancerSourceRanges`, and the cloud
provider updates the load balancer for this, it must update the
Member

This is not generally a requirement of Conditions - I get why that might be valuable information but it seems wrong to overload "LastTransitionTime" to mean anything but "the last time this changed state"

Contributor Author

@danwinship danwinship Oct 11, 2024

ok. (I was thinking this could potentially help with the e2e tests but I don't actually have any specific use for it.)


object, and `False` if it is not. (There are no defined semantics for an
`Unknown` value of this condition.)

Any time any load-balancer-related field (or provider-specific
Member

other fields are not really timestamped, so unless you know what time you made a change, this semantic is meaningless. If you neeeeed to know this, we should set something atomically in the REST handler and force providers to clear it (same as we are doing for in-place pod updates)


When a load balancer Service is first created (or a non-load-balancer
Service is converted to `Type: LoadBalancer`), one of three things
should immediately happen:
Member

how immediately? If it takes 10 seconds for the controller to act, is that OK?

Contributor Author

Why would the controller take 10 seconds to act?

The intent is "the controller should just do this before it starts doing anything potentially-time-consuming".

In the e2e tests we currently assume that if you make a change to a pod or service, the endpointslice controller will act on that change "immediately" and the test will be able to observe a change to the corresponding EndpointSlice object within 30 seconds. I was planning to use the same requirement for cloud providers setting the initial conditions.

Member

The loadbalancer controller interacts with the cloud API and infra; these calls to the cloud APIs used to be synchronous and block the workers ... on very large clusters this can take ~1 hour per LoadBalancer.
In our scale tests for load balancers we have to define at least as many workers in the loadbalancer controller as the number of load balancers that will be created in parallel in the tests; otherwise, you are not able to process any event until some worker is available.

Contributor Author

This KEP would (necessarily) make that architecture "illegal". One of the major points of the conditions is to be able to distinguish between "no cloud provider is running" and "the cloud provider is taking an hour to allocate your load balancer" (and "the cloud provider already determined that it's not going to be able to allocate your load balancer for some reason"). So the cloud provider needs an informer event loop that doesn't block on the cloud API.

Member

there is no need to change any architecture at all; it's just that the conditions need to be updated before the event gets to the worker ...

because not all cloud providers will support the new conditions.
However, since supporting these conditions will be a prereq for
running the `[Feature:LoadBalancer]` e2e tests, we assume that
eventually all cloud providers will support them.)
Member

This sounds suspiciously like a breaking change

Contributor Author

none of the tests are conformance and we don't have any stability guarantees around e2e tests.

But that was written when I had hoped to get this into 1.31, and given that that obviously didn't happen, it's probably better to leave the [Feature:LoadBalancer] tag with its current meaning ("run LB tests under the assumption that the cloud is cloud-provider-gcp; you can manually skip things if you aren't"), and add a new tag for the new ones.

Member

@thockin thockin left a comment

I let this slip, and now it seems too late, but I am OK with the vast majority of this proposal.

/approve

running the `[Feature:LoadBalancer]` e2e tests, we assume that
eventually all cloud providers will support them.)

When the user updates load-balancer-relevant fields in a Service that
Member

I assume LB impls should also remove these if the service is updated in a way that removes the LB?

requested session affinity but the cloud provider has provisioned
a load balancer that does not support it.

- `Status=True, Reason=ExternalTrafficPolicyNotSupported`: The
Member

I'm not sure on this one. IPMode=Proxy already covers this, no?

Contributor Author

IPMode=VIP is necessary for implementing ExternalTrafficPolicy=Local but strictly speaking it's not sufficient. (The source IP could be lost for some other reason.)

Also, IPMode was intended as cloud-provider-to-service-proxy communication, while the conditions are cloud-provider-to-end-user communication. So even if they're technically redundant, they exist for different reasons?

I agree that it's not clearly necessary though. I could drop it.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@aojea
Member

aojea commented Oct 11, 2024

I let this slip, and now it seems too late, but I am OK with the vast majority of this proposal.

/approve

/hold

The higher risk here is adding fields that nobody implements, @danwinship and I talked offline, and Dan has a good idea to simplify this proposal that can achieve the goals and solve the issues detailed here

Comment on lines +222 to +223
- `LoadBalancerServing`, to indicate explicitly that the load
balancer is (or isn't) serving traffic.
Member

I kind of prefer LoadBalancerReady here, no strong opinion

@danwinship
Contributor Author

danwinship commented Dec 10, 2024

I need to get back to this but Antonio and I had sort of decided

  1. We should redesign it with an eye toward being able to implement it provider-agnostically in k8s.io/cloud-provider.
    • It should be possible to at least implement the "provisioning" status there, and probably the "serving" status.
    • Having good Reason values in the case of provisioning failure will probably require adding some new well-known error values.
  2. Don't worry about the feature-detection aspects of it for now.

@aojea
Member

aojea commented Dec 10, 2024

agree; with the reduced scope it could also be a good project for a new contributor who wants to drive it through the KEP process ...

Contributor

@soltysh soltysh left a comment

Comments from last PRR review still hold, don't forget the PRR 4631.yaml file.


# The milestone at which this feature was, or is targeted to be, at each stage.
milestone:
alpha: "v1.32"
Contributor

Update the versions

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.