[Conformance]: Adds GatewayClass SupportedVersion Test #3368
The change basically lgtm, but you'll need to remove the "fixes ..." from the commit message @danehans; Prow doesn't let us do that in commit messages.
b073179 to b40de38
Commit b40de38 removes "Fixes..." from the commit message. Thanks for the review @youngnick.
b40de38 to 7463a92
Thanks @danehans, does this test pass on any implementations?
@arkodg thanks for the review. Yes, this PR was tested with solo-io/gloo#10140 for Gloo Gateway:
Thanks for working on this @danehans!
Adds a test to ensure implementations conform to the GatewayClass SupportedVersion status condition.
Signed-off-by: Daneyon Hansen <[email protected]>
thanks !
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: arkodg, danehans
Thanks @danehans!
}

// Ensure the SupportedVersion is false
kubernetes.GWCMustHaveSupportedVersionConditionFalse(t, s.Client, s.TimeoutConfig, gwc.Name)
Some implementations like GKE will ~immediately overwrite this CRD with the original version. Can you add that as an acceptable outcome of this test?
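One way to read the request above is that the test would pass in either of two cases: the implementation reports the condition as false, or the platform has already restored the original CRD. The sketch below is only an illustration of that reading. The `observation` type, the `acceptable` function, and the placeholder version strings are hypothetical and are not part of the conformance suite.

```go
package main

import "fmt"

// observation is a hypothetical snapshot taken after the test rewrites the
// CRD's bundle-version annotation to an unsupported value.
type observation struct {
	ConditionStatus string // status of the SupportedVersion condition
	BundleVersion   string // bundle-version annotation now on the CRD
}

// acceptable reports whether an observation is a passing outcome under the
// two cases described above (an assumption, not the merged test logic).
func acceptable(obs observation, originalVersion string) bool {
	// Case 1: the implementation noticed the unsupported version.
	if obs.ConditionStatus == "False" {
		return true
	}
	// Case 2: the platform restored the original CRD, so the condition
	// legitimately stays at (or returns to) True.
	return obs.BundleVersion == originalVersion && obs.ConditionStatus == "True"
}

func main() {
	original := "v1.2.0"
	fmt.Println(acceptable(observation{"False", "v0.0.0-unsupported"}, original))
	fmt.Println(acceptable(observation{"True", original}, original))
	fmt.Println(acceptable(observation{"True", "v0.0.0-unsupported"}, original))
}
```

In a real test this check would presumably sit inside a polling loop bounded by the suite's timeout configuration, since both outcomes are eventually consistent.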
Features: []features.FeatureName{
features.SupportGateway,
},
Description: "A GatewayClass should set the SupportedVersion condition based on the presence and version of Gateway API CRDs in the cluster",
I'm not sure this makes sense.
Suppose I build an implementation that supports v1.1.
v1.2 comes out. Per the spec, even if I am willing to support it, I must set the condition to "false".
This means I cannot pass this conformance test for 1.2 until I upgrade to 1.2, which doesn't actually seem desired?
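To make the scenario above concrete, here is a minimal sketch of how an implementation might derive the condition from the installed bundle version (in Gateway API this is published on the CRDs via the `gateway.networking.k8s.io/bundle-version` annotation). It assumes the strict reading from the comment above, where only a matching major and minor version counts as supported. The function names are hypothetical, and a real implementation may well choose a more lenient policy.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMajorMinor extracts the major and minor numbers from a bundle
// version string such as "v1.2.0".
func parseMajorMinor(v string) (int, int, error) {
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("malformed version %q", v)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("malformed major in %q: %w", v, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, fmt.Errorf("malformed minor in %q: %w", v, err)
	}
	return major, minor, nil
}

// supportedVersionStatus returns the status and reason for the
// SupportedVersion condition. Assumption: "supported" means the installed
// major.minor equals the one the implementation was built against.
func supportedVersionStatus(installed, builtAgainst string) (status, reason string) {
	iMajor, iMinor, err := parseMajorMinor(installed)
	if err != nil {
		return "False", "UnsupportedVersion"
	}
	bMajor, bMinor, err := parseMajorMinor(builtAgainst)
	if err != nil {
		return "False", "UnsupportedVersion"
	}
	if iMajor == bMajor && iMinor == bMinor {
		return "True", "SupportedVersion"
	}
	return "False", "UnsupportedVersion"
}

func main() {
	// The implementation targets v1.1 and the cluster has v1.2 installed.
	status, reason := supportedVersionStatus("v1.2.0", "v1.1.0")
	fmt.Println(status, reason) // prints "False UnsupportedVersion"
}
```

Under this policy the v1.1 implementation reports false as soon as the v1.2 CRDs are installed, which is the outcome the comment objects to.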
}

// Ensure the SupportedVersion status condition is false
kubernetes.GWCMustHaveSupportedVersionConditionFalse(t, s.Client, s.TimeoutConfig, gwc.Name)
Do we really expect implementations to have to watch CRDs and constantly reconcile the GatewayClass? I get how we got to this point, but it feels like we are building APIs in isolation from considerations about user needs and implementation complexity.
Users are unlikely to ever have an incompatibility and somehow realize they should check a random object (that they don't even create, so why would they know to look at it?).
The cost to implement this, however, is extremely high...
FWIW the api-machinery folks have, historically, advised against an implementation reading CRDs at all or even assuming an API is defined by a CRD (vs another mechanism).
I realize this has not much to do with the test... but I missed the PR adding the condition
I tend to agree with @howardjohn about this feature: it looks like we are tightly coupling the implementation's internal dependency with the installed Gateway API version. In case Implementation X uses Gateway API 1.3 as a dependency, and Gateway API 1.4 gets released, if a user performs an upgrade of the API before the implementation adoption of that specific version, the GatewayClass gets marked as not accepted anymore.
This test adds conformance for this section: https://gateway-api.sigs.k8s.io/guides/api-design/?h=supportedversi#supported-api-versions. If this is undesirable, then the section should be modified.
Yes, my concern is mostly related to the feature itself, not to this specific conformance test.
The docs don't say that the GatewayClass needs to not be accepted, just that the unsupportedVersion Condition needs to be added. The intent here is to give implementations a way to say "hey, you're running with a version I don't support, maybe check that before logging a support issue". That is, it's intended to be a feature that serves both users and maintainers by giving users a way to check that they are doing supported things. @howardjohn I'm certainly ready to hear feedback on how you think that could be achieved without doing the things documented in the test. Or if you think that the goal is not worthwhile, I'd be interested to hear why.
The intent makes sense at a high level, I am just not sure it makes sense in practice. It's very easy to come up with ideas to put an ever-increasing list of bits of information into an object's status. But we should do so with extreme caution -- each one adds real risk, complexity, and scale limitations. A slippery-slope strawman is we start forcing implementations to start putting their entire documentation in the status (which is not far off from the SupportedFeatures field!). If you look at Kubernetes core, you can see they are very conservative in ensuring each status contains only critical information. The user journeys for this field don't make sense. Which user is going to know to look at a GatewayClass to find this field? The cost to implement this is high for an implementation. API Machinery folks have consistently strongly recommended against controllers assuming an API is created by a CustomResourceDefinition, and instead recommended relying on API server discovery and the standardized REST APIs to consume APIs; this conformance test strictly enforces reading CRDs.
Apologies in advance: I'm intending this as an actual question, but I expect that it may sound overly snarky, which isn't my intent. That said -- what's the intended user experience here, and which role are we designing for? I can think of a couple of possibilities:
So... overall, I feel like neither of these holds together all that well, which makes me wonder what I'm missing. 🙂
It's slightly worse than your examples, actually -- the status is on GatewayClass, not Gateway, which I feel is less likely to be looked at.
😂 🤦‍♂️ and I even read the chunk of the doc before writing those out... clearly time for me to knock off work for the day! But you're right: that makes things weirder still.
The idea behind all of this is to teach users to check the GatewayClass to find out what their controller supports (this is the point of the The original purpose of all of this is to at least include some information about relevant things that didn't require going out to the implementation's docs site to find the details. @howardjohn, it seems like you don't agree with that goal? Or are you worried about the asymptotic case of having too much information in the GatewayClass status? It seems to me that our options here are as follows:
I have a few issues:
If I was the BDFL of the project I would personally just remove the condition entirely. I can see the merit in saying "it's extended", though I could also argue it's a bit questionable. There are an infinite set of APIs we could add as "extended" because "some implementation may want it"; we should probably be just as cautious in adding extended APIs as core ones, with the main distinction that something is extended IFF it cannot be implemented by everyone. In this case, I would imagine everyone can implement it. Additionally, if it's only implemented across some implementations, it makes it more likely that users will not read GatewayClass status, which makes the feature less useful.
@howardjohn, I know you don't like GatewayClass because it's cluster scoped, but the intent behind GatewayClass was always:
This Condition is an extension of this idea, it's a way to signal to users (probably Ian and Chihiro rather than Ana) that an implementation may not fully support a particular version. It's not intended to be a be-all-and-end-all, but a breadcrumb for users to follow. I think that having some way to signal to Ian or Chihiro that the CRD update you just installed may not work for every implementation would be a very useful feature, personally, that would save having to go and find the correct version matrix page on every implementation you have in your cluster. Regarding watching CRD objects, I'd love to have a way not to do that, but currently, the versioning problem means that we don't. There's no way, aside from checking the CRD, for an implementation to know what version of the CRD is installed. This is fine for single-implementation CRDs, because generally you install the CRDs and the implementation in the same operation, and there's a tight coupling between versions. But this breaks down for a community CRD like this one. I've asked in #api-machinery on Kube Slack for some more guidance here.
Trying to catch up on this thread, I think there are two conflicting ideas at play here:
I think it may be helpful to consider an alternative here where implementations have a way to communicate via GWC status the range of versions they support, and then tools like gwctl or a theoretical future CRD installation tool can warn when there are incompatible versions. Spitballing here, but this could involve a I think I agree with @howardjohn that many/most users are unlikely to look at GatewayClass status, but I think that we can build useful tooling that does, and that could potentially dramatically improve the overall UX of working with Gateway API.
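The alternative described above (a supported version range published in GatewayClass status, consumed by tooling) could look roughly like the sketch below. The thread does not settle on any field names or shapes, so the `supportedRange` type, the `compatibilityWarning` function, and the class name used in `main` are all hypothetical and only illustrate the kind of check a tool might run.

```go
package main

import "fmt"

// version holds the major and minor parts of a Gateway API bundle version.
type version struct{ Major, Minor int }

// parseVersion reads strings such as "v1.3.0"; the patch part is ignored.
func parseVersion(s string) (version, error) {
	var v version
	if _, err := fmt.Sscanf(s, "v%d.%d", &v.Major, &v.Minor); err != nil {
		return version{}, fmt.Errorf("malformed version %q: %w", s, err)
	}
	return v, nil
}

// less reports whether v sorts before o.
func (v version) less(o version) bool {
	return v.Major < o.Major || (v.Major == o.Major && v.Minor < o.Minor)
}

// supportedRange is a hypothetical stand-in for a range that an
// implementation could publish in GatewayClass status.
type supportedRange struct{ Min, Max version }

// compatibilityWarning returns a non-empty message when the installed
// version falls outside the range advertised by the named GatewayClass.
func compatibilityWarning(class, installed string, r supportedRange) (string, error) {
	v, err := parseVersion(installed)
	if err != nil {
		return "", err
	}
	if v.less(r.Min) || r.Max.less(v) {
		return fmt.Sprintf("GatewayClass %s supports v%d.%d through v%d.%d, but v%d.%d is installed",
			class, r.Min.Major, r.Min.Minor, r.Max.Major, r.Max.Minor, v.Major, v.Minor), nil
	}
	return "", nil
}

func main() {
	r := supportedRange{Min: version{1, 0}, Max: version{1, 2}}
	warning, err := compatibilityWarning("example-class", "v1.3.0", r)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if warning != "" {
		fmt.Println("warning:", warning)
	}
}
```

With this shape the implementation only publishes static data, and the comparison against the installed CRD version moves into the tool, which sidesteps the need for the controller itself to watch CRDs.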
I had a chat with some folks in #sig-apimachinery about this (see https://kubernetes.slack.com/archives/C0EG7JC6T/p1733285894738669), and the consensus was that, while it's important to be careful about watching CRDs, it's okay in Gateway API's case because:
So we probably can do That said, I really like @robscott's idea of a
This still doesn't make sense. Gateway API is a backwards compatible stable API. As an implementation today, I support v1.2, and v1.3 when it comes out, and v1.999 when it comes out. I may not support a new feature in a new version, but that is already a part of the API. Users of an API really probably shouldn't actually worry about the version, they care about features. For example, as a user of
How can this happen? As @howardjohn already said, Gateway API is a backward-compatible API, and this should never happen. An implementation can implement feature X or feature Y, and this is already captured in the
This should not be possible within the standard channel, but is entirely possible within the experimental channel. I would request that we move this discussion over to #3494 since this PR is just about a conformance test and not the overall usefulness of the feature.
What type of PR is this?
/kind test
/area conformance
What this PR does / why we need it:
Adds a test to ensure implementations conform to the GatewayClass SupportedVersion status condition.
Which issue(s) this PR fixes:
Fixes #3367
Does this PR introduce a user-facing change?: