- Release Signoff Checklist
- Summary
- Motivation
- Proposal
- Design Details
- Production Readiness Review Questionnaire
- Implementation History
- Drawbacks
- Alternatives
Items marked with (R) are required prior to targeting to a milestone / release.
- (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
- (R) KEP approvers have approved the KEP status as implementable
- (R) Design details are appropriately documented
- (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- e2e Tests for all Beta API Operations (endpoints)
- (R) Ensure GA e2e tests meet requirements for Conformance Tests
- (R) Minimum Two Week Window for GA e2e tests to prove flake free
- (R) Graduation criteria is in place
- (R) all GA Endpoints must be hit by Conformance Tests
- (R) Production readiness review completed
- (R) Production readiness review approved
- "Implementation History" section is up-to-date for milestone
- User-facing documentation has been created in kubernetes/website, for publication to kubernetes.io
- Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
Modern data model definitions like OpenAPI v3 and protobuf (versions 2 and 3) have a keyword to implement "oneof" or "union". They give APIs better semantics, typically as a way to say "only one of the given fields can be set". We currently have multiple occurrences of these semantics in Kubernetes core types, at least:
- VolumeSource is a structure that holds the definitions of all the possible volume types. Exactly one of them must be set, and it doesn't have a discriminator.
- DeploymentStrategy is a structure that has a discriminator "DeploymentStrategyType" which decides if "RollingUpdate" should be set.
The lack of a standard solution causes several problems:
- The API is implicit, and people don't know how to use it
- Clients can't know how to deal with that, especially if they can't parse the OpenAPI
- Server can't understand the user intent and normalize the object properly
Currently, changing a value in a oneof type is difficult because the semantics are implicit, which means that nothing can be built to automatically fix unions, leading to many bugs and issues:
- kubernetes/kubernetes#35345
- kubernetes/kubernetes#24238
- kubernetes/kubernetes#34292
- kubernetes/kubernetes#6979
- kubernetes/kubernetes#33766
- kubernetes/kubernetes#24198
- kubernetes/kubernetes#60340
And then, for other people:
- rancher/rancher#13584
- helm/charts#12319
- EnMasseProject/enmasse#1974
- helm/charts#11546
- kubernetes/kubernetes#35343
This is replacing a lot of previous work and long-standing effort:
- Initially: kubernetes/community#229, then
- kubernetes/community#278
- kubernetes/community#620
- kubernetes/kubernetes#44597
- kubernetes/kubernetes#50296
- kubernetes/kubernetes#70436
Server-side apply is what enables this proposal to become possible.
Provide a simple mechanism for API authors to label fields of a resource as
members of a oneOf, in order to receive standardized validation and
normalization, rather than having to author it themselves per
resource, as is currently done as a workaround in various validation
functions (e.g. pkg/apis/<group>/validation/validation.go).
- Validation - ensuring only one member field is set (or at most one if desired).
- Normalization - ensuring the API server can understand the intent of clients that are unable to update/modify fields the clients are unaware of due to version skew.
Migrating all existing unions away from their bespoke validation logic (e.g. validation functions) is an explicit non-goal and will be pursued in a separate KEP or later release.
As a CRD owner, I can use simple semantics (such as openapi tags/go markers), to express the desired validation of a oneOf (at most one or exactly one field may be set), and the API server will perform this validation automatically.
As a client, I can read, modify, and update the union fields of an object, even if I am not aware of all of the possible fields, and the server will properly interpret my intent.
- We need to ensure we do not break existing union types. This can be done by not forcing existing unions to conform to the newly proposed union semantics. Integration testing with older types should give us the confidence to be sure we have done so.
- There is a lot of risk for errors when there exists skew between clients and server. In the section on normalization, we discuss mitigating these risks.
We propose that all new unions maintain a "discriminator". This is a field that points to which of the other "union member" fields is to be respected as the truly desired union field in the case that there are any conflicts.
In order to demonstrate the need for the discriminator, we developed an extensive test matrix that looks at various configurations of REST operations performed on a union where the client or server is unaware of a newly added field in the union (due to version skew).
We present a guide doc on how to interpret the test matrix, but the major conclusions are as follows (along with the test case number from the test matrix):
- (Case #22 and #27) If an unstructured client (i.e. a client that represents data as raw JSON maps with no knowledge of the schema) is unaware of a field in the union but wants to clear the union entirely (assuming the union is optional), it will have no way of doing so without a discriminator. With a discriminator, the client can express its intention by setting the discriminator to the empty value, and the server can respect that intention and clear any fields the client is unaware of.
- (Case #12 and #16) If a structured client is unaware of a field in the union that is set and it just wants to echo back the union it received in a get request (such as when updating other parts of the object), a client without a discriminator will silently drop the currently set field, while a client with the discriminator will not change the discriminator value, indicating to the server that no changes are desired in the union.
- (Case #34 and #39) If a client sets a union field that the server is not aware of, the server will silently drop it and attempt to clear the object of the union field. With a discriminator, the server will see the unrecognized discriminator value and can fail loudly.
- (Case #23 and #28) When a client goes to set a field it knows of, but a separate field it doesn't know about is currently set, the server can simply know to always respect the discriminator. Without a discriminator, the server will have to do convoluted logic to detect that the previously set field has not been modified and that only one of the other union fields has been.
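The structured-client case (Case #12 and #16) can be reproduced with nothing more than the standard library. The following is a minimal sketch, assuming a hypothetical OldUnion struct for a client that predates a newly added fieldC; it is an illustration of the skew problem, not code from the proposal.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OldUnion is a hypothetical union as known to an older structured client.
// The client was built before fieldC was added to the union.
type OldUnion struct {
	UnionType string `json:"unionType"`
	FieldA    *int   `json:"fieldA,omitempty"`
	FieldB    *int   `json:"fieldB,omitempty"`
}

// RoundTrip decodes a server response into the client's struct and re-encodes
// it, as a structured client does when echoing an object back in an update.
func RoundTrip(serverJSON []byte) ([]byte, error) {
	var u OldUnion
	if err := json.Unmarshal(serverJSON, &u); err != nil {
		return nil, err
	}
	return json.Marshal(u)
}

func main() {
	// The server has fieldC set, which this client does not know about.
	out, err := RoundTrip([]byte(`{"unionType":"FieldC","fieldC":3}`))
	if err != nil {
		panic(err)
	}
	// The unknown member is dropped, but the discriminator is echoed back
	// unchanged, which tells the server that no change to the union is intended.
	fmt.Println(string(out)) // prints {"unionType":"FieldC"}
}
```

Without the discriminator, the echoed object would be indistinguishable from a request to clear the union.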
We propose new marker tags for Go types (in-tree types, and also kubebuilder types):
- // +unionDiscriminator before a field means that this field is the discriminator for the union. This field MUST be an enum defined as a string (see the section on discriminator values). This field MUST be required if there is no default option, omitempty if the default option is the empty string, or optional and omitempty if a default value is specified with the // +default marker.
- // +unionMember[=<memberName>][,optional] before a field means that this field is a member of a union. <memberName> is the name that will be set as the discriminator value to select this field. It MUST correspond to one of the valid enum values of the discriminator's enum type. It defaults to the Go (i.e. CamelCase) representation of the field name if not specified. <memberName> should only be set if authors want to customize how the fields are represented in the discriminator field, and it should match the serialized JSON name of the field case-insensitively. The comma-separated optional value determines whether the member field must be set when the discriminator selects it: when optional is present on a field, the discriminator can select the field even if the field is not set. If optional is not present in the unionMember tag, the object will fail validation if the discriminator selects the field but it is nil. A field can be marked as optional without specifying memberName via // +unionMember,optional.
Here we present a description of how discriminators and their valid values should be defined.
As described above, the discriminator field must be a string and required.
Because there are only a few specific values that the discriminator can take, we
propose that all discriminators be defined as an enum and tagged
as such via the enum Go marker // +enum.
If no option is a valid option for a union, such an option must be defined as a member of the discriminator values enum. By convention, this "no member" discriminator should be the empty string, but there is nothing stopping API authors from defining their own "no option" discriminator value.
In some cases there are more discriminator values than there are member fields
defined in the struct, because a specific member requires no configuration. An
example is DeploymentStrategy, which has one member field, rollingUpdate,
but two valid discriminator values, RollingUpdate and Recreate. By using an
enum as the discriminator value we are able to define values beyond the member
fields in order to accommodate this pattern.
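As a rough illustration of the "empty union member" pattern, here is a simplified sketch of a DeploymentStrategy-like type carrying the proposed markers. The names Strategy, StrategyType, and RollingUpdateConfig are hypothetical stand-ins, not the real Kubernetes definitions, and the markers have no effect outside a generator that understands them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StrategyType is the discriminator enum. It has more values than the
// struct has member fields: Recreate needs no configuration.
// +enum
type StrategyType string

const (
	RollingUpdateStrategy StrategyType = "RollingUpdate"
	RecreateStrategy      StrategyType = "Recreate"
)

// RollingUpdateConfig is a hypothetical, trimmed-down member type.
type RollingUpdateConfig struct {
	MaxSurge string `json:"maxSurge,omitempty"`
}

// Strategy is a union with one member field and two discriminator values.
type Strategy struct {
	// +unionDiscriminator
	// +required
	Type StrategyType `json:"type"`

	// +unionMember,optional
	// +optional
	RollingUpdate *RollingUpdateConfig `json:"rollingUpdate,omitempty"`
}

// encode serializes a Strategy to JSON for display.
func encode(s Strategy) string {
	b, err := json.Marshal(s)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Recreate is a valid discriminator value with no member field.
	fmt.Println(encode(Strategy{Type: RecreateStrategy}))
	// RollingUpdate selects the rollingUpdate member.
	fmt.Println(encode(Strategy{
		Type:          RollingUpdateStrategy,
		RollingUpdate: &RollingUpdateConfig{MaxSurge: "25%"},
	}))
}
```

The member name defaults to the Go field name (RollingUpdate), which matches the enum value, so no explicit <memberName> is needed in this sketch.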
Below is an example of how to define a union based on the above design:
// +enum
type UnionType string

const (
	FieldA    UnionType = "FieldA"
	FieldB    UnionType = "FieldB"
	FieldC    UnionType = "FieldC"
	FieldD    UnionType = "FieldD"
	FieldNone UnionType = ""
)

type Union struct {
	// +unionDiscriminator
	// +required
	UnionType UnionType

	// +unionMember
	// +optional
	FieldA int

	// +unionMember
	// +optional
	FieldB int
}
Note that unions can't span multiple Go structures (all the fields that are part of a union have to be together in the same structure). Examples of what is allowed:
// This will have one embedded union.
type TopLevelUnion struct {
	Name  string `json:"name"`
	Union `json:",inline"`
}

// +enum
type UnionType string

const (
	FieldA    UnionType = "FieldA"
	FieldB    UnionType = "FieldB"
	FieldC    UnionType = "FieldC"
	FieldD    UnionType = "FieldD"
	FieldNone UnionType = ""
)

// This will generate one union, with two fields and a discriminator.
type Union struct {
	// +unionDiscriminator
	// +required
	UnionType UnionType `json:"unionType"`

	// +unionMember
	// +optional
	FieldA int `json:"fieldA"`

	// +unionMember,optional
	// +optional
	FieldB int `json:"fieldB"`
}

// +enum
type Union2Type string

const (
	Alpha Union2Type = "ALPHA"
	Beta  Union2Type = "BETA"
)

// This generates a union where the unionMember markers demonstrate how to
// customize the names used for each field in the discriminator.
type Union2 struct {
	// +unionDiscriminator
	// +required
	Type2 Union2Type `json:"type"`

	// +unionMember=ALPHA
	// +optional
	Alpha int `json:"alpha"`

	// +unionMember=BETA,optional
	// +optional
	Beta int `json:"beta"`
}
OpenAPI v3 already allows a "oneOf" form, which is accepted by CRD validation (and will continue to be accepted in the future). That oneOf form will be used for validation, but is "on-top" of this proposal.
A new extension is created in the OpenAPI schema to describe the behavior: x-kubernetes-unions.
This is a list of unions that are part of this structure/object. Each item in the list represents a discriminator for the union, and for each member field, the discriminator value of that field and whether or not that field is optional.
Conversion between OpenAPI v2 and OpenAPI v3 will preserve these fields.
The following is an example of a Go type for which an OpenAPI definition will be generated.
// +enum
type Union1Type string

const (
	FieldA    Union1Type = "FieldA"
	FieldB    Union1Type = "FieldB"
	FieldC    Union1Type = "FieldC"
	FieldD    Union1Type = "FieldD"
	FieldNone Union1Type = ""
)

// This will generate one union, with two fields and a discriminator.
type Union struct {
	// +unionDiscriminator
	// +required
	Union1 Union1Type `json:"union1"`

	// +unionMember
	// +unionDiscriminatedBy=Union1
	// +optional
	FieldA int `json:"fieldA"`

	// +unionMember,optional
	// +unionDiscriminatedBy=Union1
	// +optional
	FieldB int `json:"fieldB"`
}
The OpenAPI x-kubernetes-unions extension will then be attached to the discriminator's property and deserialized into go structs as follows:
// XKubernetesUnions is the top level extension.
type XKubernetesUnions struct {
	// FieldMembers is the mapping of all valid discriminator values in a
	// union to the corresponding member field.
	// The discriminator value is the value to which the discriminator is set
	// in order to indicate that a given member field is the currently set member
	// of the union.
	// MemberField may be nil in the case of empty union members where a valid
	// discriminator value has no corresponding member field.
	FieldMembers map[DiscriminatorValue]*MemberField `json:"fieldMembers"`
}

// DiscriminatorValue is the value that the discriminator is set to
// in order to indicate the selection of a union member.
type DiscriminatorValue string

// MemberField describes a single member field of a union.
type MemberField struct {
	// Name is the name of the field corresponding to the member.
	// It will be the json representation of the field marked with the
	// `// +unionMember` marker in the go type.
	Name string `json:"name"`
	// Optional determines whether the discriminator _may_ select this member
	// even when the member field is empty. Optional defaults to false.
	Optional bool `json:"optional"`
}
Normalization refers to the process by which the API server attempts to understand and correct clients which may provide the server with conflicting or incomplete information about a union in update or patch requests.
Issues primarily arise here because of version skew between a client and a server, such as when a client is unaware of new fields added to a union and thus doesn't know how to clear these new fields when trying to set a different field.
For unions that follow this design, normalization is simple: the server should always respect the discriminator.
This means that when the server receives an update request with the discriminator set to a given field and multiple member fields set, it should clear all fields except the one pointed to by the discriminator, if and only if the discriminator has been modified. Having multiple fields set while the discriminator is not modified is invalid and is caught later by the validation step (see below).
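The rule can be sketched over unstructured objects as follows. This is a minimal illustration, not the structured-merge-diff implementation: the normalize function, its signature, and the members mapping (discriminator value to member field JSON name, with an empty name for values that have no member field) are all assumptions made for the example.

```go
package main

import "fmt"

// normalize is a hypothetical helper that applies the "always respect the
// discriminator" rule to newObj in place. members maps each valid
// discriminator value to the JSON name of its member field ("" if none).
func normalize(oldObj, newObj map[string]interface{}, discriminator string, members map[string]string) {
	oldVal, _ := oldObj[discriminator].(string)
	newVal, _ := newObj[discriminator].(string)
	if oldVal == newVal {
		// Discriminator not modified: change nothing. Conflicting member
		// fields are left for the validation step to report.
		return
	}
	selected, known := members[newVal]
	if !known {
		// Unrecognized discriminator value: leave it for validation to
		// fail loudly rather than silently clearing fields.
		return
	}
	for _, field := range members {
		if field != "" && field != selected {
			delete(newObj, field)
		}
	}
}

func main() {
	members := map[string]string{"FieldA": "fieldA", "FieldB": "fieldB", "": ""}
	oldObj := map[string]interface{}{"unionType": "FieldA", "fieldA": 1}
	// A client switches to FieldB without clearing fieldA.
	newObj := map[string]interface{}{"unionType": "FieldB", "fieldA": 1, "fieldB": 2}
	normalize(oldObj, newObj, "unionType", members)
	fmt.Println(newObj) // prints map[fieldB:2 unionType:FieldB]
}
```

Returning early when the discriminator is unchanged is what lets the later validation step distinguish a deliberate switch from a conflicting update.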
For both custom resources and built-in types, we expect union normalization to be called by the request handlers shortly after mutating admission occurs.
Objects must be validated AFTER the normalization process.
Some validation situations specific to unions are:
- When multiple union fields are set and the discriminator has not been modified we should error loudly that the client must change the discriminator if it changes any union member fields.
- When the server receives a request with a discriminator set to a given field, but that given field is empty and not marked as optional, the server should fail with a clear error message. Note this does not apply to discriminator values that do not correspond to any field (as in the "empty union members case").
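These checks can be sketched as a single function that runs on the already normalized object. The validateUnion function and the member type are hypothetical names for illustration, and representing the object as a raw JSON map is an assumption made to keep the example small.

```go
package main

import "fmt"

// member describes what a discriminator value selects. field is the JSON
// name of the member field, or "" when the value has no member field.
type member struct {
	field    string
	optional bool
}

// validateUnion is a hypothetical sketch of union validation. It assumes
// normalization has already run, so any non-selected member that is still
// set means the discriminator was not changed along with the members.
func validateUnion(obj map[string]interface{}, discriminator string, members map[string]member) error {
	value, _ := obj[discriminator].(string)
	selected, known := members[value]
	if !known {
		return fmt.Errorf("unrecognized discriminator value %q", value)
	}
	for v, m := range members {
		if m.field == "" || v == value {
			continue
		}
		if _, set := obj[m.field]; set {
			return fmt.Errorf("field %q is set but %q is %q; change the discriminator when changing union members",
				m.field, discriminator, value)
		}
	}
	if selected.field != "" && !selected.optional {
		if _, set := obj[selected.field]; !set {
			return fmt.Errorf("discriminator selects %q but the field is not set", selected.field)
		}
	}
	return nil
}

func main() {
	members := map[string]member{
		"FieldA": {field: "fieldA"},
		"FieldB": {field: "fieldB", optional: true},
		"":       {}, // "no member" value
	}
	fmt.Println(validateUnion(map[string]interface{}{"unionType": "FieldA", "fieldA": 1}, "unionType", members))
	fmt.Println(validateUnion(map[string]interface{}{"unionType": "FieldA"}, "unionType", members))
}
```

Discriminator values with an empty field name pass the final check unconditionally, which covers the "empty union members" case.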
For both custom resources and built-in types, validation will occur as part of the request validation, before validating admission occurs.
For custom resources, union validation will be done at the same point as the existing structural schema validation that occurs in the custom resource handler. This ensures that any generic validation changes made to all custom resources (such as the ratcheting validation discussed below), behaves appropriately with union validation.
When updating CRDs to support union validation, it is possible that existing CRs become invalid.
The naive solution is to require existing CRs to be updated to a valid state before they can be updated again.
This creates many potential landmines, and so ratcheting validation is proposed as an alternative. Ratcheting validation means that objects will ignore stricter validation rules if and only if the existing object also fails the stricter validation for the same reason.
Ratcheting validation for custom resources is a separate effort proposed outside of this unions effort. For the initial alpha graduation of unions, we do not propose supporting ratcheting validation. We will require all invalid CRs to be made valid before they can be updated (the naive solution).
In order to potentially support ratcheting validation in the future, we will ensure that all callers of union validation retain access to both old and new objects, so that future ratcheting validation can be implemented within the union validation library.
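For illustration only, ratcheting could later be layered on top of a validator that receives both objects, roughly as sketched here. The ratchet and atMostOne functions are hypothetical, and comparing error messages is just one assumed way of deciding that two failures have "the same reason"; none of this is proposed for alpha.

```go
package main

import (
	"errors"
	"fmt"
)

type object = map[string]interface{}

// atMostOne is a stand-in for union validation: it rejects objects that
// have both hypothetical member fields set.
func atMostOne(obj object) error {
	_, a := obj["fieldA"]
	_, b := obj["fieldB"]
	if a && b {
		return errors.New("fieldA and fieldB must not both be set")
	}
	return nil
}

// ratchet tolerates a validation failure on newObj only when oldObj already
// fails validation for the same reason (here: an identical error message).
func ratchet(oldObj, newObj object, validate func(object) error) error {
	newErr := validate(newObj)
	if newErr == nil {
		return nil
	}
	oldErr := validate(oldObj)
	if oldErr != nil && oldErr.Error() == newErr.Error() {
		return nil
	}
	return newErr
}

func main() {
	invalid := object{"fieldA": 1, "fieldB": 2}
	valid := object{"fieldA": 1}
	// Already invalid before the update: tolerated under ratcheting.
	fmt.Println(ratchet(invalid, invalid, atMostOne) == nil) // prints true
	// Newly invalid: still rejected.
	fmt.Println(ratchet(valid, invalid, atMostOne) == nil) // prints false
}
```

The sketch shows why callers need to retain the old object: without it, the two cases in main cannot be told apart.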
As mentioned, one of the goals is to migrate at least one existing union to using the new marker based union validation and normalization. While open questions remain around the priority and urgency of migrating existing unions, nonetheless we should be able to come to a consensus on which types to migrate first.
For discriminated unions, a couple of relatively straightforward discriminated types are
MetricSpec and MetricStatus. These have clearly defined discriminator values
that map one-to-one to a member field, which makes them good candidates for
initial migration.
For non-discriminated unions, there are a few relatively straightforward types
that make good candidates for initial migration, such as ContainerState.
Until migrated, union types without a discriminator (i.e. only existing unions that have not been migrated to the current design) cannot be tagged with the Go markers described above and thus will not be treated as "unions" in the sense of the currently proposed normalization and validation logic.
These legacy unions must continue to perform normalization and validation manually, per resource in the validation functions.
[x] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.
Core functionality will be extensively unit tested in the SMD typed package (union_test.go).
Parts of the kubernetes endpoints handlers package that are modified to call into the SMD code will also be unit tested as appropriate.
We will have extensive integration testing of the union code in the
test/integration/apiserver
package.
We will be testing along the dimensions of:
- Which fields of the union get modified (none, existing fields, newly updated fields)
- Type of union (discriminated vs non-discriminated)
- Whether the client is aware of all the fields
- Whether the server is aware of all fields
- Whether the union is optional or required
A fully documented test matrix exists in a google spreadsheet along with a guide doc on how to read and understand the test matrix.
As part of implementing the test matrix, we will be able to prove the viability of upgrading existing unions by writing tests that mimic using the standardized union semantics on existing unions (even if actually upgrading these unions is outside the scope of alpha graduation).
We are considering adding kubectl e2e tests to mimic kubectl users performing various operations on objects with union fields.
- CRDs can be created with union fields and are properly validated when created/updated.
- Prove the viability of upgrading existing unions to the new semantics by mimicking existing unions in e2e tests.
- Existing unions that don't have discriminators do not break when upgraded.
Turning the flag on for alpha just enables different runtime codepaths (i.e. performing the unified union validation and normalization).
Any schema markers (added by CRD authors or propagated from tags on built-in types) will appear in the schema, but not do anything if the flag is off.
See test matrix and commentary about discriminators. It clearly documents how the server will use the discriminator to understand the client's intention even if the client is not aware of all union fields because of version skew.
Skew with alpha flag on/off shouldn't make much of a difference.
- Objects created with the union semantics, but applied to a cluster with the alpha flag off will simply not perform union validation and normalization.
- Objects created without union semantics will simply not trigger union validation and normalization (regardless of whether the server has the alpha enabled or disabled).
- Feature gate (also fill in values in kep.yaml)
  - Feature gate name: APIUnions
  - Components depending on the feature gate: kube-apiserver
Request handlers in the API server will call into the union validation and normalization functions from the structured-merge-diff repo when the feature is enabled.
Enabling the feature could cause existing CRs to fail validation if the corresponding CRD has union fields and the existing CRs have invalid unions that went unvalidated when initially created in a cluster that had the unions feature disabled.
These CRs will need to be corrected in order to pass validation (or the feature disabled).
Yes, requests will simply skip union validation and normalization.
Custom resources that were skipping union validation while the feature was rolled back may have allowed invalid data to persist.
For alpha, we require that all modifying requests (update/patch) fail unless the data passes union validation. Retrieving newly invalid CRs should still always succeed.
In the future, we may require looser "ratcheting validation" which would allow modifications to ignore union validation if the existing object fails the union validation for the same reason as the new object (see section on "Ratcheting Validation" above). This is not a priority for alpha.
We will have integration tests demonstrating how CRs with persisted invalid data will need to be corrected when the feature is re-enabled (and requires more strict union validation).
N/A
apiserver_request_total could be watched to see if the number of failing create and update requests increases substantially.
N/A
Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
N/A
For builtins in alpha, it won't be possible to break clients since turning on vs off will validate the same thing via different code paths.
For CRDs, you can see if they have the new union markers. If the CRD has no other validation mechanism, turning off the flag may result in CRs accepting invalid input.
- Create a new CRD with a union field (and no other validation mechanism)
- Apply the CRD
- Create a CR with an invalid union (multiple fields set, no discriminator set), see if the CR is rejected via union validation
When we write the e2e test, a standard union CRD and test CR will be obtainable for users to test on their instance.
N/A
What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
N/A
Are there any missing metrics that would be useful to have to improve observability of this feature?
N/A
N/A
N/A
No
No
No
No
Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
In GA (maybe beta), we might expect resource reduction/reliability improvement, since this removes a need for webhooks.
Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
New validation and normalization logic should be negligible given that the functions will be in the same SMD path currently used by SSA code.
We will have benchmarking to validate this assumption.
N/A objects are not reachable
- Unions implemented, but disabled in SMD.
An issue that one might have with requiring a discriminator is that it might seem redundant to have to set a field and set another field indicating to the server to use the set field. The reasons for doing so are discussed above in the normalization section.
One other drawback is that our approach does not standardize all existing unions into a single format. We don't see a way to do so without drastically changing existing APIs and breaking backwards compatibility.
The primary alternative discussed is to not have a discriminator for new union types. As discussed in the normalization section, requiring a discriminator allows the server to better understand the intentions of clients that do not have knowledge of all the fields in a union if newer versions of the server add new fields to the union.
A number of strategies were discussed around how to represent the "none" value of the discriminator (see "Discriminator Values" section above).
- One alternative was to mandate that the "none" value always be the empty string. The advantage of this is its simplicity: it avoids a situation where different API authors define their "none" value differently, so anyone could immediately know that a discriminator set to "" (empty string) is not selecting any of the member fields. It would also allow us to not have to define the set of enum values for each discriminator (as we could just use the name of the member field). The disadvantage is that by not defining the set of enum values, we make it impossible to support the "empty union members" case.
- Another alternative was to make the discriminator a pointer to a string and its value nil. The disadvantage here is that this requires more complicated union validation logic (first do a nil check, then check the value) and makes it harder to determine client intent on patches where the discriminator is not set.
- A third alternative is to require all unions be defined in their own separate struct. This was rejected because there are many existing unions that define random fields that are not members in the union within the same struct as fields that do make up the union and we hope to be able to migrate at least some of the existing unions to the new semantics.