68 changes: 52 additions & 16 deletions docs/hugo/content/design/ADR-2025-05-Version-Priority/_index.md
Expand Up @@ -7,11 +7,11 @@ toc_hide: true

As initially identified in [#4147](https://github.com/Azure/azure-service-operator/issues/4147), we have a problem with the way we version resources.

If a specific resource isn't specified, Kubernetes will automatically select one for use through a process called [_version priority_](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/#version-priority).

Unfortunately, the way we are currently constructing our versions isn't playing well with this algorithm.

When ASO constructs a resource version number, it uses a constant prefix followed by the Azure api-version of the resource.

Beta releases of ASO used the constant prefix `v1beta` giving rise to resource versions such as:

Expand All @@ -23,14 +23,14 @@ These are compliant with how Kubernetes defines versions, and the system would c
When we went GA, we changed the prefix to `v1api` (removing the `beta` tag), giving resource versions such as:

* `v1api20210201`
* `v1api20240201`
* `v1api20210201preview`

While `beta` is in the list of identifiers permitted in Kubernetes versions, `api` is not, so we've inadvertently ended up with version numbers that are not compliant with Kubernetes rules.
However, while `beta` is in the list of identifiers permitted in Kubernetes versions, `api` **is not**, so we've inadvertently ended up with version numbers that are not compliant with Kubernetes rules.

The Kubernetes _version priority_ algorithm gives priority to compliant versions, and sorts the remaining versions alphabetically.

With all `v1api` versions considered non-compliant, they're sorted alphabetically and then the first one selected - resulting in automatic selection of the *oldest version*, not the newest.
With all `v1api` versions considered non-compliant, they're sorted alphabetically and then the first one selected - resulting in automatic selection of the _oldest version_, not the newest.
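The ordering described above can be reproduced with a small sketch. The Python below is illustrative only: it assumes the rules given in the Kubernetes version priority documentation (compliant versions first, stable ahead of beta ahead of alpha, higher numbers first, and everything else sorted alphabetically). The `priority_key` helper and the two `v1beta` sample versions are hypothetical, chosen for this example; they are not part of ASO or Kubernetes.

```python
import re

# Assumed shape of a compliant Kubernetes version: "v", a major number,
# then optionally "alpha" or "beta" followed by a minor number.
KUBE_VERSION = re.compile(r"v(\d+)(?:(alpha|beta)(\d+))?")

# Higher value means more stable, which sorts earlier.
STABILITY = {"alpha": 0, "beta": 1, None: 2}


def priority_key(version):
    """Sort key approximating Kubernetes version priority (hypothetical helper)."""
    match = KUBE_VERSION.fullmatch(version)
    if match is None:
        # Non-compliant: placed after all compliant versions, in alphabetical order.
        return (1, version)
    major, level, minor = match.groups()
    # Compliant: most stable first, then highest major, then highest minor.
    return (0, -STABILITY[level], -int(major), -int(minor or 0))


beta_versions = ["v1beta20210201", "v1beta20240201"]
api_versions = ["v1api20210201", "v1api20240201", "v1api20210201preview"]

beta_sorted = sorted(beta_versions, key=priority_key)
api_sorted = sorted(api_versions, key=priority_key)

print(beta_sorted[0])  # v1beta20240201 (newest wins)
print(api_sorted[0])   # v1api20210201 (oldest wins)
```

Under these assumptions the `v1beta` versions are ranked by their numeric suffix, so the newest comes first, while the `v1api` versions fall through to plain string ordering and the oldest comes first.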

## Requirements

Expand All @@ -51,20 +51,23 @@ Leave things the way they are.
* We've already had a couple cases where version priority caused issues for users, so we know this problem isn't going away.
* As we add more resource versions, the scope for this occurring can only increase.

## Option 2: Cut over to versioning format
## Option 2: Change all resources to use a new versioning format

Change all generated API versions to be compliant with Kubernetes version rules.

### Pros

* Version priority will work as users expect.

### Cons

* Would be breaking for all existing users.
* Would be breaking for all existing users as their existing custom resources would no longer be valid.

## Option 3: Use a new version format for new resources only

## Option 3: Migrate to new versioning format over time
Introduce a new version format as of a particular release of ASO, leaving existing resources with the existing format.

Any resources introduced with older versions of ASO would continue to use the existing format, ensuring existing resources would continue to work, but new resources (and new versions of existing resources) would use the new format.

### Pros

Expand All @@ -75,21 +78,54 @@ Introduce a new version format as from a particular release of ASO, leaving exis

* Still allows the problem to occur (but prevents it from getting worse)

## Version format
## Option 4: Introduce a new version format for all resources with old version duplicates for existing resources

Reuse the approach we took as we migrated from `v1alpha1` to `v1beta` and from `v1beta` to `v1api`, where we introduce a new version format for all resources, but also keep the old versions around for compatibility reasons.

After several releases of ASO, we'd remove the old versions, as we did with `v1alpha1` and `v1beta`.

### Pros

* Migrates all resources to the new version format, so version priority will work as users expect for all resources.
* Mitigates the breaking change of Option 2 by keeping the old versions around for a while.

### Cons

* Will eventually be a breaking change for some users, but only for those who have not migrated to the new version format in the interim.
* May not be possible for some resources, as the duplication may result in the aggregate CRD being too large. (This is definitely the case for `ManagedCluster` which is already near the maximum size for a CRD, and would not be able to accommodate the duplication of all the existing versions.)

## Which new version format?

A simple change to the format we use for resource versions would be to simplify the prefix used to just `v`, giving resource versions like this:

* `v20210201`
Member

Another option is to do v<date>1 or v1<date> (we'd probably want v1<date> so that updated ASO schemas result in those versions all "winning"), which results in a somewhat awkward looking number but maintains all of the ordering guarantees we want.

Though, I suppose the argument could be made that if we ever want to make a big change like that, we could still do it and just introduce v1<date> at that time, and since 1<date> is larger than <date> it'll take precedence, and avoids us needing to commit to the awkward 1 now, only if we ever end up needing it?

I think I'm fine with that as the approach but it would be good to call out some of that logic in the ADR for future-us.

Member Author

Added, including discussion on why it's not a good idea.

* `v20240201`
* `v20210201preview`

Unfortunately, multi-part versions aren't supported for Kubernetes custom resources, so we *can't* lean into the date-based scheme used for Azure api-versions and use `.` separators as follows:
Unfortunately, multi-part versions aren't supported for Kubernetes custom resources, so we _cannot_ lean into the date-based scheme used for Azure api-versions and use `.` separators:

* `v2021.02.01`
* `v2024.02.01`
* `v2021.02.01.preview`
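As a rough check of which candidate formats would count as compliant, the sketch below tests each one against a pattern that is assumed to match the Kubernetes version rules ("v", a number, then optionally `alpha` or `beta` and another number). The `is_compliant` helper is hypothetical and exists only for this illustration.

```python
import re

# Assumed pattern for a compliant Kubernetes version string.
KUBE_VERSION = re.compile(r"v(\d+)(?:(alpha|beta)(\d+))?")


def is_compliant(version):
    """Return True if the whole string matches the assumed pattern."""
    return KUBE_VERSION.fullmatch(version) is not None


candidates = [
    "v1api20240201",     # current GA format
    "v20240201",         # proposed format with just the `v` prefix
    "v20210201preview",  # proposed format for a preview api-version
    "v2024.02.01",       # dotted format, not usable for custom resources
]

results = {version: is_compliant(version) for version in candidates}
for version, ok in results.items():
    print(f"{version}: {'compliant' if ok else 'non-compliant'}")
```

If the assumed pattern is accurate, only `v20240201` matches. The dotted form is rejected, and the `preview` suffix also prevents a match, so preview versions would still be ordered alphabetically after the compliant ones.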

# Status
If we wanted to leave the door open for making significant breaking changes to the generated resources, we could consider adding a numerical prefix or suffix to the version, such as:

* `v120210201`
* `v120240201`
* `v120210201preview`

or (using a suffix):

* `v202102011`
* `v202402011`
* `v20210201preview1`

This would seem to be quite awkward, difficult for users to understand, and potentially for zero benefit. Maybe YAGNI applies.
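The arithmetic behind the prefix and suffix variants can be sketched as follows, assuming Kubernetes compares the digits after `v` as a single number. The `numeric_part` helper and the revision `2` versions are hypothetical, introduced only to show how the two layouts differ.

```python
def numeric_part(version):
    """Return the digits after the leading 'v' as an integer (hypothetical helper)."""
    return int(version[1:])


plain = numeric_part("v20240201")      # proposed format, 8 digits
prefixed = numeric_part("v120210201")  # revision digit before the date
suffixed = numeric_part("v202102011")  # revision digit after the date

# Either 9-digit form outranks any 8-digit plain version, so a revision
# digit could be introduced later and still take precedence.
assert prefixed > plain
assert suffixed > plain

# With a prefix, the revision dominates: revision 2 of an old api-version
# outranks revision 1 of a newer api-version.
prefix_rev2_old = numeric_part("v220210201")
prefix_rev1_new = numeric_part("v120240201")
assert prefix_rev2_old > prefix_rev1_new

# With a suffix, the date dominates: revision 2 of an old api-version
# still ranks below revision 1 of a newer api-version.
suffix_rev2_old = numeric_part("v202102012")
suffix_rev1_new = numeric_part("v202402011")
assert suffix_rev2_old < suffix_rev1_new
```

This is the reasoning in the review thread: because a 9-digit number is always larger than an 8-digit one, adopting plain `v<date>` now does not rule out adding a revision digit later.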

Recommendation: Option 3 with the proposed version change.
## Status

Recommendation, _either_:

* Option 3 with the proposed version change to use just `v` as the prefix, _or_
* Option 4 with the proposed version change to use just `v` as the prefix, but with the understanding that some resources may not be able to accommodate the duplication of existing versions.