Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cosi v1alpha2 changes #4599

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

BlaineEXE
Copy link

@BlaineEXE BlaineEXE commented Apr 25, 2024

  • One-line PR description: begin v1alpha2 doc - initial work is fixing old errors without adding large features
  • Other comments:

Update the COSI KEP's base design to v1alpha2. The primary purpose of
this work is to reshape the doc and spec to fix inconsistencies and
vestigial items, and fix the most glaring issues in the v1alpha1 spec.

COSI maintainers are planning follow-up items to add larger individual
features after this.

Notably, bucketInfo.json has been changed to individual secret fields
with COSI_: format, as the JSON blob was flagged as a
problem by several v1alpha1 users.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 25, 2024
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 25, 2024
@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch 3 times, most recently from d22d1bd to 511d6cc Compare April 25, 2024 22:01
@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch from 511d6cc to 5aa494b Compare April 25, 2024 23:26
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 25, 2024
@BlaineEXE BlaineEXE marked this pull request as draft April 25, 2024 23:27
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 25, 2024
@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch from 5aa494b to 52275b7 Compare April 25, 2024 23:40
@pacoxu pacoxu mentioned this pull request Jul 25, 2024
8 tasks
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 23, 2024
@xing-yang
Copy link
Contributor

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 23, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: BlaineEXE
Once this PR has been reviewed and has the lgtm label, please assign saad-ali for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Comment on lines 274 to 277
The mechanism by which the Controller waits for the intermediate Bucket to be provisioned in the
middle of BucketClaim reconciliation is not specified here. The important behavior (waiting) is
defined, and the logic coordinating the init-wait-finish reconcile process is left as an
implementation detail.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably better to define high-level behavior expectations here, rather than pigeonhole ourselves into a specific implementation solution.

@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch from f55b3d3 to 11d78f4 Compare February 24, 2025 23:10
@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch from 9098c29 to 14b42fa Compare March 11, 2025 22:11

COSI is out-of-tree, so version skew strategy is N/A

## Alternatives Considered
Copy link
Author

@BlaineEXE BlaineEXE Mar 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section contains sections that describe design details that we have discarded, including many changes between v1alpha1 and v1alpha2. I have taken care to try to make this a good effective overview and discussion of any complex points.

@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch from 14b42fa to 8c2312f Compare March 21, 2025 22:20
@BlaineEXE BlaineEXE marked this pull request as ready for review March 21, 2025 22:20
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 21, 2025
Comment on lines +507 to 515
// Protocols is a list of protocols that the provisioned Bucket must support. COSI will verify
// that each item in this list is advertised as supported by the OSP driver.
Protocols []Protocol
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I listened to @saad-ali 's response in the last sig-storage meeting to whether this is needed. One of the important things I heard was that the intent is that drivers can "advertise" what they support.

Reading more into Saad's reply, I can see how this field might be important for portability in some ways. Importantly, if I were to copy a BucketClaim spec to a new/different K8s cluster with different COSI driver, the presence of Protocols here should help me feel certain that COSI will return an error if the ported copy is incompatible with the new k8s cluster.

Something that wasn't present in the v1alpha1 spec (or implementation) was a way for drivers to "advertise" anything. Drivers were expected to error out if they couldn't support all protocols, but I don't consider this "advertising" things. And certainly, any driver could decide to ignore the list when provisioning if they wanted to.

Thus, for the v1alpha2 spec, I added a gRPC return value for DriverGetInfo for drivers to advertise their supported protocols. I also added the note here that we expect COSI itself (I think the Sidecar) to validate that what the user is requesting in Parameters is advertised as supported by the driver.

If my analysis and understanding of Saad's feedback is correct, I think this is good to keep (along with the ability for to drivers advertise support).


As an alternative to the input check, we originally planned v1alpha2 to only have the BucketClaim.status.supportedProtocols list, which would inform users what protocols the Bucket supports, but only after the BucketClaim/Bucket are done being provisioned. It wouldn't guarantee an error in ported BucketClaim provisioning if the 'new' cluster didn't support the same protocols.

This alternative would easier to implement because it's less complex. However, it wouldn't allow them to state their intentions before provisioning happens.

CC: @xing-yang @shanduur

@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch from 8c2312f to e4f1a94 Compare March 21, 2025 22:38

COSI Sidecar should not have Bucket delete permissions.

#### Generating Bucket Access Credentials
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we are trying to keep the Sidecar responsibilities (and RBAC) as small as possible (following the design of Volume Snapshotter), we needed significant changes to how BucketAccesses are provisioned.

TL;DR: the controller gets info from BucketAccessClass and BucketClaim and copies it to the BucketAccess.status, after which the sidecar can do its thing. Sidecar no longer does Class/Claim inspection.

Comment on lines +655 to +659
// Protocol is the name of the Protocol that this access credential is expected to support.
// +required
Protocol Protocol
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Protocols is important for BucketClaims/Buckets it stands to reason (to me) that Protocol is important for BucketAccess also.

@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch from e4f1a94 to da319fe Compare March 21, 2025 22:46
Comment on lines 147 to 149
### Important changes between versions

NOTE:
- The secret will not be created until the credentials are generated/service account mappings are complete.
- Within a namespace, one BucketAccess and secret pair is generally sufficient, but cases which may want to control this more granularly can use multiple.
- The secret will be created with a finalizer that prevents it from being deleted until the associated bucketAccess is deleted.
v1alpha1 to v1alpha2:
Copy link
Author

@BlaineEXE BlaineEXE Mar 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section has a quick overview of important changes from v1alph1 to v1alpha2.

Update the COSI KEP's base design to v1alpha2. The primary purpose of
this work is to reshape the doc and spec to fix inconsistencies and
vestigial items, and fix the most glaring issues in the v1alpha1 spec.

COSI maintainers are planning follow-up items to add larger individual
features after this.

Notably, bucketInfo.json has been changed to individual secret fields
with COSI_<KEY>: <VALUE> format, as the JSON blob was flagged as a
problem by several v1alpha1 users.

Signed-off-by: Blaine Gardner <[email protected]>
@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch from e17381a to 586add2 Compare March 27, 2025 17:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants