KEP-4958: CSI Sidecars All In One

Release Signoff Checklist

Items marked with (R) are required prior to targeting to a milestone / release.

  • [ ] (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
  • [ ] (R) KEP approvers have approved the KEP status as implementable
  • [ ] (R) Design details are appropriately documented
  • [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
    • [ ] e2e Tests for all Beta API Operations (endpoints)
    • [ ] (R) Ensure GA e2e tests meet requirements for Conformance Tests
    • [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
  • [ ] (R) Graduation criteria is in place
  • [ ] (R) Production readiness review completed
  • [ ] (R) Production readiness review approved
  • [ ] "Implementation History" section is up-to-date for milestone
  • [ ] User-facing documentation has been created in kubernetes/website, for publication to kubernetes.io
  • [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

Summary

We propose to combine the source code of the CSI Sidecars in a monorepo. Instead of just putting the code repositories together, the program entry points of all the sidecars are consolidated, therefore we can:

  • Improve the CSI Sidecar release process by reducing the number of components released
  • Decrease the maintenance tasks the SIG Storage community maintainers do to maintain the Sidecars
  • Propagate changes in common libraries used by the CSI Sidecars immediately instead of through additional PRs
  • Reduce the number of components CSI Driver authors and cluster administrators need to keep up to date in k8s clusters

As a side effect of combining the CSI Sidecars into a single component we also:

  • Reduce the memory usage and the number of API server calls made by the CSI Sidecars through the use of a shared informer (a sketch follows this list).
  • Reduce the cluster resources needed to run the CSI Sidecars.
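
The shared informer gain comes from wiring every in-process controller to a single client-go informer factory, so each resource type is listed and watched once instead of once per sidecar process. A minimal Go sketch; the wiring and names here are illustrative assumptions, not the monorepo's actual code:

package main

import (
    "context"
    "fmt"
    "time"

    "k8s.io/client-go/informers"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
)

func main() {
    // In-cluster config; a kubeconfig-based config would work as well.
    cfg, err := rest.InClusterConfig()
    if err != nil {
        panic(err)
    }
    client := kubernetes.NewForConfigOrDie(cfg)

    // One factory shared by every in-process controller: a single list/watch
    // per resource type instead of one per sidecar process.
    factory := informers.NewSharedInformerFactory(client, 10*time.Minute)
    pvInformer := factory.Core().V1().PersistentVolumes().Informer()
    vaInformer := factory.Storage().V1().VolumeAttachments().Informer()
    _, _ = pvInformer, vaInformer // each controller adds its own event handlers

    ctx := context.Background()
    factory.Start(ctx.Done())
    fmt.Println("caches synced:", factory.WaitForCacheSync(ctx.Done()))
}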

Motivation

Increased maintenance tasks on components maintained by the SIG Storage community

The SIG Storage community maintains many storage related projects, each in its own git repo, including:

  • CSI Drivers - SMB CSI Driver, NFS CSI Driver, Hostpath CSI Driver, iSCSI CSI Driver, NVMf CSI Driver
  • CSI Sidecars
    • Typically deployed with the controller component of the CSI Driver: external-attacher, external-provisioner, external-resizer, external-snapshotter, external-health-monitor (alpha), livenessprobe
    • Typically deployed with the node component of the CSI Driver: node-driver-registrar, livenessprobe
  • Controllers
    • snapshot-controller, volume-data-source-validator (beta)
  • Webhooks
    • csi-snapshot-validation-webhook
  • CSI libraries and utilities
    • csi-lib-utils, csi-release-tools, csi-test, lib-volume-populator (beta)
  • Host binaries
    • CSI Proxy

As part of the maintenance work on these components the SIG Storage community:
  1. Bumps the Go runtime, which usually fixes vulnerabilities; the application binary is then rebuilt and a new image is released. This is done in csi-release-tools and propagated to the other repos (example). The effort is part of point #3 below.

  2. Updates the dependencies to their latest versions, which usually include fixes for vulnerabilities. The SIG Storage community reviewers/approvers look at every PR generated by a bot and LGTM/approve it. Because we have different repos the human effort is multiplied, e.g. reviews = # dependencies * # CSI Sidecar PRs (example).

  3. Propagates changes in CSI related dependencies across all the CSI Sidecars and CSI Drivers that need them. csi-release-tools has common build utilities used across all the repos; whenever there's a change in this component it needs to be propagated across all the repos (example). Because we have different repos the human effort is multiplied, e.g. work = (# updates in csi-release-tools + # new changes in csi-lib-utils) * # CSI Sidecars.

To keep dependencies up to date the SIG Storage community uses Dependabot (https://github.com/dependabot), a bot that automatically creates a PR whenever a dependency publishes a new release. As a side effect, after enabling the bot the number of PRs increased. Also note that because each component is in its own repo, a bump in a shared dependency is multiplied across all the CSI Sidecars that use it.

Stats for dependency/vulnerability update PRs reviewed & merged across the CSI Sidecars as of Aug 11th, 2023:

| CSI Sidecar | Dependabot dependency updates | csi-release-tools propagation | csi-lib-utils |
| --- | --- | --- | --- |
| external-attacher | 14 (unreleased), 12 (release 4.3.0), 8 (release 4.2.0) | 2 (unreleased), ~71 (lifetime) | ~15 (lifetime) |
| external-provisioner | 36 (unreleased), 30 (release 3.5.0), 11 (release 3.4.0) | 2 (unreleased), ~75 (lifetime) | ~19 (lifetime) |
| external-resizer | 5 (release 1.8.0), 5 (release 1.7.0) | 2 (unreleased), ~62 (lifetime) | ~10 (lifetime) |
| external-snapshotter | 14 (unreleased) | ~90 (lifetime) | ~19 (lifetime) |
| node-driver-registrar | 13 (unreleased), 8 (release 2.8.0), 2 (release 2.7.0), 3 (release 2.6.0) | ~70 (lifetime) | ~7 (lifetime) |
| livenessprobe | 9 (unreleased) | ~41 (lifetime) | ~9 (lifetime) |

Table: PRs to CSI Sidecars related to vulnerability fixes and library propagation

CSI Sidecars releases

The CSI Drivers/CSI Sidecars have an indirect dependency on the k8s version. This can happen because of:

  • A new CSI feature that touches both CSI Sidecars and k8s components - for example, the ReadWriteOncePod feature needed changes in k8s components (kube-apiserver, kube-scheduler, kubelet) and in the CSI Sidecars

Because of this indirect dependency the SIG Storage community creates a minor release of each CSI Sidecar for every k8s minor release. We use csi-hostpath (a CSI Driver used for testing purposes) to test the compatibility of the new releases with the latest k8s version.

We follow the instructions in SIDECAR_RELEASE_PROCESS.md in every CSI Sidecar repo to create a minor release.

Maintenance tasks by CSI Driver authors and cluster administrators

Kubernetes and CSI are constantly evolving (see the section above on how the CSI Sidecars evolve) and so are CSI Drivers: CSI Driver authors must keep their drivers up to date with the new features in k8s and CSI. A CSI Driver implementing most of the CSI features includes the following components:

(Figure: CSI Driver basic structure)

Keeping up with vulnerability fixes

In addition to keeping up with the latest k8s and CSI features, a cluster administrator might also need to manage other aspects of the integration, like security. The CSI Sidecars have multiple dependencies which might be susceptible to vulnerabilities. When such a vulnerability is fixed in a new release of a dependency, the fix must be propagated all the way to the CSI Sidecar repository.

The above might be enough for the latest release; however, the vulnerability might also affect older releases of the CSI Sidecars, so the fix needs to be applied to older CSI Sidecar releases too.

(Figure: sidecar version bumps)

The above increases the work not only for the SIG Storage community, which has to cherry-pick the fix, but also for cluster administrators, who have to update existing CSI Driver integrations in previous k8s releases by bumping the CSI Sidecars.

To avoid this propagation issue, cluster administrators have the following options:

  • Use the same version of CSI Sidecars in previous k8s integrations

(Figure: sidecar version strategies of GKE)

Resource utilization by the CSI Sidecar components

In some CSI Driver control plane deployment setups each sidecar is configured with a minimum memory request. Some examples of resource allocations in OSS CSI Driver deployments:

  • Memory request
    • EBS CSI Driver
      • In a CP node, sets a 40Mi memory request for each CSI Sidecar (5 sidecars), a total of 200Mi per node.
      • In a worker node, sets a 40Mi memory request for each CSI Sidecar (2 sidecars), a total of 80Mi per node.
    • Azuredisk
      • In a CP node, sets a 20Mi memory request for each CSI Sidecar (5 sidecars), a total of 100Mi per node.
      • In a worker node, sets a 20Mi memory request for each CSI Sidecar (2 sidecars), a total of 40Mi per node.
    • AlibabaCloud Disk
      • In a CP node, sets a 16Mi memory request for each CSI Sidecar (4 sidecars on average), a total of 64Mi per node.
      • In a worker node, sets a 16Mi memory request for each CSI Sidecar (1 sidecar), a total of 16Mi per node.

These per-sidecar memory requests are additional overhead: 5x in the control plane nodes and 2x in the worker nodes.

Goals

Non-Goals

  • The sidecars do not include sig-storage-lib-external-provisioner.
    • Because it doesn't depend on release-tools or csi-lib-utils.
  • release-tools and csi-lib-utils are not included in the monorepo.
    • We can start with the sidecars only and no utility libraries; after we see that it works in CI we can consider moving the utilities to the monorepo. We will open another KEP if we need to move them.

Proposal

Overview

The proposal consists of creating a monorepo that produces a single artifact with the common sidecars combined in one binary:

  • Combine the source code of all common CSI sidecars (external-attacher, external-provisioner, external-resizer, external-snapshotter, livenessprobe, node-driver-registrar), controllers (snapshot-controller, volume-health-monitor controller) and webhooks (csi-snapshot-validation-webhook) in a single repository. A total of 7 repositories covering 6 sidecars, 2 controllers and 1 webhook.
  • Include the source code of the helper utilities (csi-release-tools, csi-lib-utils) in the same repository; sidecars/apps use the local modules through go workspaces. A total of 1 release helper and 1 go module.
  • Create a new cmd/ entrypoint that enables sidecars selectively, similar to kube-controller-manager and its --controllers flag (a minimal sketch follows this list).
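
A minimal Go sketch of such an entrypoint, with hypothetical stub run functions standing in for the real sidecar packages (illustrative only, not the monorepo's actual code):

package main

import (
    "flag"
    "log"
    "strings"
    "sync"
)

// runners maps a controller name to a stub run function; in the monorepo each
// entry would call into the corresponding sidecar's package.
var runners = map[string]func(){
    "attacher":    func() { log.Println("attacher running") },
    "provisioner": func() { log.Println("provisioner running") },
    "resizer":     func() { log.Println("resizer running") },
    "snapshotter": func() { log.Println("snapshotter running") },
}

func main() {
    controllers := flag.String("controllers", "", "comma-separated list of sidecars to enable")
    flag.Parse()
    if *controllers == "" {
        log.Fatal("--controllers is required")
    }

    // Start only the selected sidecars, each in its own goroutine.
    var wg sync.WaitGroup
    for _, name := range strings.Split(*controllers, ",") {
        run, ok := runners[strings.TrimSpace(name)]
        if !ok {
            log.Fatalf("unknown controller %q", name)
        }
        wg.Add(1)
        go func() { defer wg.Done(); run() }()
    }
    wg.Wait()
}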

(Figure: CSI AIO structure)

CSI Driver authors would include a single sidecar container in their deployments (in both the control plane and the node pools). While the artifact version is the same, the command/arguments will differ.

(Figure: desired AIO component structure)

The CSI Driver deployment manifest would look like this in the control plane:

kind: Deployment
apiVersion: apps/v1
metadata:
  name: csi-driver-deployment
spec:
  replicas: 1
  template:
    spec:
      containers:
        - name: csi-driver
          args:
            - "--v=5"
            - "--endpoint=unix:/csi/csi.sock"
        - name: csi-sidecars
          command:
            - csi-sidecars
            - "--csi-address=unix:/csi/csi.sock"
            # similar style as kube-controller-manager
            - "--controllers=attacher,provisioner,resizer,snapshotter"
            - "--feature-gates=Topology=true"
            # leader election flags for all the components as one
            - "--leader-election"
            - "--leader-election-namespace=kube-system"
            # global timeouts
            - "--timeout=30s"
            # per controller specific flags are prefixed with the component name
            - "--attacher-timeout=30s"
            - "--attacher-worker-thread=100"
            - "--provisioner-timeout=30s"
          volumeMounts:
            - mountPath: /csi
              name: socket-dir

The CSI Driver DaemonSet manifest would look like this on the worker nodes:

kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: csi-driver-deployment
spec:
  template:
    spec:
      containers:
        - name: csi-driver
          args:
            - "--v=5"
            - "--endpoint=unix:/csi/csi.sock"
        - name: csi-sidecars
          command:
            - csi-sidecars
            - "--csi-address=unix:/csi/csi.sock"
            # similar style as kube-controller-manager
            - "--controllers=node-driver-registrar"
            - "--kubelet-registration-path=/var/lib/kubelet/plugins/<csi-driver>/csi.sock"
          volumeMounts:
            - name: registration-dir
              mountPath: /registration
            - name: plugin-dir
              mountPath: /csi
      volumes:
        - name: registration-dir
          hostPath:
            path: /var/lib/kubelet/plugins_registry/
            type: Directory
        - name: plugin-dir
          hostPath:
            path: /var/lib/kubelet/plugins/<csi-driver>/
            type: DirectoryOrCreate

Quantifiable characteristics of the current state and of the proposed state

| Characteristic / State | Current state of CSI Sidecars (let #csi-sidecars = 6) | CSI Sidecars in a single component |
| --- | --- | --- |
| Human effort of propagating csi-release-tools | #csi-release-tools changes * #csi-sidecars | 0 (csi-release-tools is part of the repo) |
| Human effort of propagating csi-lib-utils | #csi-lib-utils changes * #csi-sidecars | 0 (csi-lib-utils is part of the repo) |
| go mod dependency bumps | (#dependency changes * #csi-sidecars) * #CSI releases supported (unknown) | #dependency changes * #releases supported (follows the k8s release cadence) |
| Go runtime updates | #csi-release-tools changes related to go runtime updates * #csi-sidecars | #go runtime updates |
| Number of CSI releases per k8s minor release | #csi-sidecars | 1 |

Additional properties of a single CSI Sidecar component without a quantifiable benefit:

| Dimension | Pros | Cons |
| --- | --- | --- |
| Releases | Easier releases. Better definition of which sidecar releases are supported for CVE fixes, i.e. if our support model is similar to k8s (last 3 releases) then the same applies to the CSI Sidecar releases. Release notes in csi-release-tools become part of the release if csi-release-tools is part of the repo; currently, commits in csi-release-tools with release notes get lost because the git subtree command replays commits but loses the PR release note | No longer able to do single releases per component. More frequent major version bumps: currently we increase the major version of a sidecar when we remove a command line parameter or require new RBAC rules, and we ended up with provisioner v5, attacher v4, and snapshotter v8; with a common repo we would end up with 5+4+8=v17 in the worst case |
| Testability | Easier testing. Features that span multiple components, e.g. the RWOP feature, can be tested as a whole. @pohly | |
| Performance & Reliability | Can use a shared informer, decreasing the load on the API server. @msau42 | A container getting OOMKilled kills the entire CSI machinery, not just a single component (in HA, another replica would take over in a few seconds) |
| Simplicity | Consolidation of common parameters like leader election and structured logging. Combination of metrics/health ports. @msau42 Enables using additional sidecars that aren't used today because of the additional build pipelines needed to support another component | Logs would be interleaved, making it harder to trace what happened for a request. CSI utility libraries are used not only by the CSI Sidecars but by other projects (mitigation: make an external repo automatically synchronized from the internal csi-release-tools, a similar analogy to k/k/staging/lib -> k/lib) |
| Integration with CSI Drivers | Less config in the controller/node yaml manifests. Less confusion for CSI Driver authors about which CSI Sidecar versions to use. @msau42 | Complex configuration for the single CSI Sidecar component. Difficulty expressing per-sidecar configuration, e.g. kube-api-qps, kube-api-burst (mitigation: a global flag with a per-sidecar override, e.g. kube-api-qps -> attacher-kube-api-qps) |

    User Stories (Optional)

    Notes/Constraints/Caveats (Optional)

    Design Details

    Glossary

    • Individual repository - An existing repository in the kubernetes-csi/ org in Github e.g. the external-attacher repository.
    • Individual component - An existing component of the CSI Sidecars, built from an individual repository.
    • AIO monorepo or monorepo - The monolithic repository where most of the code of the CSI Sidecars will be migrated.
    • Monorepo component - The source code of an individual repository that is currently being migrated or already migrated to the monorepo.

    AIO Monorepo

    Release Management

    We considered switching from semantic versioning to k8s versioning; there are some pros and cons.

    Pros:

    • We don't need to reinvent the wheel for our dev process; we follow the same docs as k8s (https://kubernetes.io/releases/release/), which have been tried and tested over many releases.
    • Cluster administrators would know which version to use to match their CSI Driver deployment, e.g. for a k8s 1.27 cluster they'd use the 1.27 release of the CSI Sidecars.

    Cons:

    • Breaking changes might happen in a minor release; cluster administrators MUST read the sidecar release notes for breaking changes before doing a big upgrade.
    • The version skew scenario becomes confusing for the cluster administrator, e.g. they deploy the CSI Sidecars at v1.x, the cluster is upgraded to v1.{x+3} (CP upgraded first, NP later), and nodepools would run the CSI Sidecars at v1.{x+3} with the kubelet at v1.x.
    • Patch versions wouldn't map cleanly either, e.g. k/k at 1.27.5 vs CSI at 1.27.0 (a different mapping would still be needed).

    After investigation we found that there isn't a clear advantage to switching to k8s versioning, so we chose to keep semantic versioning in the monorepo.

    RBAC policy

    We designed the AIO repo's RBAC policy to mirror that of the individual repos, where each controller maintains its own policy. Driver maintainers should apply the proper RBAC rules when enabling specific controllers in the AIO component; more discussion here.

    We plan to combine the informer caches of the different controllers in the future.
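
    For example, a deployment that enables only the attacher binds only the attacher's rules. The snippet below is abridged from the external-attacher's upstream rbac.yaml and is only illustrative; the authoritative rules are the ones shipped by each enabled controller:

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: external-attacher-runner
rules:
  # The attacher watches VolumeAttachment objects and patches their status.
  - apiGroups: ["storage.k8s.io"]
    resources: ["volumeattachments"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["volumeattachments/status"]
    verbs: ["patch"]
  # It also reads PersistentVolumes and CSINodes to resolve attachments.
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["csinodes"]
    verbs: ["get", "list", "watch"]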

    Command Line

    The command line flags are divided into two types: generic flags whose configuration is common to all controllers and is configured only once, and per-controller flags whose configuration differs per controller. Each per-controller flag gets a new unique name, prefixed with the controller name, as in the snippet below and the Go sketch that follows it.

            - name: csi-sidecars
              command:
                - csi-sidecars
                - "--csi-address=unix:/csi/csi.sock"
                # similar style as kube-controller-manager
                - "--controllers=attacher,provisioner,resizer,snapshotter"
                - "--feature-gates=Topology=true"
                # leader election flags for all the components as one
                - "--leader-election"
                - "--leader-election-namespace=kube-system"
                # global timeouts
                - "--timeout=30s"
                # per controller specific flags are prefixed with the component name
                - "--attacher-timeout=30s"
                - "--attacher-worker-thread=100"
                - "--provisioner-timeout=30s"

    example PR: kubernetes-csi/external-attacher#620
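
    A minimal Go sketch of the override semantics, assuming a per-controller flag such as --attacher-timeout falls back to the global --timeout when not explicitly set (hypothetical helper code, not the monorepo's actual implementation):

package main

import (
    "flag"
    "fmt"
    "time"
)

func main() {
    fs := flag.NewFlagSet("csi-sidecars", flag.ExitOnError)
    global := fs.Duration("timeout", 15*time.Second, "global timeout for all controllers")
    attacher := fs.Duration("attacher-timeout", 0, "timeout for the attacher; overrides --timeout")

    fs.Parse([]string{"--timeout=30s", "--attacher-timeout=60s"})

    // Record which flags were explicitly set so an unset per-controller flag
    // falls back to the global value.
    set := map[string]bool{}
    fs.Visit(func(f *flag.Flag) { set[f.Name] = true })

    effective := *global
    if set["attacher-timeout"] {
        effective = *attacher
    }
    fmt.Println("attacher timeout:", effective) // prints "attacher timeout: 1m0s"
}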

    Monorepo component

    poc version: https://github.com/mauriciopoppe/csi-sidecars

    monorepo attacher: https://github.com/mauriciopoppe/csi-sidecars/tree/main/pkg/attacher

    Development workflow

    (Figure: development workflow overview)

    After we see the monorepo component running fine in integration/e2e tests in k8s, we need to perform a hard cut so that new development goes into the monorepo component only.

    AIO MonoRepo state definition

    • Design: current state of the AIO MonoRepo
    • Alpha: all six sidecar repos have been integrated into the monorepo; all the e2e tests pass
    • Beta (production-verified): the six sidecars work through the CSI hostpath driver; three cloud vendors use it in their production environments
    • GA (released): officially released; ready to accept PRs from SIG Storage developers
    • Standalone: no longer needs to sync code from the individual repos; the AIO MonoRepo becomes the source of truth

    Individual repository state definition

    • Released: current state of the individual repos
    • FeatureFreeze:
      • New feature PRs are not allowed to be filed against the master branch or release-X branches (controlled by the individual repo maintainers, who categorize each PR and reject it if it's a feature)
      • SIG Storage developers file feature PRs against the AIO MonoRepo
      • Exception: serious bugfix or CVE fix PRs (only from individual repo maintainers), which can be merged in master and backported to the other release-X branches
    • Deprecated:
      • The repository is no longer maintained
      • Eventually the image for the individual repo goes away (although this isn't possible unless we migrate ALL the sidecars)
      • (future) Archive the repo, but not at the same time as deprecation; archiving is a terminal state, so we can't undo it

    (Figure: state change)

    Migration Process

    (Figure: migration process)

    Risks And Mitigations

    • Breaking changes in one component force the single release to be a breaking change

    • A vulnerability that affects one component affects all the other components

    See details in: https://docs.google.com/document/d/1SD4YRas_qXMP363L4j3WBTV_F9anq-5FM5gdGmJq7h0/edit?usp=sharing

    • A panic in one component restarts the entire sidecar

    For each sidecar, define where in the stack a panic should be caught to possibly restart the controller in-process (a sketch follows).

    List of fixed issues related to panics:

    • kubernetes-csi/external-provisioner#839
    • kubernetes-csi/external-provisioner#582
    • kubernetes-csi/external-attacher#502

    A panic caused by OOM doesn't fall into this category (there is perhaps no good way to reduce its blast radius).
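
    A minimal Go sketch of per-controller panic isolation, assuming each sidecar exposes a run function that can be supervised in-process (illustrative only, not the monorepo's actual code):

package main

import (
    "log"
    "time"
)

// runWithRecovery restarts a controller after a panic instead of letting the
// panic crash the whole csi-sidecars process.
func runWithRecovery(name string, run func()) {
    for {
        func() {
            defer func() {
                if r := recover(); r != nil {
                    log.Printf("controller %s panicked: %v; restarting", name, r)
                }
            }()
            run()
        }()
        time.Sleep(time.Second) // naive backoff before restarting
    }
}

func main() {
    go runWithRecovery("attacher", func() { panic("example panic") })
    time.Sleep(3 * time.Second)
}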

    • Keeping the monorepo and the existing sidecar repos up to date after the migration for X releases

    Milestones

    Milestone (completed):

    Develop a minimal proof of concept

    POC: https://github.com/mauriciopoppe/csi-sidecars-aio-poc

    Milestone-setup-a-repository-inside-kubernetes-csi

    Design phase

    Milestone-Build-the-project-using-a-modified-copy-of-release-tools

    Design phase

    Milestone-set-up-new-test-infra-jobs-to-test-the-project-through-the-hostpath-CSI-Driver

    Design phase

    Milestone-mirroring-of-nested-directories-to-repos-in-kubernetes-csi

    Design phase

    Milestone-definition-of-the-development-workflow

    Design phase

    Milestone-migration-of-CSI-Drivers-to-the-new-model

    Design phase

    Milestone-all-six-sidecar-repo-had-been-integrated-into-monorepo

    Alpha phase

    Milestone-be-ready-to-accept-PR-from-community

    Beta phase

    Milestone-six-sidecars-working-through-CSI-hostpath

    Beta phase

    Milestone-three-cloud-vendors-start-using-the-monorepo-component

    GA phase

    Milestone-all-individual-repo-has-been-into-deprecated-state

    Standalone phase

    Milestone-merge-sidecar-informer-caches

    Standalone phase

    Test Plan

    Prerequisite testing updates
    Unit tests
    Integration tests
    e2e tests

    Graduation Criteria

    Upgrade / Downgrade Strategy

    Version Skew Strategy

    Production Readiness Review Questionnaire

    Feature Enablement and Rollback

    How can this feature be enabled / disabled in a live cluster?

    It's not gated by a feature gate; it can be enabled by deploying the new version of the CSI Driver and disabled by deleting the new version and redeploying the old version.

    Does enabling the feature change any default behavior?

    This won't make any changes to the default behavior of Kubernetes.

    Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

    It's not a feature but an architectural change, so users can deploy the old version of the CSI Driver to disable it.

    What happens if we reenable the feature if it was previously rolled back?

    Nothing happens; it will behave as usual.

    Are there any tests for feature enablement/disablement?

    Yes. We will add unit tests with and without the feature gate enabled.

    Rollout, Upgrade and Rollback Planning

    How can a rollout or rollback fail? Can it impact already running workloads?
    What specific metrics should inform a rollback?
    Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
    Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

    Monitoring Requirements

    How can an operator determine if the feature is in use by workloads?
    How can someone using this feature know that it is working for their instance?
    • Events
      • Event Reason:
    • API .status
      • Condition name:
      • Other field:
    • Other (treat as last resort)
      • Details:
    What are the reasonable SLOs (Service Level Objectives) for the enhancement?
    What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
    • Metrics
      • Metric name: plugin_execution_duration_seconds{plugin="VolumeBinding",extension_point="Score"}
      • [Optional] Aggregation method:
      • Components exposing the metric:
    • Other (treat as last resort)
      • Details:
    Are there any missing metrics that would be useful to have to improve observability of this feature?

    Nothing in particular.

    Dependencies

    Does this feature depend on any specific services running in the cluster?

    No.

    Scalability

    Will enabling / using this feature result in any new API calls?
    Will enabling / using this feature result in introducing new API types?
    Will enabling / using this feature result in any new calls to the cloud provider?
    Will enabling / using this feature result in increasing size or count of the existing API objects?
    Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
    Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
    Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?

    Troubleshooting

    How does this feature react if the API server and/or etcd is unavailable?
    What are other known failure modes?
    What steps should be taken if SLOs are not being met to determine the problem?

    Implementation History

    Drawbacks

    Alternatives

    Infrastructure Needed (Optional)