update troubleshooting guide for sidecar bucket access check feature #1153
Sneha-at merged 5 commits into GoogleCloudPlatform:main
docs/troubleshooting.md
### Solution
Ensure bucket access is verified before the pod attempts to mount the volume. This can be achieved as follows:
1. Set `skipCSIBucketAccessCheck:false` through [volume attribute class](https://docs.cloud.google.com/kubernetes-engine/docs/reference/cloud-storage-fuse-csi-driver/volume-attr). This performs a bucket access check before attempting to mount the volume. However, high-scale workloads might experience STS quota exhaustion because of the number of access verification calls. The method below is recommended for high-scale workloads (it offers a 50% reduction in STS quota consumption for the GCS Fuse CSI driver).
It still isn't clear why a customer would see error #1 or #2 in the first place. I think something is wrong with their setup if they get this error, so the solution should be to fix the access issues. I'm not sure `skipCSIBucketAccessCheck:false` should be a solution for these errors, as it just makes the mount fail in a different place. We are also trying to deprecate `skipCSIBucketAccessCheck`, so I don't think it makes sense to have solutions telling customers to use it. Please add those solutions for these problems.
I think you would also need another section with an error showing STS quota exhaustion. The solution for that problem would be either `skipCSIBucketAccessCheck: true` or an update to 1.34.1-gke.3899001+. Explain the hostNetwork limitation: updating to 1.34.1-gke.3899001 isn't a feasible solution for hostNetwork pods, so they still need to set `skipCSIBucketAccessCheck: true` to resolve the quota exhaustion (but warn that they may see the mount errors you discussed above in "Mounting issues due to bucket access verification failure" if the access isn't properly set up).
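For readers following the thread, the decision described in this comment could be sketched roughly as below. This is only an illustration under stated assumptions: the helper names (`parse_gke_version`, `quota_mitigation`) and the action labels are hypothetical, and the sketch assumes that GKE version strings of the form `1.34.1-gke.3899001` can be compared component by component. Only the version threshold and the `skipCSIBucketAccessCheck` attribute come from the discussion itself.

```python
import re

# Minimum GKE version named in the review comment above.
MIN_SIDECAR_CHECK_VERSION = (1, 34, 1, 3899001)


def parse_gke_version(version: str) -> tuple:
    """Parse a string such as '1.34.1-gke.3899001' into a comparable tuple.

    Assumption: versions order correctly when compared field by field.
    """
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)-gke\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized GKE version: {version!r}")
    return tuple(int(part) for part in match.groups())


def quota_mitigation(gke_version: str, host_network: bool) -> tuple:
    """Suggest a mitigation for STS quota exhaustion (hypothetical helper).

    Returns (action, volume_attributes). Volume attribute values are
    strings because Kubernetes volumeAttributes is a string-to-string map.
    """
    skip_check = {"skipCSIBucketAccessCheck": "true"}
    if host_network:
        # Per the comment: upgrading is not a feasible fix for hostNetwork
        # pods, so the CSI-side check has to be skipped. Mount errors may
        # then surface if bucket access is not set up properly.
        return ("skip-csi-check", skip_check)
    if parse_gke_version(gke_version) >= MIN_SIDECAR_CHECK_VERSION:
        # Already on a version that the comment names as a fix.
        return ("no-change", {})
    # Older cluster: either upgrade, or skip the check as a stopgap.
    return ("upgrade-or-skip-csi-check", skip_check)


if __name__ == "__main__":
    print(quota_mitigation("1.33.5-gke.1000000", host_network=False))
    print(quota_mitigation("1.34.1-gke.3899001", host_network=True))
```

The returned attributes would go under the volume's `volumeAttributes` in the Pod spec. Since the comment also mentions plans to deprecate `skipCSIBucketAccessCheck`, the skip path is best read as a stopgap.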
Added a separate section here
docs/troubleshooting.md
### Limitation
We have noticed a gap in the implementation for the sidecar bucket access check feature specified above due to which the GCS Fuse sidecar container fails to retry if metadata server is not yet up. This means the solution will resolve issues (2) and (3) but not (1).

This gap is being fixed in [PR](https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/pull/1261) and will soon be released. Meanwhile, please follow the mitigation and deploy the sidecar as a private sidecar container image. Please not the feature will still be enabled if the GKE public image from gcr.io/gke-release/gcs-fuse-csi-driver-sidecar-mounter is used.
please rephrase "Please not the feature will still be enabled"
Rephrased this, please check
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: saikat-royc, Sneha-at. The full list of commands accepted by this bot can be found here.
What type of PR is this?
/kind documentation
What this PR does / why we need it:
This adds information about the sidecar bucket access check feature to the troubleshooting guide. This feature is only available in managed sidecar images.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?: