-
Notifications
You must be signed in to change notification settings - Fork 144
CBG-5357: removal of fast fail retry by default #8308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -121,6 +121,7 @@ type BucketSpec struct { | |
| ViewQueryTimeoutSecs *uint32 // the view query timeout in seconds (default: 75 seconds) | ||
| MaxConcurrentQueryOps *int // maximum number of concurrent query operations (default: DefaultMaxConcurrentQueryOps) | ||
| BucketOpTimeout *time.Duration // How long bucket ops should block returning "operation timed out". If nil, uses GoCB default. GoCB buckets only. | ||
| UseGOCBFastFailRetry bool // When true, gocb readiness checks and index lookups fail fast instead of using the best-effort retry strategy | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What do you think about doing something like: The other idea if this is confusing is to create an enum for types of strategies:
|
||
| } | ||
|
|
||
| const defaultNumRetries = 10 | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -36,15 +36,17 @@ type ClusterOnlyN1QLStore struct { | |
| collectionName string // Used to build keyspace for query when not otherwise set | ||
| supportsCollections bool | ||
| supportsIfNotExistsInDDL bool // 7.1.0+ MB-38737 | ||
| useGOCBFastFailRetry bool // When true, index lookups fail fast instead of using the best-effort retry strategy | ||
| } | ||
|
|
||
| func NewClusterOnlyN1QLStore(cluster *gocb.Cluster, bucketName, scopeName, collectionName string) (*ClusterOnlyN1QLStore, error) { | ||
| func NewClusterOnlyN1QLStore(cluster *gocb.Cluster, bucketName, scopeName, collectionName string, useGOCBFastFailRetry bool) (*ClusterOnlyN1QLStore, error) { | ||
|
|
||
| clusterOnlyn1qlStore := &ClusterOnlyN1QLStore{ | ||
| cluster: cluster, | ||
| bucketName: bucketName, | ||
| scopeName: scopeName, | ||
| collectionName: collectionName, | ||
| cluster: cluster, | ||
| bucketName: bucketName, | ||
| scopeName: scopeName, | ||
| collectionName: collectionName, | ||
| useGOCBFastFailRetry: useGOCBFastFailRetry, | ||
| } | ||
|
|
||
| major, minor, err := GetClusterVersion(cluster) | ||
|
|
@@ -209,10 +211,11 @@ func (cl *ClusterOnlyN1QLStore) runQuery(statement string, n1qlOptions *gocb.Que | |
|
|
||
| func (cl *ClusterOnlyN1QLStore) indexManager(scopeName, collectionName string) *indexManager { | ||
| return &indexManager{ | ||
| cluster: cl.cluster.QueryIndexes(), | ||
| bucketName: cl.bucketName, | ||
| scopeName: scopeName, | ||
| collectionName: collectionName, | ||
| cluster: cl.cluster.QueryIndexes(), | ||
| bucketName: cl.bucketName, | ||
| scopeName: scopeName, | ||
| collectionName: collectionName, | ||
| useGOCBFastFailRetry: cl.useGOCBFastFailRetry, | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We actually don't need this, because the RetryStrategy is inherited from the cluster and/or Bucket, so we can remove the complexity of this code and only set this for the Cluster/Bucket which makes this simpler. At this point of making index calls, we have already made a bucket connection and we don't care about surfacing an authenitcation error. |
||
| } | ||
| } | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -64,9 +64,10 @@ func (c *Collection) BucketName() string { | |
|
|
||
| func (c *Collection) indexManager() *indexManager { | ||
| m := &indexManager{ | ||
| bucketName: c.BucketName(), | ||
| collectionName: c.CollectionName(), | ||
| scopeName: c.ScopeName(), | ||
| bucketName: c.BucketName(), | ||
| collectionName: c.CollectionName(), | ||
| scopeName: c.ScopeName(), | ||
| useGOCBFastFailRetry: c.Bucket.Spec.UseGOCBFastFailRetry, | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See above, we should rely on RetryStrategy from the cluster/bucket. |
||
| } | ||
| if !c.IsSupported(sgbucket.BucketStoreFeatureCollections) { | ||
| m.cluster = c.Bucket.cluster.QueryIndexes() | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -101,16 +101,17 @@ func ExplainQuery(ctx context.Context, store N1QLStore, statement string, params | |
| } | ||
|
|
||
| type indexManager struct { | ||
| cluster *gocb.QueryIndexManager | ||
| collection *gocb.CollectionQueryIndexManager | ||
| bucketName string | ||
| scopeName string | ||
| collectionName string | ||
| cluster *gocb.QueryIndexManager | ||
| collection *gocb.CollectionQueryIndexManager | ||
| bucketName string | ||
| scopeName string | ||
| collectionName string | ||
| useGOCBFastFailRetry bool | ||
| } | ||
|
|
||
| func (im *indexManager) GetAllIndexes() ([]gocb.QueryIndex, error) { | ||
| opts := &gocb.GetAllQueryIndexesOptions{ | ||
| RetryStrategy: &goCBv2FailFastRetryStrategy{}, | ||
| RetryStrategy: goCBRetryStrategy(im.useGOCBFastFailRetry), | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I probably wrote this code, but we should rely on the cluster/bucket parameters and we can drop this line entirely, regardless of the state of this PR. |
||
| } | ||
|
|
||
| if im.collection != nil { | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2436,6 +2436,10 @@ Startup-config: | |
| description: Store database configurations in system xattrs | ||
| type: boolean | ||
| default: false | ||
| use_gocb_fast_fail_retry: | ||
| description: When true, gocb cluster/bucket readiness checks and index lookups (on both the bootstrap and per-database connections) fail on the first error instead of retrying. When false, they use the best-effort retry strategy, retrying until their timeout when Couchbase Server is unavailable or failing over. | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The only reason you want to enable this is if you expect to have authentication errors. I think it's worthwhile to call this out. Given this text, it seems low value to even expose this option and I can't imagine people really wanting to enable it. If we do have a flag, I'd be inclined to make this unsupported so we aren't committed to having this option for all eternity. The reasons that this can be useful:
It is possible that the RBAC user would be able to make a cluster connection but then not have bucket permissions, so it is possible to fail under each bucket. |
||
| type: boolean | ||
| default: false | ||
| readOnly: true | ||
| couchbase_keepalive_interval: | ||
| description: TCP keep-alive interval between SG and Couchbase server. This is unused. | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nit but I find the capitalization on this to be somewhat sanity inducing:
I'd prefer:
gocbRetryStrategytogoCBRetryStrategy.I think for this function, I would do something like:
func (cc *CouchbaseCluster) getRetryStrategy()to make this code easier to read.