docs/reference/specifications/providers.md
+68 -42 (68 additions & 42 deletions)
@@ -64,18 +64,21 @@ stateDiagram-v2
64
64
NOT_READY --> ERROR: initialize
65
65
READY --> ERROR: disconnected, disconnected period == 0
66
66
READY --> STALE: disconnected, disconnect period < retry grace period
67
+
READY --> NOT_READY: shutdown
67
68
STALE --> ERROR: disconnect period >= retry grace period
69
+
STALE --> NOT_READY: shutdown
68
70
ERROR --> READY: reconnected
69
-
ERROR --> [*]: shutdown
71
+
ERROR --> NOT_READY: shutdown
72
+
ERROR --> [*]: Error code == PROVIDER_FATAL
70
73
71
-
note right of STALE
74
+
note left of STALE
72
75
stream disconnected, attempting to reconnect,
73
76
resolve from cache*
74
77
resolve from flag set rules**
75
78
STALE emitted
76
79
end note
77
80
78
-
note right of READY
81
+
note left of READY
79
82
stream connected,
80
83
evaluation cache active*,
81
84
flag set rules stored**,
@@ -84,7 +87,7 @@ stateDiagram-v2
84
87
CHANGE emitted with stream messages
85
88
end note
86
89
87
-
note right of ERROR
90
+
note left of ERROR
88
91
stream disconnected, attempting to reconnect,
89
92
evaluation cache purged*,
90
93
ERROR emitted
@@ -101,25 +104,47 @@ stateDiagram-v2
101
104
102
105
### Stream Reconnection
103
106
104
-
When either stream (sync or event) disconnects, whether due to the associated deadline being exceeded, network error or any other cause, the provider attempts to re-establish the stream immediately, and then retries with an exponential back-off.
105
-
We always rely on the [integrated functionality of GRPC for reconnection](https://github.com/grpc/grpc/blob/master/doc/connection-backoff.md) and utilize [Wait-for-Ready](https://grpc.io/docs/guides/wait-for-ready/) to re-establish the stream.
106
-
We are configuring the underlying reconnection mechanism whenever we can, based on our configuration. (not all GRPC implementations support this)
107
+
When either stream (sync or event) disconnects, whether due to the associated deadline being exceeded, network error or any other cause, the provider attempts to re-establish the stream immediately.
108
+
Both the RPC and sync streams attempt to reconnect indefinitely unless the stream response indicates a [fatal status code](#fatal-status-codes).
109
+
This is distinct from the [gRPC retry policy](#grpc-retry-policy), which automatically retries *all RPCs* (streams or otherwise) a limited number of times to make the provider resilient to transient errors.
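The reconnection behavior described above can be sketched as a loop that re-opens the stream until it ends with a fatal status code. This is a minimal illustration, not any provider's actual implementation: `open_stream` is a hypothetical callable that blocks while the stream is alive and returns the name of the gRPC status code it terminated with.

```python
from typing import Callable, Iterable, Tuple


def run_stream_until_fatal(
    open_stream: Callable[[], str],
    fatal_codes: Iterable[str],
) -> Tuple[str, int]:
    """Re-open the stream after every disconnect until a fatal code is seen.

    `open_stream` is a hypothetical stand-in for the provider's stream call.
    Returns the fatal status code and the number of connection attempts made.
    If no fatal code is ever returned, this loops forever by design.
    """
    fatal = frozenset(fatal_codes)
    attempts = 0
    while True:
        attempts += 1
        status = open_stream()  # blocks until the stream ends
        if status in fatal:
            # Non-transient error: stop reconnecting.
            return status, attempts
        # Any other status: reconnect immediately.
```

The per-RPC backoff is left to the gRPC retry policy, so the loop itself does not sleep between attempts.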
107
110
108
-
| language/property | min connect timeout | max backoff | initial backoff | jitter | multiplier |
flagd leverages gRPC's built-in retry mechanism for all RPCs.
114
+
In short, the retry policy retries any RPC that returns an `UNAVAILABLE` or `UNKNOWN` status code up to 3 times, with backoffs of 1s, 2s, and 4s respectively.
115
+
No other status codes are retried.
116
+
The flagd gRPC retry policy is specified below:
118
117
119
-
When disconnected, if the time since disconnection is less than `retryGracePeriod`, the provider emits `STALE` when it disconnects.
120
-
While the provider is in state `STALE` the provider resolves values from its cache or stored flag set rules, depending on its resolver mode.
121
-
When the time since the last disconnect first exceeds `retryGracePeriod`, the provider emits `ERROR`.
122
-
The provider attempts to reconnect indefinitely, with a maximum interval of `retryBackoffMaxMs`.
118
+
```json
{
  "methodConfig": [
    {
      "name": [
        {
          "service": "flagd.sync.v1.FlagSyncService"
        },
        {
          "service": "flagd.evaluation.v1.Service"
        }
      ],
      "retryPolicy": {
        "MaxAttempts": 3,
        "InitialBackoff": "1s",
        "MaxBackoff": $FLAGD_RETRY_BACKOFF_MAX_MS, // from provider options
        "BackoffMultiplier": 2.0,
        "RetryableStatusCodes": [
          "UNAVAILABLE",
          "UNKNOWN"
        ]
      }
    }
  ]
}
```
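As a rough illustration of how a provider might fill in the `MaxBackoff` placeholder, the sketch below builds the service config from the `retryBackoffMaxMs` option and serializes it to JSON. The function name is hypothetical, and the lowerCamelCase field names follow the gRPC service config documentation rather than the casing shown above; whether a given gRPC implementation accepts either spelling is an assumption to verify.

```python
import json


def build_retry_service_config(retry_backoff_max_ms: int) -> str:
    """Return a gRPC service config JSON string with the flagd retry policy.

    Hypothetical helper: converts the millisecond option into the duration
    string format ("<seconds>.<millis>s") used by gRPC service configs.
    """
    seconds, millis = divmod(retry_backoff_max_ms, 1000)
    config = {
        "methodConfig": [
            {
                "name": [
                    {"service": "flagd.sync.v1.FlagSyncService"},
                    {"service": "flagd.evaluation.v1.Service"},
                ],
                "retryPolicy": {
                    "maxAttempts": 3,
                    "initialBackoff": "1s",
                    "maxBackoff": f"{seconds}.{millis:03d}s",
                    "backoffMultiplier": 2.0,
                    "retryableStatusCodes": ["UNAVAILABLE", "UNKNOWN"],
                },
            }
        ]
    }
    return json.dumps(config)
```

In gRPC Python, for example, such a string would typically be passed as the `grpc.service_config` channel option, with `grpc.enable_retries` set to `1`.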
142
+
143
+
## Fatal Status Codes
144
+
145
+
Providers accept an option that defines fatal gRPC status codes. When one of these codes is received on the RPC or sync stream, the provider transitions to the `PROVIDER_FATAL` state.
146
+
This configuration is useful when these codes tell a client that its configuration is invalid and must be changed (i.e., the error is non-transient).
147
+
Examples include status codes such as `UNAUTHENTICATED` or `PERMISSION_DENIED`.
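A provider could validate this option before use. The sketch below normalizes a user-supplied list of code names and rejects anything that is not one of the standard gRPC status codes; the function names are hypothetical and the validation step itself is an assumption, not something the specification requires.

```python
from typing import FrozenSet, Iterable

# The standard gRPC status code names.
GRPC_STATUS_CODES = frozenset({
    "OK", "CANCELLED", "UNKNOWN", "INVALID_ARGUMENT", "DEADLINE_EXCEEDED",
    "NOT_FOUND", "ALREADY_EXISTS", "PERMISSION_DENIED", "RESOURCE_EXHAUSTED",
    "FAILED_PRECONDITION", "ABORTED", "OUT_OF_RANGE", "UNIMPLEMENTED",
    "INTERNAL", "UNAVAILABLE", "DATA_LOSS", "UNAUTHENTICATED",
})


def parse_fatal_status_codes(raw_codes: Iterable[str]) -> FrozenSet[str]:
    """Normalize and validate a list of fatal status code names (hypothetical helper)."""
    parsed = set()
    for raw in raw_codes:
        name = raw.strip().upper()
        if name not in GRPC_STATUS_CODES:
            raise ValueError(f"unknown gRPC status code: {raw!r}")
        parsed.add(name)
    return frozenset(parsed)


def is_fatal(status_code: str, fatal_codes: FrozenSet[str]) -> bool:
    """Return True when a stream's terminal status should stop reconnection."""
    return status_code in fatal_codes
```

With `["UNAUTHENTICATED", "PERMISSION_DENIED"]` configured, an `UNAVAILABLE` status would still trigger a reconnect, while `UNAUTHENTICATED` would not.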
123
148
124
149
## RPC Resolver
125
150
@@ -262,28 +287,29 @@ precedence.
262
287
263
288
Below are the supported configuration parameters (note that not all apply to both resolver modes):
264
289
265
-
| Option name | Environment variable name | Explanation | Type & Values | Default | Compatible resolver |
| deadlineMs | FLAGD_DEADLINE_MS | deadline for unary calls, and timeout for initialization | int | 500 | rpc & in-process & file |
275
-
| streamDeadlineMs | FLAGD_STREAM_DEADLINE_MS | deadline for streaming calls, useful as an application-layer keepalive | int | 600000 | rpc & in-process |
276
-
| retryBackoffMs | FLAGD_RETRY_BACKOFF_MS | initial backoff for stream retry | int | 1000 | rpc & in-process |
277
-
| retryBackoffMaxMs | FLAGD_RETRY_BACKOFF_MAX_MS | maximum backoff for stream retry | int | 120000 | rpc & in-process |
278
-
| retryGracePeriod | FLAGD_RETRY_GRACE_PERIOD | period in seconds before provider moves from STALE to ERROR state | int | 5 | rpc & in-process & file |
| deadlineMs | FLAGD_DEADLINE_MS | deadline for unary calls, and timeout for initialization | int | 500 | rpc & in-process & file |
300
+
| streamDeadlineMs | FLAGD_STREAM_DEADLINE_MS | deadline for streaming calls, useful as an application-layer keepalive | int | 600000 | rpc & in-process |
301
+
| retryBackoffMs | FLAGD_RETRY_BACKOFF_MS | initial backoff for stream retry | int | 1000 | rpc & in-process |
302
+
| retryBackoffMaxMs | FLAGD_RETRY_BACKOFF_MAX_MS | maximum backoff for stream retry | int | 120000 | rpc & in-process |
303
+
| retryGracePeriod | FLAGD_RETRY_GRACE_PERIOD | period in seconds before provider moves from STALE to ERROR state | int | 5 | rpc & in-process & file |
| offlinePollIntervalMs | FLAGD_OFFLINE_POLL_MS | poll interval for reading offlineFlagSourcePath | int | 5000 | file |
311
+
| contextEnricher | - | sync-metadata to evaluation context mapping function | function | identity function | in-process |
312
+
| fatalStatusCodes | - | list of gRPC status codes that cause streams to stop reconnecting and put the provider in a `PROVIDER_FATAL` state | array | [] | rpc & in-process |
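To illustrate how the integer options in the table map to environment variables, here is a minimal sketch that reads them with the documented defaults. The `FlagdOptions` class and `options_from_env` function are hypothetical names; only the environment variable names and default values are taken from the table, and options without an environment variable (`contextEnricher`, `fatalStatusCodes`) are omitted.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class FlagdOptions:
    """Subset of provider options, with the defaults from the table above."""
    deadline_ms: int = 500
    stream_deadline_ms: int = 600000
    retry_backoff_ms: int = 1000
    retry_backoff_max_ms: int = 120000
    retry_grace_period: int = 5  # seconds
    offline_poll_interval_ms: int = 5000


# Field name -> environment variable name, as listed in the table.
_ENV_NAMES = {
    "deadline_ms": "FLAGD_DEADLINE_MS",
    "stream_deadline_ms": "FLAGD_STREAM_DEADLINE_MS",
    "retry_backoff_ms": "FLAGD_RETRY_BACKOFF_MS",
    "retry_backoff_max_ms": "FLAGD_RETRY_BACKOFF_MAX_MS",
    "retry_grace_period": "FLAGD_RETRY_GRACE_PERIOD",
    "offline_poll_interval_ms": "FLAGD_OFFLINE_POLL_MS",
}


def options_from_env(env: Optional[Mapping[str, str]] = None) -> FlagdOptions:
    """Build options from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    overrides = {
        field: int(env[name]) for field, name in _ENV_NAMES.items() if name in env
    }
    return FlagdOptions(**overrides)
```

Passing an explicit mapping instead of relying on `os.environ` keeps the function easy to test.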