Commit daf8275
Add exponential backoff on retry for failed queries and metadata fetches (#53)
* Add exponential backoff on retry for failed queries and metadata fetches
The reader previously used constant delays when retrying after failures,
which could overwhelm a struggling cluster. Now both the stream batch
reader (CDC log queries) and the generation fetcher (metadata reads) use
exponential backoff with jitter on consecutive failures, resetting on
success.
- Add MaxPostFailedQueryDelay config (default 30s) as the backoff cap
- Add PostGenerationFetchDelay / MaxPostGenerationFetchDelay config (15s / 5min)
- streamBatchReader doubles PostFailedQueryDelay on each consecutive failure
- generationFetcher backs off from PostGenerationFetchDelay up to max
- Add backoffDelay and addJitter helpers with overflow protection
- Add cdcIterator interface and queryRangeFunc hook for testability
- Add clusterSizeFunc and configurable fetch periods to generationFetcher
Closes #50
* Add tests for backoff on retry: table deleted, query errors, retry limits
Tests cover:
- Exponential backoff on consecutive query failures (row reads)
- Backoff reset after successful query
- Table missing (ErrNotFound, "unconfigured table", "no such table")
gives up after retry limit
- Generic errors retry indefinitely with backoff (no false retry limit)
- Table missing retry counter resets on success
- Generation fetcher backoff on consecutive metadata fetch failures
- Generation fetcher backoff reset on success
- Generation fetcher backoff when getClusterSize fails
- Overflow protection in backoffDelay
- Jitter distribution in addJitter
1 parent 29ca672 commit daf8275
6 files changed
Lines changed: 709 additions & 29 deletions