RFC: Add a Retry Policy to allow the client to dynamically retry requests #1874

@tooboredtocode

Add a Retry Policy to allow the client to dynamically retry requests

Currently, the client does not retry failed requests. However, even when it strictly follows ratelimits, it will occasionally receive 429 responses.
To handle this, the client has to retry some failed requests. That alone would be simple to implement, but since other errors can occasionally occur in some environments, a dynamic policy that users can easily modify is the better choice.

Implementation

One way to achieve this would be to add the following traits, along with a default implementation:

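use std::time::Duration;

/// Creates one executor per request collection (a request and its retries).
pub trait RetryPolicy<Ex: RetryPolicyExecutor> {
    fn handle_request(&self) -> Ex;
}

/// Decides whether a failed attempt is retried and how long to wait first.
pub trait RetryPolicyExecutor {
    /// Returns `Some(delay)` to retry after `delay`, or `None` to give up.
    fn should_retry(&self, data: FailedRequestData) -> Option<Duration>;
}

And a corresponding default implementation: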
#[derive(Debug, Clone, Copy)]
pub struct DefaultRetryPolicy {
    max_tries: u16,
    timeout: Duration
}

impl DefaultRetryPolicy {
    pub fn new(max_tries: u16, timeout: Duration) -> Self {
        Self {
            max_tries,
            timeout
        }
    }
}

impl Default for DefaultRetryPolicy {
    fn default() -> Self {
        Self {
            max_tries: 5,
            timeout: Duration::from_secs(5)
        }
    }
}

impl RetryPolicy<DefaultRetryPolicy> for DefaultRetryPolicy {
    fn handle_request(&self) -> DefaultRetryPolicy {
        *self
    }
}

impl RetryPolicyExecutor for DefaultRetryPolicy {
    fn should_retry(&self, data: FailedRequestData) -> Option<Duration> {
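        // Give up once the attempt budget is spent.
        if self.max_tries <= data.current_try {
            return None;
        }

        match data.error {
            // Ratelimited: retry immediately.
            RequestError::Non2xxResponse {
                status: StatusCode(429),
                ..
            } => Some(Duration::from_millis(0)),
            // Server error: wait for the configured timeout first.
            RequestError::Non2xxResponse { status, .. } if status.is_server_error() => {
                Some(self.timeout)
            }
            // Everything else is not retried.
            _ => None,
        }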
    }
}
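
And a potential request data implementation:
use std::collections::HashMap;

// A struct field cannot use the anonymous lifetime `'_`, so the struct
// carries a named lifetime for the borrowed route.
pub struct FailedRequestData<'a> {
    pub route: Route<'a>,
    pub error: RequestError,
    pub current_try: u16,
}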

#[non_exhaustive]
pub enum RequestError {
    ConnectionFailed,
    ResponseTimedOut,
    Non2xxResponse {
        status: StatusCode,
        headers: HashMap<String, String>
    },
}

The design splits the code into two parts:

  • the RetryPolicy
  • and its Executor

The RetryPolicy generates an Executor for each request collection (a request and its potential retries).
The Executor then decides whether to retry and how long to wait until the next attempt, based on the request data.
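
As a rough illustration of how the client could drive these two parts, here is a minimal, synchronous sketch. It is not the proposed client code: `send_with_retries` and `RetryAll` are hypothetical names, the types are simplified stand-ins (no `Route`, the status is a bare `u16`), and it assumes `current_try` is 1-based and counts the attempt that just failed.

```rust
use std::{thread, time::Duration};

// Simplified stand-ins for the types proposed above (assumptions for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub enum RequestError {
    ConnectionFailed,
    Non2xxResponse { status: u16 },
}

pub struct FailedRequestData {
    pub error: RequestError,
    pub current_try: u16,
}

pub trait RetryPolicyExecutor {
    fn should_retry(&self, data: FailedRequestData) -> Option<Duration>;
}

pub trait RetryPolicy<Ex: RetryPolicyExecutor> {
    fn handle_request(&self) -> Ex;
}

/// Hypothetical policy used only for this sketch: retries any error
/// immediately until `max_tries` attempts have been made.
#[derive(Clone, Copy)]
pub struct RetryAll {
    pub max_tries: u16,
}

impl RetryPolicy<RetryAll> for RetryAll {
    fn handle_request(&self) -> RetryAll {
        *self
    }
}

impl RetryPolicyExecutor for RetryAll {
    fn should_retry(&self, data: FailedRequestData) -> Option<Duration> {
        if self.max_tries <= data.current_try {
            None
        } else {
            Some(Duration::ZERO)
        }
    }
}

/// Hypothetical driver: calls `attempt` until it succeeds or the executor gives up.
pub fn send_with_retries<T, Ex, P, F>(policy: &P, mut attempt: F) -> Result<T, RequestError>
where
    Ex: RetryPolicyExecutor,
    P: RetryPolicy<Ex>,
    F: FnMut() -> Result<T, RequestError>,
{
    // One executor per request collection, shared by all of its attempts.
    let executor = policy.handle_request();
    let mut current_try: u16 = 1;

    loop {
        let error = match attempt() {
            Ok(value) => return Ok(value),
            Err(error) => error,
        };

        let data = FailedRequestData {
            error: error.clone(),
            current_try,
        };
        match executor.should_retry(data) {
            // Wait for the requested delay, then try again.
            Some(delay) => thread::sleep(delay),
            // The executor declined: surface the last error.
            None => return Err(error),
        }
        current_try = current_try.saturating_add(1);
    }
}
```

A real client would presumably be asynchronous and sleep with its runtime's timer instead of `thread::sleep`; the control flow would stay the same.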

Separating the two lets a custom implementation keep state per request collection and so tell apart errors that happen in different requests, for example to count the different errors seen while retrying a single request.
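
To make the per-request state concrete, here is a sketch of a custom policy whose executor keeps separate retry budgets for ratelimits and server errors. Everything named here (`PerErrorBudget`, `PerErrorBudgetExecutor`, the one-second delay) is hypothetical, and the types are again simplified stand-ins. Because `should_retry` takes `&self`, the counters use `Cell` for interior mutability.

```rust
use std::{cell::Cell, time::Duration};

// Simplified stand-ins for the types proposed above (assumptions for this sketch).
#[derive(Debug)]
pub enum RequestError {
    ConnectionFailed,
    Non2xxResponse { status: u16 },
}

pub struct FailedRequestData {
    pub error: RequestError,
    pub current_try: u16,
}

pub trait RetryPolicyExecutor {
    fn should_retry(&self, data: FailedRequestData) -> Option<Duration>;
}

pub trait RetryPolicy<Ex: RetryPolicyExecutor> {
    fn handle_request(&self) -> Ex;
}

/// Hypothetical policy: how many retries each error kind may use per request.
pub struct PerErrorBudget {
    pub max_rate_limited: u16,
    pub max_server_errors: u16,
}

/// Hypothetical executor: holds the counters for one request collection.
pub struct PerErrorBudgetExecutor {
    max_rate_limited: u16,
    max_server_errors: u16,
    rate_limited: Cell<u16>,
    server_errors: Cell<u16>,
}

impl RetryPolicy<PerErrorBudgetExecutor> for PerErrorBudget {
    fn handle_request(&self) -> PerErrorBudgetExecutor {
        // Every request collection starts with fresh counters.
        PerErrorBudgetExecutor {
            max_rate_limited: self.max_rate_limited,
            max_server_errors: self.max_server_errors,
            rate_limited: Cell::new(0),
            server_errors: Cell::new(0),
        }
    }
}

impl RetryPolicyExecutor for PerErrorBudgetExecutor {
    fn should_retry(&self, data: FailedRequestData) -> Option<Duration> {
        // Pick the counter, limit, and delay that belong to this error kind.
        let (counter, limit, delay) = match data.error {
            RequestError::Non2xxResponse { status: 429 } => {
                (&self.rate_limited, self.max_rate_limited, Duration::ZERO)
            }
            RequestError::Non2xxResponse { status: 500..=599 } => (
                &self.server_errors,
                self.max_server_errors,
                Duration::from_secs(1),
            ),
            // Any other error is not retried.
            _ => return None,
        };

        if counter.get() >= limit {
            return None;
        }
        counter.set(counter.get() + 1);
        Some(delay)
    }
}
```

With a single combined type, as in `DefaultRetryPolicy`, this state would be shared by every request that uses the policy; handing out a new executor per request keeps the counts independent.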

Alternative

Alternatively, the client could retry only the requests that hit a ratelimit.
However, doing so would forgo much of the customisation this design offers to users.
