@kevinkim-ogp
Contributor

  • Add additional logging for worker status
  • Add retries for Postman email and SMS

pregnantboy and others added 3 commits May 29, 2025 15:10
## Changes

Add logging for Redis connection errors and for the worker `closed` and
`ready` status events.

This is to help debug the `failed to refresh slots cache` error.
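
As a rough illustration only, status logging of this kind could look like the sketch below. `StatusSource`, `LogFn`, and `attachStatusLogging` are hypothetical names that do not come from this PR, and the `error` event name is an assumption; only the `ready` and `closed` statuses are taken from the description above.

```typescript
// Minimal shape of an event source. It stands in for the real worker or
// Redis connection, whose actual types are not shown in this PR.
interface StatusSource {
  on(event: string, listener: (...args: unknown[]) => void): unknown
}

// Hypothetical logger signature: a message plus optional structured metadata.
type LogFn = (message: string, meta?: Record<string, unknown>) => void

// Attach listeners that log the `ready` and `closed` status events,
// plus any error emitted by the source (assumed to use an `error` event).
function attachStatusLogging(
  name: string,
  source: StatusSource,
  log: LogFn,
): void {
  source.on('ready', () => log(`${name} is ready`))
  source.on('closed', () => log(`${name} is closed`))
  source.on('error', (err) => {
    const message = err instanceof Error ? err.message : String(err)
    log(`${name} error`, { message })
  })
}
```

Passing the logger in as a parameter keeps the sketch independent of any particular logging library.
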
## TL;DR
Add retry handling for the following transient errors:
* Postman Email: 500, 524, socket hang up
* Postman SMS: 500, 502, 503, read TIMEOUT
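
The list above can be read as a small classification table. The sketch below is one possible way to express it, not the code in this PR: `Service`, `PostmanFailure`, `RETRIABLE`, and `isRetriable` are hypothetical names, and matching on a substring of the error message is an assumption.

```typescript
type Service = 'email' | 'sms'

// Hypothetical shape of a failed Postman call: an HTTP status, an error
// message, or both.
interface PostmanFailure {
  status?: number
  message?: string
}

// Status codes and message fragments copied from the TL;DR list.
const RETRIABLE: Record<Service, { statuses: number[]; messages: string[] }> = {
  email: { statuses: [500, 524], messages: ['socket hang up'] },
  sms: { statuses: [500, 502, 503], messages: ['read TIMEOUT'] },
}

// Returns true when the failure matches one of the transient errors listed
// for the given service. Message matching is case-insensitive (an assumption).
function isRetriable(service: Service, failure: PostmanFailure): boolean {
  const rule = RETRIABLE[service]
  if (failure.status !== undefined && rule.statuses.indexOf(failure.status) !== -1) {
    return true
  }
  const message = (failure.message ?? '').toLowerCase()
  return rule.messages.some((m) => message.indexOf(m.toLowerCase()) !== -1)
}
```

A caller would throw `RetriableError` when `isRetriable` returns true and fail the execution otherwise. Keeping the rules per service means a code such as 502 is retried for SMS but not for Email, matching the list.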

## Why make this change?
Postman Email and SMS services occasionally return transient errors due
to server instability or network issues. These errors are often resolved
on subsequent attempts. By retrying affected executions before marking
them as failed, we improve overall system resilience and reduce false
failure rates in Plumber.

## How to test?
- Change the local Postman URL to call https://mock.codes
- Use a URL param to select the HTTP status code to be returned
- Verify that `RetriableError` is thrown for each of the retriable status codes
@kevinkim-ogp kevinkim-ogp requested a review from a team as a code owner May 30, 2025 08:39
@datadog-opengovsg

Datadog Report

Branch report: develop-v2
Commit report: 38dfab8
Test service: plumber

✅ 0 Failed, 779 Passed, 0 Skipped, 2m 22.51s Total Time
➡️ Test Sessions change in coverage: 1 no change

Contributor

@pregnantboy pregnantboy left a comment

lgtm: postman-sms-qps lowered to 25

@kevinkim-ogp kevinkim-ogp merged commit 5686ead into production May 30, 2025
9 checks passed

3 participants