fix(paddle): retry rate-limit and 5xx responses with backoff by Gilbert09 · Pull Request #65041 · PostHog/posthog

Gilbert09 · 2026-06-20T21:43:53Z

Problem

Error tracking surfaced an HTTPError from the Paddle data-warehouse import source:

429 Client Error: Too Many Requests for url: https://api.paddle.com/transactions?...

The Paddle source builds its HTTP session with Retry(total=0), so a single 429 (Paddle rate-limiting a paginated transactions fetch) propagated straight out of response.raise_for_status() in get_rows and failed the sync. A 429 means "slow down and retry", not "give up" — every sibling source (Brex, Stripe, Asana, Adroll, …) already backs off and retries on 429/5xx. Paddle was the outlier.

Changes

_get_paddle_session() now uses the framework's DEFAULT_RETRY instead of opting out with Retry(total=0):

Backs off and retries on 429/5xx, honoring the Retry-After header (urllib3's respect_retry_after_header defaults on).
Leaves auth/4xx (400/401/403) failing fast — those aren't in the retry status list — so they still reach get_non_retryable_errors and surface a clear, non-retried message to the user.
raise_on_status=False means a persistent 429 still surfaces via raise_for_status() (retryable at the Temporal layer), rather than being silently swallowed.

This is a robustness fix, not a NonRetryableErrors change: a rate limit is transient and upstream-recoverable, so the right behavior is to retry it, not to stop retrying.
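As a rough illustration of the policy described above, the sketch below builds a urllib3 `Retry` with the same three properties: retry on 429/5xx, honor `Retry-After`, and do not raise on exhausted status retries. The name `EXAMPLE_RETRY` and the values (`total=5`, `backoff_factor=1`, the exact status list) are assumptions for illustration. The contents of the framework's `DEFAULT_RETRY` are not shown in this PR.

```python
from urllib3.util.retry import Retry

# Hypothetical stand-in for the framework's DEFAULT_RETRY; all values are assumed.
EXAMPLE_RETRY = Retry(
    total=5,                                      # assumed retry budget
    backoff_factor=1,                             # assumed exponential backoff base
    status_forcelist=[429, 500, 502, 503, 504],   # rate limits and server errors only
    respect_retry_after_header=True,              # urllib3's default, stated explicitly
    raise_on_status=False,                        # exhausted retries return the last response
)


def should_retry(status_code: int, method: str = "GET") -> bool:
    """Ask the policy whether a response with this status would be retried."""
    return EXAMPLE_RETRY.is_retry(method, status_code)
```

With a policy like this, `should_retry(429)` and `should_retry(503)` are true while `should_retry(401)` is false, so auth failures still reach the caller on the first response.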

How did you test this code?

I'm an agent. I added a regression test (tests/test_paddle.py) asserting the session's retry policy retries 429 (and honors Retry-After) while leaving 400/401/403 un-retried. Before the fix the session used Retry(total=0), so the rate-limit assertion would have failed. Ran the new tests locally (2 passed) and ran ruff check/ruff format on the source.

Automatic notifications

Publish to changelog?
Alert Sales and Marketing teams?

🤖 Agent context

Autonomy: Fully autonomous

Triaged from an error-tracking webhook for the Paddle source. Confirmed the stack originates in get_rows at paddle/paddle.py (the raise_for_status() call), then traced the session config to the deliberate Retry(total=0). Decision: a 429 is a transient rate limit, so it belongs in retry/backoff handling, not NonRetryableErrors — matching the established 429 or >= 500 pattern across the other warehouse sources and the framework's own DEFAULT_RETRY. Kept the change minimal: swap the retry policy and update the now-stale fail-fast comments; no behavior change for auth errors.
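The "429 or >= 500" rule mentioned above can be written as a small predicate. The sketch below pairs it with a backoff delay that prefers a numeric `Retry-After` value. Both function names and the `base`/`cap` defaults are hypothetical and are not taken from any PostHog source. The sketch handles only the delta-seconds form of `Retry-After`, not the HTTP-date form.

```python
from typing import Optional


def is_retryable_status(status_code: int) -> bool:
    """Transient failures: rate limits (429) and server errors (5xx)."""
    return status_code == 429 or status_code >= 500


def backoff_seconds(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Delay before retry number `attempt` (0-based), capped at `cap` seconds."""
    # Prefer the server's hint when it is a plain number of seconds.
    if retry_after is not None and retry_after.strip().isdigit():
        return min(float(retry_after.strip()), cap)
    # Otherwise fall back to exponential backoff: base, 2*base, 4*base, ...
    return min(base * (2 ** attempt), cap)
```

Under this rule 400, 401 and 403 are not retryable, which matches the description's point that auth errors should keep failing fast.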

The Paddle source built its HTTP session with `Retry(total=0)`, so a single 429 Too Many Requests surfaced straight out of `response.raise_for_status()` and failed the whole sync. Rate limits are transient and should be backed off and retried, not treated as terminal. Switch the session to the framework's `DEFAULT_RETRY`, which backs off on 429/5xx and honors `Retry-After`, while still letting auth/4xx errors fail fast so they reach `get_non_retryable_errors`.

Generated-By: PostHog Code
Task-Id: 8cbc7a7e-838f-4c6c-b5ac-9f9d60f2ae5d

github-actions · 2026-06-20T21:44:04Z

Hey @Gilbert09! 👋

It looks like your git author email on this PR isn't your @posthog.com address (owerstom@gmail.com). Since you're on the PostHog team, it's worth pointing your local git author email at your @posthog.com address. Why it matters:

Consistent work identity in git history — internal tooling that attributes commits to team members keys off your @posthog.com address.
Keeps team contributions easy to tell apart from external community ones when scanning history.

You can fix it for this repo with:

git config user.email "you@posthog.com"

Or set it globally with git config --global user.email "you@posthog.com". No need to redo this PR — just a nudge for next time. 🙂

hex-security-app · 2026-06-20T21:44:36Z

-    return make_tracked_session(retry=Retry(total=0))
+    # DEFAULT_RETRY backs off on 429/5xx (honoring Retry-After) but leaves auth/4xx
+    # failures to surface immediately, so a transient rate-limit doesn't fail the sync.
+    return make_tracked_session(retry=DEFAULT_RETRY)


Paddle API key is no longer redacted in tracked HTTP telemetry

_get_paddle_session() now opts into the shared tracked session with DEFAULT_RETRY, but it still doesn't pass redact_values=(api_key,) or set the auth header on the session itself. This source sends the bearer token on every request, and the tracked transport can capture matching requests/responses to object storage for operators. Without value-based redaction, a Paddle API key that appears outside the standard header scrub path (for example in serialized request metadata or retries/debug captures) can be written to logs/sample artifacts, exposing a live third-party credential.

Prompt To Fix With AI

Update the Paddle connector to construct its tracked session with credential redaction enabled, e.g. thread the API key into `_get_paddle_session(api_key)` and call `make_tracked_session(retry=DEFAULT_RETRY, headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}, redact_values=(api_key,))`. Then remove per-request auth headers where possible so both normal logging and sample capture consistently scrub the secret across URLs, headers, and bodies.

Severity: medium | Confidence: 79%
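To make the idea of value-based redaction concrete, here is a minimal sketch that masks known secret values wherever they appear in a captured string. It is not how `make_tracked_session` implements `redact_values`, since that code is not shown in this thread. The `redact` helper and the `[REDACTED]` mask are hypothetical.

```python
from typing import Iterable


def redact(text: str, secrets: Iterable[str], mask: str = "[REDACTED]") -> str:
    """Replace every occurrence of each secret value in `text` with `mask`."""
    for secret in secrets:
        if secret:  # skip empty strings, which would otherwise match everywhere
            text = text.replace(secret, mask)
    return text


# Hypothetical key and log line, for illustration only.
api_key = "example-key-123"
captured = f"GET /transactions Authorization: Bearer {api_key}"
safe = redact(captured, (api_key,))
```

Matching on the value rather than on a header name means the key is also masked if it shows up somewhere unexpected, such as a URL or a serialized request body.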

stamphog

The hex-security-app bot raised an unresolved, current-head medium-severity security concern: the Paddle API key (Bearer token) is not passed as redact_values to make_tracked_session(), so it will appear unredacted in any HTTP telemetry captured by the tracked transport. This pattern is established across 82 other sources in the codebase; Paddle is a clear outlier.

greptile-apps · 2026-06-20T21:46:51Z

Prompt To Fix All With AI

Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
posthog/temporal/data_imports/sources/paddle/tests/test_paddle.py:4-24
The test inspects `status_forcelist` membership directly rather than using `retry.is_retry("GET", status_code)`, which is the established pattern used in `TestConvexRetryPolicy`. Checking `status_forcelist` alone does not account for `allowed_methods` — if the retry's `allowed_methods` excluded GET, the 429 assertion would pass even though no retry would actually fire. Using `is_retry` exercises the full retry-decision logic, which is what matters for Paddle's GET calls.

```suggestion
class TestPaddleSession:
    def test_session_retries_rate_limits(self):
        session = _get_paddle_session()
        retry = session.get_adapter(PADDLE_BASE_URL).max_retries

        # A transient 429 must back off and retry rather than failing the whole sync.
        assert retry.total is not None and retry.total > 0
        assert retry.is_retry("GET", 429) is True
        assert retry.respect_retry_after_header is True
        # Persistent failures still surface via response.raise_for_status(), not MaxRetryError.
        assert retry.raise_on_status is False

    def test_auth_failures_are_not_retried(self):
        session = _get_paddle_session()
        retry = session.get_adapter(PADDLE_BASE_URL).max_retries

        # 401/403/400 are credential/config problems handled by get_non_retryable_errors;
        # retrying them would only delay surfacing the error to the user.
        assert retry.is_retry("GET", 401) is False
        assert retry.is_retry("GET", 403) is False
        assert retry.is_retry("GET", 400) is False
```
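The gap between the two kinds of assertion can be reproduced in isolation. The sketch below uses a deliberately restrictive urllib3 `Retry` that lists 429 but only allows POST to be retried. This policy is hypothetical and chosen to expose the difference. It is not the Paddle session's configuration.

```python
from urllib3.util.retry import Retry

# Hypothetical policy: 429 is in the status list, but only POST may be retried.
post_only = Retry(total=3, status_forcelist=[429], allowed_methods=["POST"])

# Membership check: passes even though a GET would never be retried.
listed = 429 in post_only.status_forcelist

# Full decision: accounts for the HTTP method as well as the status code.
get_retried = post_only.is_retry("GET", 429)
post_retried = post_only.is_retry("POST", 429)
```

Here `listed` is true while `get_retried` is false, so a test that only checks `status_forcelist` would pass against a policy that never retries Paddle's GET calls.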

Reviews (1): Last reviewed commit: "fix(paddle): retry rate-limit and 5xx re..."

greptile-apps · 2026-06-20T21:46:54Z

+class TestPaddleSession:
+    def test_session_retries_rate_limits(self):
+        session = _get_paddle_session()
+        retry = session.get_adapter(PADDLE_BASE_URL).max_retries
+
+        # A transient 429 must back off and retry rather than failing the whole sync.
+        assert retry.total is not None and retry.total > 0
+        assert 429 in retry.status_forcelist
+        assert retry.respect_retry_after_header is True
+        # Persistent failures still surface via response.raise_for_status(), not MaxRetryError.
+        assert retry.raise_on_status is False
+
+    def test_auth_failures_are_not_retried(self):
+        session = _get_paddle_session()
+        retry = session.get_adapter(PADDLE_BASE_URL).max_retries
+
+        # 401/403/400 are credential/config problems handled by get_non_retryable_errors;
+        # retrying them would only delay surfacing the error to the user.
+        assert 401 not in retry.status_forcelist
+        assert 403 not in retry.status_forcelist
+        assert 400 not in retry.status_forcelist


The test inspects status_forcelist membership directly rather than using retry.is_retry("GET", status_code), which is the established pattern used in TestConvexRetryPolicy. Checking status_forcelist alone does not account for allowed_methods — if the retry's allowed_methods excluded GET, the 429 assertion would pass even though no retry would actually fire. Using is_retry exercises the full retry-decision logic, which is what matters for Paddle's GET calls.

Suggested change

 class TestPaddleSession:
     def test_session_retries_rate_limits(self):
         session = _get_paddle_session()
         retry = session.get_adapter(PADDLE_BASE_URL).max_retries

         # A transient 429 must back off and retry rather than failing the whole sync.
         assert retry.total is not None and retry.total > 0
-        assert 429 in retry.status_forcelist
+        assert retry.is_retry("GET", 429) is True
         assert retry.respect_retry_after_header is True
         # Persistent failures still surface via response.raise_for_status(), not MaxRetryError.
         assert retry.raise_on_status is False

     def test_auth_failures_are_not_retried(self):
         session = _get_paddle_session()
         retry = session.get_adapter(PADDLE_BASE_URL).max_retries

         # 401/403/400 are credential/config problems handled by get_non_retryable_errors;
         # retrying them would only delay surfacing the error to the user.
-        assert 401 not in retry.status_forcelist
-        assert 403 not in retry.status_forcelist
-        assert 400 not in retry.status_forcelist
+        assert retry.is_retry("GET", 401) is False
+        assert retry.is_retry("GET", 403) is False
+        assert retry.is_retry("GET", 400) is False


Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Gilbert09 added the stamphog label (Request AI review from stamphog) Jun 20, 2026

assign-reviewers-posthog Bot assigned Gilbert09 Jun 20, 2026

assign-reviewers-posthog Bot requested a review from a team June 20, 2026 21:44

hex-security-app Bot reviewed Jun 20, 2026

stamphog Bot reviewed Jun 20, 2026

stamphog Bot removed the stamphog label (Request AI review from stamphog) Jun 20, 2026

greptile-apps Bot reviewed Jun 20, 2026

Gilbert09 wants to merge 1 commit into master from posthog-code/paddle-retry-rate-limits
