Skip to content

Latest commit

 

History

History
107 lines (80 loc) · 4.48 KB

File metadata and controls

107 lines (80 loc) · 4.48 KB

Usage Refresh Policy Context

Purpose

This context explains how codex-lb derives an account's usage and status, and how to diagnose disagreements between codex-lb and Codex Desktop or the Codex CLI quota pill.

codex-lb treats /wham/usage as the source of truth for account usage. Other OpenAI account surfaces can display reset state earlier than /wham/usage, especially during team reset windows, so the dashboard can temporarily show an account as rate_limited even when Codex Desktop says the quota has reset.

Upstream Usage Source

codex-lb refreshes account usage by calling:

GET https://chatgpt.com/backend-api/wham/usage

The call is made per account on the configured refresh tick, which defaults to 60 seconds. The client lives in app/core/clients/usage.py, and the scheduler lives in app/core/usage/refresh_scheduler.py.

Status Derivation

The fetched usage is fed through apply_usage_quota, which derives account status from primary_window.used_percent:

  • secondary_used >= 100, regardless of primary_used: QUOTA_EXCEEDED
  • used_percent >= 100 on the primary rate-limit window: RATE_LIMITED
  • used_percent < 100: ACTIVE

There is no manual reset step inside codex-lb. Recovery is driven by the next refresh tick that observes a sub-100 value from /wham/usage.

Why Codex Settings Can Disagree

Codex Desktop's Settings -> Account view and /wham/usage are fed by different OpenAI-side data sources:

  • /wham/usage exposes the rate limiter's internal counter. It updates lazily, typically on the next chargeable request through that account, or when its internal window crosses reset_at.
  • Settings -> Account is fed by a separate account/quota view that often picks up team-side reset events earlier.

During a reset window it is normal for Settings -> Account to show the reset state while /wham/usage still returns used_percent: 100 for a short period afterwards. codex-lb mirrors /wham/usage during that window, so the account stays RATE_LIMITED or QUOTA_EXCEEDED until upstream catches up.

Limit Warm-Up Exhaustion Threshold

Reset-confirmed limit warm-up compares the usage sample from before a refresh with the sample written after that refresh. The pre-refresh sample must be at or above the configured exhausted threshold, the post-refresh sample must be below 100, and reset_at must move forward.

The exhausted threshold defaults to 99.0 because some upstream usage payloads plateau at 99 percent for windows that are practically exhausted. This avoids missing reset-confirmed warm-ups for those accounts while keeping the reset confirmation requirement intact. Operators who want the historical strict behavior can set the threshold to 100.0.

Operational Notes

  • Wait first. The next request through that account usually wakes the upstream rate limiter; codex-lb auto-recovers on the next refresh tick after the upstream payload changes.
  • A force-probe action is planned in #677. The dashboard should expose a per-account button that fires one minimal responses.create against the affected account to nudge the upstream limiter to re-evaluate the window.
  • Do not manually flip the codex-lb account state to ACTIVE while /wham/usage still reports the account as fully used. That only masks the upstream state and can route traffic back to an account that the upstream limiter will reject.

Verification Example

To confirm that the disagreement is upstream rather than codex-lb's mirror, call /wham/usage directly with the same account token codex-lb is using:

ACCESS_TOKEN=...
ACCOUNT_ID=...   # chatgpt-account-id UUID, not codex-lb's id

curl -s https://chatgpt.com/backend-api/wham/usage \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  -H "chatgpt-account-id: ${ACCOUNT_ID}" \
  -H "Accept: application/json" | jq '.rate_limit'

If primary_window.used_percent is still 100 here while Settings -> Account shows the account as reset, codex-lb has nothing fresher to mirror. The account is inside the upstream propagation window, and the practical fix is to wait or, once #677 lands, use the Probe action.

Related Work