You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: specs/023-oauth-state-persistence/spec.md
+13-2Lines changed: 13 additions & 2 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -5,6 +5,14 @@
5
5
**Status**: Draft
6
6
**Input**: Fix token refresh so OAuth servers survive restarts and proactively refresh before expiration
7
7
8
+
## Clarifications
9
+
10
+
### Session 2026-01-12
11
+
12
+
- Q: What happens when a proactive refresh attempt fails before token expiration? → A: Immediate retry with exponential backoff (10s, 20s, 40s...) up to expiration time, surfacing failures as health status for user visibility across CLI, menubar, and web UI.
13
+
- Q: Should refresh operations emit metrics for observability? → A: Full Prometheus metrics only (`mcpproxy_oauth_refresh_total`, `mcpproxy_oauth_refresh_duration_seconds`), leveraging existing MetricsManager infrastructure.
14
+
- Q: When should refresh schedules be created/destroyed? → A: Schedule exists iff tokens exist (created post-auth or when loaded at startup, destroyed when tokens removed). Implementation details deferred to planning.
15
+
8
16
## Problem Statement
9
17
10
18
OAuth-enabled MCP servers require manual re-authentication after any mcpproxy downtime (restart, laptop sleep, weekend). Despite having refresh tokens stored in the database, the automatic token refresh is not working reliably.
@@ -80,7 +88,7 @@ As a developer, when automatic token refresh fails, I want to understand why it
80
88
- System should detect this on the subsequent connection attempt and report the specific error rather than entering a retry loop.
81
89
82
90
- What happens during a network partition?
83
-
- System should use exponential backoff for refresh retries, with clear status indicating retry attempts.
91
+
- System should use exponential backoff for refresh retries (10s → 20s → 40s → 80s → 5min cap), with health status showing "Refresh retry pending" and next attempt time.
84
92
85
93
## Requirements *(mandatory)*
86
94
@@ -94,11 +102,14 @@ As a developer, when automatic token refresh fails, I want to understand why it
94
102
-**FR-006**: System MUST display distinct health status messages for different refresh failure types (expired refresh token, network error, provider error).
95
103
-**FR-007**: System MUST remove or correct the current misleading "OAuth token refresh successful" logging that reports success when `Start()` returns nil.
96
104
-**FR-008**: System MUST rate-limit refresh attempts to no more than one per 10 seconds per server.
105
+
-**FR-009**: System MUST implement exponential backoff retry (10s, 20s, 40s, 80s, capped at 5 minutes) when proactive refresh fails, continuing attempts until token expiration.
106
+
-**FR-010**: System MUST surface ongoing refresh failures as degraded health status on the upstream server, visible in CLI (`upstream list`), menubar, and web control panel.
107
+
-**FR-011**: System MUST emit Prometheus metrics for OAuth refresh operations: `mcpproxy_oauth_refresh_total` (counter with labels: server, result) and `mcpproxy_oauth_refresh_duration_seconds` (histogram with labels: server, result).
97
108
98
109
### Key Entities
99
110
100
111
-**OAuth Token**: Access token, refresh token, expiration timestamp, token type, scope. Stored in database with server identifier.
101
-
-**Refresh Schedule**: Server name, scheduled refresh time, retry count, last error. Managed by RefreshManager.
112
+
-**Refresh Schedule**: Server name, scheduled refresh time, retry count, last error. Managed by RefreshManager. Lifecycle invariant: exists iff tokens exist for server.
0 commit comments