Skip to content

Commit f670edf

Browse files
lukekimclaudeewgeniusphillipleblanc
authored
feat(http): Add OAuth2 refresh-token auth to HTTP connector (spiceai#10348)
* feat(http): Add OAuth2 refresh-token auth to HTTP connector Adds RFC 6749 §6 refresh-token grant support to the HTTP connector. When configured, the connector exchanges the refresh token at startup, stamps `Authorization: Bearer <access_token>` on every data request, and refreshes in the background before the token expires. Rotated refresh tokens are honored across the lifetime of the process. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(http): Address review comments on OAuth2 refresh-token auth - Sanitize & cap token endpoint error body (collapse whitespace, 512 byte cap) - Reject non-Bearer `token_type` from the token endpoint - Store preformatted, sensitive `HeaderValue` in the watch channel so every data request is a cheap header clone instead of a new format!() allocation - React to shutdown immediately via `tokio::select!` on `tx.closed()` instead of waiting the full sleep interval - Add `refresh_loop_uses_rotated_refresh_token` test that drives the live background loop and asserts the rotated refresh token is used on the second exchange - Use `Parameters::user_param(...)` so configuration errors surface the prefixed user-facing keys (`http_auth_refresh_token`, etc.) instead of internal names Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(http): Further review feedback on OAuth2 refresh-token auth - Reject blank/whitespace-only http_auth_refresh_token at config time, before any network round-trip to the token endpoint - Classify RefreshTokenAuth errors: InvalidTokenUrl / InsecureTokenUrl / UnsupportedTokenType and 4xx responses from the token endpoint surface as InvalidConfiguration so users see configuration guidance instead of "Cannot connect …"; network / 5xx / parse errors stay as UnableToConnectInternal - Inline Parameters::user_param() directly into format!() calls and rename the Option<SecretString> local to client_credential so CodeQL's cleartext-logging heuristic stops flagging parameter-*name* strings as sensitive - OauthServerState in the integration test now uses tokio::sync::Mutex so the axum handlers do not block a Tokio worker thread - Mock OAuth server no longer swallows serve errors via .unwrap_or_default — .expect() surfaces failures directly Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * style(http): cargo fmt + switch test URL to https://example.invalid - Apply rustfmt to auth.rs / https.rs / tests/http/mod.rs to clear the Rust Lint CI failure (cargo fmt --check diffs) - Swap the non-sent test URL in refresh_token_auth_applies_bearer_header to https://example.invalid so CodeQL's rust/non-https-url rule stops flagging a hardcoded http:// in test code. The request is only passed through `.build()` to inspect the Authorization header, never sent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(http): Address clippy pedantic warnings in auth.rs - Wrap `OAuth2` in backticks in three doc-comments to satisfy clippy::doc_markdown - `sanitize_error_body` now takes `&str` instead of `String` (removes clippy::needless_pass_by_value); updated call sites - `MockResponse::{ok, status}` take `&serde_json::Value` (same reason); updated callers in the test module - Replace two `if … { panic!(...) }` blocks inside loop bodies with `assert!(… < deadline, …)` to satisfy clippy::panic_in_result_fn / clippy::manual_assert Lint-only changes; no behavior change. Runtime tests and data_components auth tests continue to pass (9/9 and 10/10 respectively). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(http): More Copilot-reviewer nits on OAuth2 refresh-token auth - ParameterSpec descriptions now cross-reference the correct user-facing key names: `auth_token_url` points at `http_auth_refresh_token`, and the `http_auth_client_id` / `http_auth_client_secret` descriptions mention each other explicitly - `map_auth_error` narrows `InvalidConfiguration` mapping from "any 4xx" to `matches!(status, 400 | 401 | 403)`. Transient 4xx (408 Request Timeout, 429 Too Many Requests, etc.) now route to `UnableToConnectInternal`, consistent with the doc-comment - Test doc-comments in `tests/http/mod.rs` updated to name the exact user- facing keys (`http_auth_refresh_token` + `auth_token_url`) - Clarified the whitespace-sanitize comment in `sanitize_error_body`: whitespace characters are *replaced* with a regular space (runs become runs of spaces); the comment no longer implies run-collapse Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(http): Bounded streaming read of token endpoint error body + test cleanup - Replace unbounded `response.text().await` on the error path with a new `read_bounded_error_body` that streams via `Response::chunk()` and stops after `MAX_ERROR_BODY_BYTES * 2` raw bytes. A hostile token endpoint can no longer force us to buffer a large body just so we can surface the first 512 bytes of it as a diagnostic. - Rework `refresh_loop_uses_rotated_refresh_token` to use `tokio::time:: Instant` (instead of `std::time::Instant`) for its 5s deadline poll, keeping real time so reqwest's `connect_timeout` doesn't race with paused virtual time. Added a comment explaining why `start_paused = true` is unsafe here (verified locally: 5/5 runs at 1.06s each, no flakes). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix clippy lints in https connector - Backtick OAuth2 in doc comment (clippy::doc_markdown). - Collapse redundant match-guard into pattern literals (clippy::redundant_guards). * fix(http): Address more reviewer nits on OAuth2 auth module - `current_bearer_value` is now `#[cfg(test)] pub(crate) fn` so downstream callers in production code paths cannot accidentally surface the live bearer token via this API - Rename `sanitize_error_body_collapses_whitespace_and_truncates` to `…_replaces_whitespace_and_truncates` — the implementation *replaces* whitespace characters with spaces (runs preserved as runs of spaces), which the old test name misrepresented as "collapses" Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(http): Enforce hard cap on sanitized error body size `sanitize_error_body` previously filled `out` up to `MAX_ERROR_BODY_BYTES` and then unconditionally appended `…<truncated>`, so the returned string could exceed the documented 512-byte cap by the marker's length. Introduce `TRUNCATION_MARKER` and a `CONTENT_BUDGET` constant that reserves room for the marker, so the final returned string (content + marker) is always ≤ `MAX_ERROR_BODY_BYTES`. Add a `debug_assert!` on the post-condition and tighten the unit test to check `out.len()` directly rather than just the content portion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(http): merge identical match arms in read_bounded_error_body Clippy's `match_same_arms` lint (enabled via `-Dclippy::pedantic` in CI) flagged the `Ok(None)` and `Err(_)` arms both returning `break`. Merged them into a single `Ok(None) | Err(_)` arm with a comment explaining that end-of-body and mid-stream transport errors are both terminal for a best-effort error diagnostic. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(http): Improve error message for insecure token URL and add custom headers access method * fix(https): Refactor conditional check for Authorization custom header * fix(http): Collapse identical match arms in map_auth_error clippy::match_same_arms fired after introducing the TokenEndpointStatus { status: 400 | 401 | 403, .. } arm that maps to the same InvalidConfiguration variant as the URL-validation arms. Merge the patterns into a single arm. * Update crates/runtime/src/dataconnector/https.rs Co-authored-by: Phillip LeBlanc <phillip@spice.ai> * Replace .is_err() and .is_ok() in tests with expect * fix clippy - add backticks * refactor(http): Move bounded error body helpers into resilient_http Address phillipleblanc review feedback: promote read_bounded_error_body and sanitize_error_body from auth.rs into resilient_http as public helpers so other HTTP-based data connectors can reuse them. Generalize the cap to a parameter and expose TRUNCATED_BODY_MARKER as part of the public API. * fix(http): Small-cap sanitize_error_body + drop case-sensitive one_of - `sanitize_error_body` previously returned an empty string when `max_bytes < TRUNCATED_BODY_MARKER.len()` because the content budget underflowed to 0. It now uses the full `max_bytes` as content budget without a marker in that case, matching the documented upper bound. New `sanitize_error_body_small_cap_fills_content_without_marker` test. - Removed `.one_of(&["basic", "body"])` from the `auth_client_auth` ParameterSpec: `Parameters::try_new` enforces `one_of` with exact string matching, which rejected `BASIC` / `BODY` before `ClientAuthMethod::parse` (case-insensitive) could accept them. Parser is the single source of truth for validation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Evgenii Khramkov <evgenii@spice.ai> Co-authored-by: Phillip LeBlanc <phillip@spice.ai> Co-authored-by: ewgenius <hey@ewgenius.me>
1 parent 41ff853 commit f670edf

6 files changed

Lines changed: 1653 additions & 5 deletions

File tree

0 commit comments

Comments
 (0)