[codex] Tighten auth flows and unify live canary coverage by ilblackdragon · Pull Request #2367 · nearai/ironclaw

ilblackdragon · 2026-04-12T11:12:04Z

Summary

This branch tightens extension and tool auth handling and unifies the auth live-canary system so future provider coverage follows one path.

It includes:

consolidating extension/tool auth ownership under src/auth/
fixing remaining auth URL and redirect edge cases
partitioning MCP auth/session/client state by user to fix same-server multi-user isolation
adding backend and browser coverage for two users interacting with the same MCP server
adding seeded-token and browser-consent auth canaries for Google, GitHub, and Notion
adding GitHub OAuth browser flow support for the GitHub tool with PAT fallback
unifying the auth canary runners behind shared scripts/live_canary/ framework modules and one canonical account/setup guide

Why

We had auth regressions and open MCP multi-user isolation issues, and the live-canary work had started to split across multiple runner shapes. This branch fixes the highest-risk auth isolation bug, expands end-to-end auth coverage, and gives future auth/provider canaries a single setup and extension path.

Validation

Ran targeted checks during the branch work, including:

targeted cargo test coverage for auth URL sanitization, redirect handling, MCP session partitioning, factory partitioning, runtime-user MCP wrapper execution, and MCP multi-tenant integration
targeted pytest coverage for auth/browser scenarios in tests/e2e/scenarios/test_extensions.py and tests/e2e/scenarios/test_v2_auth_oauth_matrix.py
python3 -m py_compile for the shared live-canary modules and auth runners
bash -n scripts/live-canary/run.sh scripts/live-canary/scrub-artifacts.sh
wrapper discovery checks via scripts/live-canary/run.sh --list-tests and --list-cases

Impact

auth behavior is safer for multi-user MCP deployments
live-provider auth coverage now has deterministic, seeded, and browser-consent lanes under one wrapper
future auth canary additions should go through scripts/live_canary/auth_registry.py and scripts/live-canary/ACCOUNTS.md

gemini-code-assist

Code Review

This pull request introduces a comprehensive live canary regression system for auth flows, including new browser-based and seeded-token runners. It also refactors the authentication manager and MCP client architecture to support multi-user isolation, ensuring that MCP sessions and tokens are correctly partitioned by user. My review highlights a potential issue with the McpClient::for_user implementation regarding shared state for Stdio transports, which could lead to request ID collisions and redundant handshakes.

src/tools/mcp/client.rs


serrrfirat · 2026-04-13T05:20:52Z

scripts/live-canary/scrub-artifacts.sh

+    *.png|*.jpg|*.jpeg|*.gif|*.webp|*.sqlite|*.db|*.wasm|*.zip) continue ;;
+  esac
+  for pattern in "${patterns[@]}"; do
+    if grep -nIEi "${pattern}" "${file}" >> "${matches_file}" 2>/dev/null; then


Critical Severity

This scrubber persists raw secret matches inside the same artifact directory it is supposed to sanitize. grep appends matching lines to artifacts/live-canary/scrub-matches.txt, and the workflow later uploads artifacts/live-canary/; in non-strict lanes it even continues after matches are found. That means a leaked bearer token/PAT in any log gets copied into a new artifact file and uploaded. The console redaction also only handles some prefix forms, so raw ghp_, github_pat_, ya29, etc. matches can still be printed unredacted.

Please write matches to a temp file outside the upload tree, ensure every persisted/printed match is redacted before it exists under the artifact directory, and make secret matches fail or remove the affected upload payload.

serrrfirat · 2026-04-13T05:20:52Z

tools-src/github/github-tool.capabilities.json

+    "provider": "github",
+    "oauth": {
+      "authorization_url": "https://github.com/login/oauth/authorize",
+      "token_url": "https://github.com/login/oauth/access_token",


High Severity

The new GitHub OAuth descriptor points at https://github.com/login/oauth/access_token, but the shared OAuth exchanger parses the token response as JSON and does not set Accept: application/json. GitHub returns a form-encoded access-token response by default and only returns JSON when that Accept header is present, so this browser OAuth path will fail after the user consents.

Please add provider token request headers or a urlencoded fallback for this descriptor, and cover it with a mock GitHub token endpoint regression test.
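A minimal sketch of the urlencoded fallback mentioned above, assuming the exchanger has the raw response body as a string. `parse_form_token_response` is a hypothetical helper, not a function from this branch, and it skips percent-decoding, so it only suits values that arrive unescaped.

```rust
use std::collections::HashMap;

/// Parse a form-encoded token response such as
/// `access_token=abc&scope=repo&token_type=bearer`.
/// Hypothetical helper: values are NOT percent-decoded here.
fn parse_form_token_response(body: &str) -> Result<String, String> {
    // Split "k=v&k=v" into a map, ignoring fragments without '='.
    let fields: HashMap<&str, &str> = body
        .trim()
        .split('&')
        .filter_map(|pair| pair.split_once('='))
        .collect();

    // An `error` field means the exchange failed even on HTTP 200.
    if let Some(err) = fields.get("error") {
        return Err(format!("token endpoint returned error: {err}"));
    }

    fields
        .get("access_token")
        .filter(|token| !token.is_empty())
        .map(|token| token.to_string())
        .ok_or_else(|| "response has no access_token field".to_string())
}
```

Sending `Accept: application/json` on the token request is likely the simpler fix; a fallback like this would only cover providers that ignore the header.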

serrrfirat · 2026-04-13T06:14:43Z

src/extensions/manager.rs

        // the same credential should reuse a single pending entry rather than
        // accumulate stale flows. This logic used to live in
-        // bridge::auth_manager and was lost when the call moved here; without
+        // crate::auth::extension and was lost when the call moved here; without


High Severity

This pending-flow path was made user-aware here, but the surrounding pending-auth lifecycle is still extension-scoped: pending_auth is keyed only by extension name, and clear_pending_extension_auth(name) / remove() retain gateway flows only by flow.extension_name != name. That breaks multi-user auth isolation: if user A has a pending OAuth flow for github/notion and user B starts auth for the same extension, B's auth call clears A's pending callback state, so A's eventual callback can no longer complete.

Please key pending auth cleanup by (user_id, extension) or thread user_id into clear_pending_extension_auth, and add a two-user same-extension pending OAuth regression test.
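The keying change could look roughly like the sketch below. `PendingAuth` and `PendingFlow` are hypothetical stand-ins for the real pending-auth state in `src/extensions/manager.rs`; the only point is that clearing one user's flow leaves another user's flow for the same extension intact.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the pending OAuth callback state.
#[derive(Debug, Clone, PartialEq)]
struct PendingFlow {
    state_param: String,
}

/// Pending auth flows keyed by (user_id, extension_name) instead of
/// extension name alone.
#[derive(Default)]
struct PendingAuth {
    flows: HashMap<(String, String), PendingFlow>,
}

impl PendingAuth {
    /// Start (or replace) the flow for this user and extension only.
    fn start(&mut self, user_id: &str, extension: &str, state_param: &str) {
        self.flows.insert(
            (user_id.to_owned(), extension.to_owned()),
            PendingFlow { state_param: state_param.to_owned() },
        );
    }

    /// Clear only the caller's pending flow; other users are untouched.
    fn clear(&mut self, user_id: &str, extension: &str) -> Option<PendingFlow> {
        self.flows.remove(&(user_id.to_owned(), extension.to_owned()))
    }

    /// Look up the pending flow for one user and extension.
    fn get(&self, user_id: &str, extension: &str) -> Option<&PendingFlow> {
        self.flows.get(&(user_id.to_owned(), extension.to_owned()))
    }
}
```

A two-user regression test would start flows for both users on the same extension, clear one, and assert that the other's callback state still resolves.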

serrrfirat

Code review covering multi-user MCP isolation, credential handling in live canary infrastructure, and auth URL sanitization.

serrrfirat · 2026-04-13T10:07:13Z

src/tools/mcp/client.rs

        let url: String = server_url.into();
        let name = extract_server_name(&url);
        let transport = Arc::new(HttpMcpTransport::new(url.clone(), name.clone()));
+        let runtime_state = Self::new_runtime_state();


Critical Severity — Multi-user MCP isolation fragility for stdio transports

When for_user() is called on stdio transports, it shares initialized (OnceCell), tools_cache, and next_id across all users (lines 316-321). The first user's initialization wins and InitializeResult (server capabilities) is frozen from that user's perspective. If a future MCP server returns user-scoped capabilities, they leak across users.

The test test_stdio_for_user_shares_initialize_cache_and_request_ids validates this as intentional behavior, which is fine for current MCP servers. However, this shared-state assumption should be documented as a contract on McpClient itself (not just in the Clone impl doc), so future maintainers know adding user-scoped capabilities to stdio transports would require breaking this sharing.

Suggested fix: Add a doc comment on for_user() explicitly stating that stdio transports share initialization/tool-cache across all users by design, and that changing this would require per-user OnceCells.
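As an illustration of what such a contract comment might say, here is a simplified sketch. `StdioClient` and `SharedStdioState` are hypothetical types, not the real `McpClient`; only request-id sharing is modeled, and the real client also shares initialization and the tool cache.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// State that every per-user view of one stdio server shares.
struct SharedStdioState {
    next_id: AtomicU64,
}

/// Hypothetical, simplified stand-in for a stdio-backed MCP client.
struct StdioClient {
    user_id: String,
    shared: Arc<SharedStdioState>,
}

impl StdioClient {
    fn new(user_id: &str) -> Self {
        Self {
            user_id: user_id.to_owned(),
            shared: Arc::new(SharedStdioState { next_id: AtomicU64::new(1) }),
        }
    }

    /// Returns a view of this client bound to `user_id`.
    ///
    /// Contract: for stdio transports the returned view SHARES runtime
    /// state with every other user view of the same server process.
    /// Making any of that state user-scoped would require per-user
    /// state instead of cloning this `Arc`.
    fn for_user(&self, user_id: &str) -> Self {
        Self { user_id: user_id.to_owned(), shared: Arc::clone(&self.shared) }
    }

    /// Allocate the next request id from the shared counter.
    fn next_request_id(&self) -> u64 {
        self.shared.next_id.fetch_add(1, Ordering::SeqCst)
    }

    fn user_id(&self) -> &str {
        &self.user_id
    }
}
```

Because both views draw from one counter, request ids never collide on the shared pipe, which is the behavior the existing test pins down.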

serrrfirat · 2026-04-13T10:07:14Z

scripts/live-canary/scrub-artifacts.sh

+    -e 's/(secret[[:space:]]*[:=][[:space:]]*)[^[:space:]]+/\1<REDACTED>/Ig' \
+    "${matches_file}" | head -200
+  if [[ "${STRICT_ARTIFACT_SCRUB}" == "true" || "${STRICT_ARTIFACT_SCRUB}" == "1" ]]; then
+    exit 1


High Severity — STRICT_ARTIFACT_SCRUB not set for lanes handling real credentials

STRICT_ARTIFACT_SCRUB is only set to true for the private-oauth lane (visible in .github/workflows/live-canary.yml). However, auth-live-seeded and auth-browser-consent lanes handle real provider tokens (Google OAuth, GitHub PAT) but upload artifacts with only the best-effort regex scrub here. If the regex misses a token format, it flows into uploaded CI artifacts.

Line 47: if [[ "${STRICT_ARTIFACT_SCRUB}" == "true" ...]] — the fallthrough prints a warning and continues.

Suggested fix: Set STRICT_ARTIFACT_SCRUB=true for all lanes that handle real provider credentials (auth-live-seeded, auth-browser-consent), not just private-oauth. The regex scrub is best-effort and should not be the only gate for lanes with real tokens.

serrrfirat · 2026-04-13T10:07:14Z

scripts/live_canary/common.py

+DEFAULT_VENV = E2E_DIR / ".venv"
+DEFAULT_SECRETS_MASTER_KEY = (
+    "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
+)


High Severity — Hardcoded predictable master key in committed source

DEFAULT_SECRETS_MASTER_KEY = ("0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef")

This key is committed to the repo. Canary databases created with it (which include real provider tokens for auth-live-seeded) can be trivially decrypted by anyone reading this source. While the canary databases are ephemeral, there is a window during which real OAuth tokens are encrypted with a publicly-known key.

Suggested fix: Use os.urandom(32).hex() per canary run to generate a unique master key. This ensures that even if a canary database leaks (e.g., via CI artifacts), the tokens inside cannot be decrypted. Verify cleanup covers all failure modes so the DB+key are never persisted together.

serrrfirat · 2026-04-13T10:07:14Z

src/auth/oauth.rs

+            return None;
+        }
+        url::Url::parse(u)
+            .ok()


High Severity — Auth URL sanitization returns original string, not parsed URL

sanitize_auth_url parses with url::Url::parse() but returns the original string u.to_owned(). Percent-encoded CRLF sequences (%0d%0a) slip past the char::is_control check, which only catches literal control characters in the input string, and flow through to the caller. If the URL is later used in an HTTP context (e.g., a Location header), the server or browser may decode the percent-encoding, enabling header injection.

The existing test rejects_invalid_or_control_character_urls only tests literal \n and \r, not their percent-encoded forms.

Suggested fix: Return parsed.to_string() instead of u.to_owned() to get the normalized, canonicalized URL. Also add test cases for %0d%0a sequences.
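One way to cover the percent-encoded case explicitly is an extra check like the sketch below. `has_control_bytes` is a hypothetical helper using only the standard library; it does not handle double-encoded input such as `%250d`, and it is a complement to, not a replacement for, the `url::Url::parse` step.

```rust
/// Returns true if `url` contains a literal control character or a
/// percent-encoded control byte (%00..%1F or %7F), e.g. `%0d%0a`.
/// Hypothetical helper; double-encoded sequences are not detected.
fn has_control_bytes(url: &str) -> bool {
    // Literal control characters (what the existing check covers).
    if url.chars().any(char::is_control) {
        return true;
    }
    // Percent-encoded control bytes: '%' followed by two hex digits.
    url.as_bytes().windows(3).any(|w| {
        if w[0] != b'%' {
            return false;
        }
        match ((w[1] as char).to_digit(16), (w[2] as char).to_digit(16)) {
            (Some(hi), Some(lo)) => {
                let byte = hi * 16 + lo;
                byte < 0x20 || byte == 0x7f
            }
            _ => false,
        }
    })
}
```

A sanitizer could return `None` whenever this check fires, and the regression test could include `%0d%0a` and `%0A` alongside the literal `\n` and `\r` cases.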

serrrfirat · 2026-04-13T10:07:14Z

src/tools/mcp/session.rs

    }

    /// Get or create a session for a server.
-    pub async fn get_or_create(&self, server_name: &str, server_url: &str) -> McpSession {


Medium Severity — Unbounded session growth

The session map is now keyed by (user_id, server_name), which increases the cardinality compared to the previous server-only key. cleanup_stale() exists, but there is no evidence it is called periodically (no timer, no background task scheduling it). In a multi-user deployment, the session map grows without bound until the process restarts.

Suggested fix: Either call cleanup_stale() on a timer (e.g., every 5 minutes via a tokio interval task) or add a max capacity with LRU eviction. The unbounded growth is more impactful now that the key space is O(users * servers) rather than O(servers).
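A rough sketch of the eviction half of that suggestion, assuming sessions track a last-used timestamp. `SessionMap` and its fields are hypothetical and the real `cleanup_stale()` signature may differ; `now` is passed in so the logic can be tested without sleeping, and the periodic caller (for example a tokio interval task) is left out.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical session map keyed by (user_id, server_name).
#[derive(Default)]
struct SessionMap {
    last_used: HashMap<(String, String), Instant>,
}

impl SessionMap {
    /// Record activity for one user's session on one server.
    fn touch(&mut self, user_id: &str, server: &str, now: Instant) {
        self.last_used
            .insert((user_id.to_owned(), server.to_owned()), now);
    }

    /// Drop sessions idle for longer than `max_idle`.
    /// Returns how many entries were removed.
    fn cleanup_stale(&mut self, max_idle: Duration, now: Instant) -> usize {
        let before = self.last_used.len();
        self.last_used
            .retain(|_, used| now.saturating_duration_since(*used) <= max_idle);
        before - self.last_used.len()
    }
}
```

Calling `cleanup_stale` from a background task every few minutes bounds the map by the number of recently active (user, server) pairs rather than by everything seen since startup.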

serrrfirat · 2026-04-13T10:07:14Z

src/tools/mcp/client.rs

@@ -63,10 +63,10 @@ pub struct McpClient {
    server_name: String,



Medium Severity — No user_id validation in for_user()

for_user() accepts any impl Into<String>. An empty string or a string containing path separators could cause issues in session keys, secret paths, or log parsing. Given that the user_id flows into McpSessionKey, secret lookups, and header values, basic validation would prevent subtle bugs.

Suggested fix: Validate that user_id is non-empty and does not contain path separators (/, \) or control characters. Return a Result or panic-in-debug to surface misuse early.
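The validation described above might be sketched like this. `validate_user_id` is a hypothetical function and the exact rule set is an assumption taken from the comment (non-empty, no `/` or `\`, no control characters).

```rust
/// Hypothetical validation for a user id before it is used in session
/// keys, secret paths, or header values.
fn validate_user_id(user_id: &str) -> Result<(), &'static str> {
    if user_id.is_empty() {
        return Err("user_id must not be empty");
    }
    if user_id.contains(|c: char| c == '/' || c == '\\') {
        return Err("user_id must not contain path separators");
    }
    if user_id.chars().any(char::is_control) {
        return Err("user_id must not contain control characters");
    }
    Ok(())
}
```

`for_user()` could call this and return a `Result`, or wrap it in a `debug_assert!` if changing the signature is too invasive.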

serrrfirat · 2026-04-13T10:07:14Z

src/tools/mcp/client.rs


        Ok(Self {
            transport,
            server_url: config.url.clone(),


Medium Severity — New HTTP transport created per tool call

In McpToolWrapper::execute(), self.client.for_user(&ctx.user_id) is called on every tool execution. For HTTP transports, for_user() creates a new HttpMcpTransport with a fresh OnceCell, meaning the MCP initialize handshake is attempted on every single tool call. This adds latency and unnecessary server load.

For stdio transports this is fine (shared runtime state), but HTTP callers pay a per-call initialization tax.

Suggested fix: Cache the per-user client in McpToolWrapper or ExtensionManager so that repeated calls from the same user reuse the same transport and initialization state.
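A minimal sketch of such a cache, assuming the per-user client can be held behind an `Arc`. `ClientCache` and `UserClient` are hypothetical types; the `constructed` counter exists only to show that repeated calls for the same user do not build a new client (and so would not repeat the handshake).

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for a per-user HTTP MCP client whose
/// construction implies a fresh initialize handshake.
struct UserClient {
    user_id: String,
}

/// Cache of per-user clients so repeated tool calls reuse one client.
#[derive(Default)]
struct ClientCache {
    clients: HashMap<String, Arc<UserClient>>,
    /// Number of clients actually constructed (for illustration).
    constructed: usize,
}

impl ClientCache {
    /// Return the cached client for `user_id`, creating it on first use.
    fn for_user(&mut self, user_id: &str) -> Arc<UserClient> {
        if let Some(existing) = self.clients.get(user_id) {
            return Arc::clone(existing);
        }
        self.constructed += 1;
        let client = Arc::new(UserClient { user_id: user_id.to_owned() });
        self.clients.insert(user_id.to_owned(), Arc::clone(&client));
        client
    }
}
```

In the real code the map would need a lock or a concurrent map, plus eviction, since tool calls run concurrently and the user set is unbounded.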

serrrfirat and others added 5 commits April 12, 2026 07:48

ci: add live canary regression lanes

8b37eeb

test: tighten live zizmor canary prompt

d6f5ec5

feat(auth): harden extension auth and unify canary lanes

78750c1

merge: integrate upstream live canary lanes

d25a4b3

refactor(canary): unify auth live canary framework

87b6e50

gemini-code-assist bot reviewed Apr 12, 2026


ilblackdragon added 3 commits April 12, 2026 12:04

fix(mcp): share stdio runtime state across user views

7b96690

Merge remote-tracking branch 'origin/staging' into codex/auth-oauth-canary-unification

442877a

fix(ci): mark root crate unpublished

793d021

github-actions bot added the scope: dependencies label Apr 12, 2026

merge: sync origin/staging

19da65c

serrrfirat mentioned this pull request Apr 12, 2026

[codex] Add full auth ops canary coverage #2376

Closed

serrrfirat reviewed Apr 13, 2026

ilblackdragon wants to merge 9 commits into staging from codex/auth-oauth-canary-unification


2 participants
