
auth-js: _initialize deadlocks when onAuthStateChange registers during init while session is within 90s of expiry #2344

@tjstarfighter-cmd


Bug report

Filing here since supabase/auth-js is archived. Affects the auth-js package in packages/core/auth-js/.

Summary

When an app registers onAuthStateChange synchronously during _initialize's lock callback and the persisted session is within the 90s expiry threshold, the auth client deadlocks. _initialize never resolves, the navigator Web Lock named lock:sb-<ref>-auth-token is held forever, and every subsequent getSession() / getUser() / from() call queues behind it.

Affected version

Reproduced on @supabase/supabase-js@2.105.4 (which bundles @supabase/auth-js) and @supabase/ssr@0.10.3.

Reproduction

useEffect(() => {
  const supabase = createBrowserClient(URL, ANON);
  // Both calls fire synchronously in the same tick:
  supabase.auth.getUser().then(...);
  supabase.auth.onAuthStateChange(cb);   // ← registers DURING init
}, []);

Conditions for the hang:

  1. A persisted session in storage whose expires_at is within 90 seconds of now (e.g. on a reload near the token-refresh boundary).
  2. The app registers onAuthStateChange in the same useEffect tick in which init kicks off.

Symptoms (introspected via the singleton client exposed for diagnostics):

Property                                                Value
auth.lockAcquired                                       true (held forever)
auth.pendingInLock.length                               1+ (never drains)
auth._initialized                                       false
auth.currentSession                                     undefined
await navigator.locks.query()                           one held lock named lock:sb-<ref>-auth-token
auth._refreshAccessToken(refreshToken) called directly  resolves in <1s with a fresh session
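For anyone who wants to check the Web Lock row on their own app, here is a minimal sketch of a filter over the result of navigator.locks.query(). The helper name summarizeAuthLocks and the two *Like interfaces are made up for this example. The name pattern is taken from the lock name reported above, and I am assuming the query result has the usual held / pending arrays.

```typescript
// Structural subset of what navigator.locks.query() resolves to.
interface LockInfoLike { name?: string; mode?: string; clientId?: string }
interface LockSnapshotLike { held?: LockInfoLike[]; pending?: LockInfoLike[] }

// Hypothetical helper (not part of any SDK): pick out locks whose name
// looks like lock:sb-<ref>-auth-token and report held names + pending count.
function summarizeAuthLocks(snapshot: LockSnapshotLike, prefix = "lock:sb-") {
  const isAuthLock = (l: LockInfoLike): boolean => {
    const name = l.name ?? "";
    return name.startsWith(prefix) && name.endsWith("-auth-token");
  };
  return {
    held: (snapshot.held ?? []).filter(isAuthLock).map((l) => l.name ?? ""),
    pendingCount: (snapshot.pending ?? []).filter(isAuthLock).length,
  };
}

// In a browser console (not executed here):
//   summarizeAuthLocks(await navigator.locks.query());
```

A held entry that is still there long after page load, together with auth._initialized staying false, matches the symptoms in the table.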

Mechanism

  1. _initialize() acquires the Web Lock and runs _recoverAndRefresh().
  2. Session is < 90s from expiry, so _recoverAndRefresh() calls _callRefreshToken(). This sets refreshingDeferred and starts the refresh.
  3. Concurrently, the app's useEffect calls supabase.auth.onAuthStateChange(cb). The listener registration calls _emitInitialSession(id) for the new listener.
  4. _emitInitialSession → _useSession → __loadSession. __loadSession itself contains a refresh path:
    let s = !!t.expires_at && t.expires_at*1000 - Date.now() < 9e4;
    if (s) await this._callRefreshToken(t.refresh_token);
    The SDK source already warns about this at the top of __loadSession:
    this.lockAcquired || this._debug('#__loadSession()', 'used outside of an acquired lock!', Error().stack);
  5. _callRefreshToken is idempotent via refreshingDeferred, so the second call awaits the same in-flight promise. The refresh itself completes — the persisted cookie ends up with fresh tokens.
  6. However, somewhere in the resulting Promise composition (the _callRefreshToken .then chain inside _emitInitialSession's _useSession), an entry is left in pendingInLock that never resolves.
  7. _acquireLock's drain loop blocks forever:
    while (this.pendingInLock.length) {
      let arr = [...this.pendingInLock];
      await Promise.all(arr);
      this.pendingInLock.splice(0, arr.length);
    }
    Drain never completes → Web Lock never released → _initialized never set → all later operations stuck.
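To make step 7 concrete, here is a toy model of the drain loop, separate from the SDK. It only shows that a single never-settling promise in the queue keeps the loop from ever returning. The function names drain and probeDrain are invented for this sketch, and the timeout exists only so the example can report "stuck" instead of hanging.

```typescript
// Toy copy of the drain loop quoted in step 7; not the SDK's actual code.
async function drain(pendingInLock: Promise<unknown>[]): Promise<void> {
  while (pendingInLock.length) {
    const batch = [...pendingInLock];
    await Promise.all(batch); // never returns if any entry never settles
    pendingInLock.splice(0, batch.length);
  }
}

// Hypothetical probe: "drained" if the loop finishes, "stuck" after timeoutMs.
async function probeDrain(
  pending: Promise<unknown>[],
  timeoutMs: number,
): Promise<"drained" | "stuck"> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<"stuck">((resolve) => {
    timer = setTimeout(() => resolve("stuck"), timeoutMs);
  });
  const result = await Promise.race([
    drain(pending).then(() => "drained" as const),
    timeout,
  ]);
  clearTimeout(timer);
  return result;
}

// probeDrain([Promise.resolve(1)], 50)          resolves to "drained"
// probeDrain([new Promise(() => {})], 50)       resolves to "stuck"
```

In the real client there is no such timeout around the loop, which is why the Web Lock is never released.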

What works vs what hangs

  • auth.storage.getItem(storageKey) directly: returns in 1ms with the parsed session string.
  • auth._refreshAccessToken(refreshToken) directly: resolves in ~850ms with a fresh session.
  • auth.getSession() / auth.getUser() / any from(...).select(...): hang indefinitely.
  • Direct fetch() to /rest/v1/<table> with the cookie's bearer token: returns in <1s. Only the SDK wrapper is stuck.

Why first-load post-deploy "works" but reload hangs

  • First load after fresh sign-in: session is far from the 90s threshold; _recoverAndRefresh doesn't call _callRefreshToken; no race.
  • Reload near expiry: refresh path fires; race triggers.
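The difference between the two cases comes down to the threshold check quoted in the Mechanism section. Below is a readable restatement of that check, as a sketch: the function name needsInlineRefresh and the constant name are mine, and the fixed clock value is arbitrary.

```typescript
// 90 s margin; appears as 9e4 (ms) in the bundled source quoted in this report.
const EXPIRY_MARGIN_MS = 90_000;

// expiresAt is in seconds since the epoch, as in the quoted check.
function needsInlineRefresh(
  expiresAt: number | undefined,
  nowMs: number = Date.now(),
): boolean {
  return !!expiresAt && expiresAt * 1000 - nowMs < EXPIRY_MARGIN_MS;
}

const fixedNowMs = 1_700_000_000_000; // arbitrary fixed clock for the example
needsInlineRefresh(fixedNowMs / 1000 + 3600, fixedNowMs); // false: an hour left, no refresh, no race
needsInlineRefresh(fixedNowMs / 1000 + 85, fixedNowMs);   // true: the now+85s case, refresh path fires
needsInlineRefresh(undefined, fixedNowMs);                // false: no expires_at
```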

This pattern likely explains other intermittent / "sometimes hangs on reload" reports.

App-side workaround

In all hooks that call onAuthStateChange, defer the registration until getUser() (or getSession()) resolves:

useEffect(() => {
  const supabase = createBrowserClient(URL, ANON);
  let mounted = true;
  let unsubscribe: (() => void) | undefined;

  const setup = async () => {
    try {
      const { data: { user } } = await supabase.auth.getUser();
      if (!mounted) return;
      // ... use user
    } catch { /* ignore */ }

    if (!mounted) return;
    const { data: { subscription } } = supabase.auth.onAuthStateChange(cb);
    unsubscribe = () => subscription.unsubscribe();
  };

  setup();
  return () => { mounted = false; unsubscribe?.(); };
}, []);

getUser() awaits initializePromise cleanly via _acquireLock's blessed path. By the time the listener subscribes, _initialize has fully resolved and the lock has been released, so _emitInitialSession runs in the post-init state where it can't race.

This fully eliminated the hang for us in production. Verified by manually setting cookie expires_at to now+85s and reloading: pre-fix → deadlock every time; post-fix → refresh fires inside the lock, completes, drains, page renders normally.
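Since the same ordering has to be applied in every hook, it may be convenient to factor it out. The following is a sketch of one way to do that, not something we ship: subscribeAfterInit and the AuthLike interface are hypothetical, and AuthLike is only a structural subset of the calls used in the workaround above. I have not checked it against the SDK's full type definitions.

```typescript
// Structural subset of the auth client used by the workaround (assumption).
interface AuthLike<U> {
  getUser(): Promise<{ data: { user: U | null } }>;
  onAuthStateChange(cb: (event: string, session: unknown) => void): {
    data: { subscription: { unsubscribe(): void } };
  };
}

// Hypothetical helper: wait for getUser(), then register the listener.
// Returns a cleanup function suitable as a useEffect return value.
function subscribeAfterInit<U>(
  auth: AuthLike<U>,
  onUser: (user: U | null) => void,
  cb: (event: string, session: unknown) => void,
): () => void {
  let active = true;
  let unsubscribe: (() => void) | undefined;

  void (async () => {
    try {
      const { data: { user } } = await auth.getUser();
      if (active) onUser(user);
    } catch {
      // Mirror the workaround above: ignore getUser() failures.
    }
    if (!active) return; // cleaned up before init finished; never subscribe
    const { data: { subscription } } = auth.onAuthStateChange(cb);
    unsubscribe = () => subscription.unsubscribe();
  })();

  return () => {
    active = false;
    unsubscribe?.();
  };
}
```

Inside a hook this would be used as useEffect(() => subscribeAfterInit(supabase.auth, setUser, cb), []), where setUser is whatever the hook does with the user.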

Suggested upstream fix (any of)

  1. Make __loadSession skip the in-line refresh when called outside the lock. When lockAcquired is false, return the (possibly-soon-to-be-stale) session from storage and let the lock holder drive the refresh. The pre-existing "used outside of an acquired lock!" warning suggests this path was already known to be dangerous.
  2. Defer _emitInitialSession until after initializePromise resolves. A natural await this.initializePromise at the top of _emitInitialSession would prevent it from running concurrently with the lock callback.
  3. Add an upper time bound to _acquireLock's drain loop so a buggy queue entry doesn't permanently brick the client (defense in depth).
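To illustrate option 3, here is a rough sketch of a drain loop with an upper time bound. It is a standalone toy, not a patch against the SDK: drainWithDeadline, maxWaitMs, and the boolean return value are assumptions made for the example, and what the real client should do on a timeout (log, release the lock, reject callers) is left open.

```typescript
// Hypothetical bounded variant of the drain loop (illustration only).
// Resolves true if the queue drained, false if maxWaitMs elapsed first.
async function drainWithDeadline(
  pendingInLock: Promise<unknown>[],
  maxWaitMs: number,
): Promise<boolean> {
  const deadline = Date.now() + maxWaitMs;
  while (pendingInLock.length) {
    const remaining = deadline - Date.now();
    if (remaining <= 0) return false;

    const batch = [...pendingInLock];
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timedOut = new Promise<false>((resolve) => {
      timer = setTimeout(() => resolve(false), remaining);
    });
    try {
      const settled = await Promise.race([
        Promise.all(batch).then(() => true as const),
        timedOut,
      ]);
      if (!settled) return false; // give up so the caller can release the lock
    } finally {
      clearTimeout(timer); // also runs if a queued promise rejects
    }
    pendingInLock.splice(0, batch.length);
  }
  return true;
}
```

With a queue entry that never settles, this returns false after roughly maxWaitMs instead of holding the lock forever.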

Environment

  • @supabase/supabase-js@2.105.4, @supabase/ssr@0.10.3
  • Browser: Chromium (verified in Playwright Chromium and real Chrome)
  • App stack: Next.js 16 App Router, React 19
  • Reproduces against a vanilla createBrowserClient setup with a session that happens to fall within 90s of expiry on reload

Diagnostic harness used

A ?<flag>=1-gated probe in our lib/supabase/client.ts exposed window.__supabase, logged getSession() resolution timing, and ticked _initialized / currentSession state every second. Combined with navigator.locks.query() and reading the bundled source via Function.prototype.toString (then deminifying it), this localized the bug to the _acquireLock drain step. Happy to share the harness if useful.


Labels: auth-js, bug, temp - locks
