Fix SSR Suspense livelock when multiple boundaries share Resources via .get() #4579

Open
alilee wants to merge 1 commit into leptos-rs:main from alilee:fix/ssr-suspense-livelock-main

alilee commented Feb 16, 2026

Summary

One-line fix for an SSR livelock: when two or more Suspense/Transition boundaries share two or more Resources via .get(), the Suspense Effect re-runs dry_resolve() endlessly, consuming 100% CPU (~297k closure invocations per boundary per 5s), and SSR never completes.

Manifests on AWS Lambda (128MB, 2 cores → worker_threads=2) as intermittent 500 errors.

Root Cause

The Effect's isomorphic closure returns a bool that becomes double_checking: Option<bool>. After dry_resolve() returns true (setting double_checking = Some(true)), the Effect is woken by individual monitor completions. If tasks aren't empty yet, the catch-all false at the end resets double_checking to Some(false). When tasks finally drain, the Effect sees double_checking == Some(false) and calls dry_resolve() again — infinite cycle.

dry_resolve → true → double_checking = Some(true)
  monitor1 completes → Effect wakes → tasks NOT empty → returns false (RESETS!)
  monitor2 completes → Effect wakes → tasks empty + double_checking == Some(false) → dry_resolve again!

The Fix

leptos/src/suspense_component.rs, line 409:

-                false
+                double_checking.unwrap_or(false)

This preserves Some(true) through intermediate wakeups so the completion branch is reached:

dry_resolve → true → double_checking = Some(true)
  monitor1 completes → Effect wakes → tasks NOT empty → returns true (PRESERVED!)
  monitor2 completes → Effect wakes → tasks empty + double_checking == Some(true) → COMPLETION

Only the Some(true) → non-empty-tasks path changes behavior. All other paths return the same value as before.
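
The two traces above can be replayed with a small stand-alone model. This is only a sketch under stated assumptions: `Step`, `run_effect`, `trace`, and the `preserve` flag are hypothetical names introduced for illustration, not Leptos code, and the model assumes the closure has just the three branches described in this PR (dry-resolve, completion, and the catch-all return).

```rust
/// What one run of the modeled Effect closure did.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Step {
    /// Tasks were empty and no double check had been recorded: re-check children.
    DryResolve,
    /// Tasks were empty and the double check had been recorded: SSR can finish.
    Complete,
    /// Tasks were still pending: nothing to do on this wakeup.
    Waiting,
}

/// Hypothetical, simplified stand-in for the Suspense Effect closure.
/// `prev` plays the role of `double_checking`; the returned bool becomes
/// the next run's `Some(..)` value. `preserve` selects the catch-all:
/// `false` models the old `false`, `true` models `double_checking.unwrap_or(false)`.
fn run_effect(prev: Option<bool>, tasks_empty: bool, preserve: bool) -> (Step, bool) {
    let catch_all = if preserve { prev.unwrap_or(false) } else { false };
    if tasks_empty {
        if prev == Some(true) {
            return (Step::Complete, catch_all);
        }
        // Models `dry_resolve()` followed by `return true`.
        return (Step::DryResolve, true);
    }
    (Step::Waiting, catch_all)
}

/// Replays the wakeup sequence from the traces: the initial run sees empty
/// tasks, monitor1's wakeup sees pending tasks, monitor2's wakeup sees
/// empty tasks again.
fn trace(preserve: bool) -> Vec<Step> {
    let mut prev = None;
    let mut steps = Vec::new();
    for tasks_empty in [true, false, true] {
        let (step, next) = run_effect(prev, tasks_empty, preserve);
        steps.push(step);
        prev = Some(next);
    }
    steps
}
```

In this model, `trace(false)` ends on a second `Step::DryResolve` (the start of the cycle), while `trace(true)` ends on `Step::Complete`.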

Reproduction

Minimal repro at https://github.com/alilee/feedback/tree/main/bugs/ssr-race (Cargo.toml + src/main.rs).

cargo run --features ssr  # uses worker_threads = 2
curl -s -m 5 http://localhost:3000/broken | grep -c RESOLVED_RESOURCES  # 0 (livelock)
curl -s -m 5 http://localhost:3000/works  | grep -c RESOLVED_RESOURCES  # 1 (Suspend::new workaround)

Thread count dependency:

worker_threads   Result
1                Works
2-3              Livelocks
4+               Works

Testing

  • New regression test: leptos/tests/ssr_suspense_livelock.rs — 2 Suspense boundaries, 2 shared Resources, worker_threads=2, 2s timeout
  • All existing leptos crate tests pass (72 tests: 58 doc + 11 ssr + 2 pr_4061 + 1 new)
  • Test fails without the fix, passes with it
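
A livelock regression test needs a hard deadline, because the buggy code never returns on its own. The helper below is a minimal sketch of that pattern using only the standard library. It is not the contents of leptos/tests/ssr_suspense_livelock.rs; `run_with_timeout` is a hypothetical name, and the closure stands in for whatever drives the SSR render.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `work` on a helper thread and returns its result, or `None` if no
/// result arrived within `limit` (including the case where `work` panicked).
/// The helper thread is not cancelled on timeout; it is left to finish or
/// to be torn down when the test process exits.
fn run_with_timeout<T, F>(limit: Duration, work: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails only if the receiver was dropped after a timeout,
        // so the error can be ignored.
        let _ = tx.send(work());
    });
    rx.recv_timeout(limit).ok()
}
```

A test built on this would pass a closure that renders the page and assert that the result is `Some(..)` within the 2s budget; a livelocked render would show up as `None`.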

…a .get()

The Suspense Effect's state machine returned `false` as a catch-all default,
which reset `double_checking` from `Some(true)` back to `Some(false)` when the
Effect was woken by intermediate monitor completions that left tasks non-empty.
This caused `dry_resolve()` to be called again when tasks drained, spawning new
monitors in an infinite cycle (~297k closure invocations per boundary per 5s).

The fix: replace the catch-all `false` with `double_checking.unwrap_or(false)`,
preserving the `Some(true)` state through intermediate Effect wakeups so the
completion branch is reached correctly.

Manifests on AWS Lambda (128MB, 2 cores) as intermittent 500 errors.
Reproduces reliably with `worker_threads = 2` or `3`, works with 1 or 4+.

alilee commented Feb 16, 2026

I've tried to run the regression test suite, but I'm not familiar with the CI, so I'm not confident I've fully leveraged the available testing. Apologies in advance. Happy to do more work as you direct.

I think you might want to get ready for an avalanche of Claude Code-originated PRs and set ultra-convenient standards for yourself, and maybe even provide prompts or agent commands to do what you need so that you can sit on the top: surgeon to all us nurses. You should feel completely justified in asking for people to invest their tokens on tasks! I really hope the shift is net beneficial for you and leptos.


gbj commented Feb 16, 2026

I'm having trouble reproducing the bug, either against the main branch with the reproduction you provided in the issue or in the PR branch by reverting the change and running the test. The alilee/feedback repo linked in this PR appears to be private (or at least it 404s for me). Could you share that with me, and I'll see if I can reproduce it given that?


gbj commented Feb 16, 2026

Ah ok, interesting. I am able to reproduce it against the latest published version, but not against git main. Could you confirm whether the bug occurs against the current state of the main branch? I am about to publish a new set of patch releases in any case.
