Fix SSR Suspense livelock when multiple boundaries share Resources via .get() #4579
alilee wants to merge 1 commit into leptos-rs:main
…a .get()

The Suspense Effect's state machine returned `false` as a catch-all default, which reset `double_checking` from `Some(true)` back to `Some(false)` when the Effect was woken by intermediate monitor completions that left tasks non-empty. This caused `dry_resolve()` to be called again when tasks drained, spawning new monitors in an infinite cycle (~297k closure invocations per boundary per 5s).

The fix: replace the catch-all `false` with `double_checking.unwrap_or(false)`, preserving the `Some(true)` state through intermediate Effect wakeups so the completion branch is reached correctly.

Manifests on AWS Lambda (128MB, 2 cores) as intermittent 500 errors. Reproduces reliably with `worker_threads = 2` or `3`, works with 1 or 4+.
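The cycle described in the commit message can be modeled with a small std-only simulation. This is a simplified sketch, not the code in `leptos/src/suspense_component.rs`: the `Model` struct, the `Variant` enum, `run_effect`, `simulate`, and the assumption that each `dry_resolve()` spawns exactly two monitors are all hypothetical names and choices made for illustration.

```rust
/// Which catch-all return value the modeled Effect uses.
#[derive(Clone, Copy)]
enum Variant {
    Buggy, // catch-all `false`
    Fixed, // catch-all `double_checking.unwrap_or(false)`
}

/// Hypothetical, simplified stand-in for one Suspense boundary's state.
struct Model {
    pending_tasks: usize,
    dry_resolve_calls: usize,
    resolved: bool,
}

impl Model {
    /// One Effect run. The returned bool becomes the next `double_checking`.
    fn run_effect(&mut self, double_checking: Option<bool>, variant: Variant) -> bool {
        if self.pending_tasks == 0 {
            if double_checking == Some(true) {
                // Completion branch: tasks drained after a double check.
                self.resolved = true;
                return true;
            }
            // Stand-in for dry_resolve(): assumed to spawn two monitors.
            self.dry_resolve_calls += 1;
            self.pending_tasks = 2;
            return true;
        }
        // Intermediate wakeup: tasks are still pending.
        match variant {
            Variant::Buggy => false,
            Variant::Fixed => double_checking.unwrap_or(false),
        }
    }
}

/// Drives the model: after each Effect run, one monitor completes and
/// wakes the Effect again. Returns (resolved, number of dry_resolve calls).
fn simulate(variant: Variant, max_wakeups: usize) -> (bool, usize) {
    let mut m = Model { pending_tasks: 0, dry_resolve_calls: 0, resolved: false };
    let mut double_checking: Option<bool> = None;
    for _ in 0..max_wakeups {
        double_checking = Some(m.run_effect(double_checking, variant));
        if m.resolved {
            break;
        }
        m.pending_tasks -= 1; // one monitor finishes
    }
    (m.resolved, m.dry_resolve_calls)
}

fn main() {
    println!("fixed: {:?}", simulate(Variant::Fixed, 30));
    println!("buggy: {:?}", simulate(Variant::Buggy, 30));
}
```

Under these assumptions the fixed variant resolves after a single `dry_resolve()` call, while the buggy variant keeps calling it for as long as the loop is allowed to run.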
I've tried to run the regression test suite, but I'm not familiar with the CI, so I'm not confident I've fully leveraged the available testing. Apologies in advance. Happy to do more work as you direct. I think you might want to get ready for an avalanche of Claude Code-originated PRs and set ultra-convenient standards for yourself, and maybe even provide prompts or agent commands to do what you need so that you can sit at the top: surgeon to all us nurses. You should feel completely justified in asking people to invest their tokens on tasks! I really hope the shift is net beneficial for you and leptos.
I'm having trouble reproducing the bug, either against the main branch with the reproduction you provided in the issue or in the PR branch by reverting the change and running the test. The
Ah ok, interesting. I am able to reproduce it against the latest published version, but not against git main. Could you confirm whether the bug occurs against the current state of the main branch? I am about to publish a new set of patch releases in any case.
Summary

One-line fix for an SSR livelock where 2+ `Suspense`/`Transition` boundaries sharing 2+ Resources via `.get()` cause the Suspense Effect to endlessly re-run `dry_resolve()`, consuming 100% CPU (~297k closure invocations per boundary per 5s) and never completing SSR.

Manifests on AWS Lambda (128MB, 2 cores → `worker_threads=2`) as intermittent 500 errors.

Root Cause

The Effect's isomorphic closure returns a `bool` that becomes `double_checking: Option<bool>`. After `dry_resolve()` returns `true` (setting `double_checking = Some(true)`), the Effect is woken by individual monitor completions. If tasks aren't empty yet, the catch-all `false` at the end resets `double_checking` to `Some(false)`. When tasks finally drain, the Effect sees `double_checking == Some(false)` and calls `dry_resolve()` again, producing an infinite cycle.

The Fix

`leptos/src/suspense_component.rs`, line 409: the catch-all `false` is replaced with `double_checking.unwrap_or(false)`.

This preserves `Some(true)` through intermediate wakeups so the completion branch is reached.

Only the `Some(true)` → non-empty-tasks path changes behavior. All other paths return the same value as before.

Reproduction
Minimal repro at https://github.com/alilee/feedback/tree/main/bugs/ssr-race (Cargo.toml + src/main.rs).
Thread count dependency: reproduces reliably with `worker_threads` = 2 or 3; works with 1 or 4+.

Testing

- `leptos/tests/ssr_suspense_livelock.rs`: 2 Suspense boundaries, 2 shared Resources, `worker_threads=2`, 2s timeout
- `leptos` crate tests pass (72 tests: 58 doc + 11 ssr + 2 pr_4061 + 1 new)
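A livelock never returns, so a regression test for it needs a deadline that turns "never finishes" into a failure. The following is a std-only sketch of that timeout-guard idea. It is not the contents of `leptos/tests/ssr_suspense_livelock.rs`; `run_with_timeout` and the closures passed to it are hypothetical stand-ins for rendering a page to completion.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `work` on a separate thread and waits at most `limit` for its result.
/// Returns `None` if the work did not finish in time (for example, because
/// it is livelocked).
fn run_with_timeout<T, F>(limit: Duration, work: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore send errors.
        let _ = tx.send(work());
    });
    rx.recv_timeout(limit).ok()
}

fn main() {
    // Hypothetical stand-in for a render that completes normally.
    let finished = run_with_timeout(Duration::from_secs(2), || {
        String::from("<html>ok</html>")
    });
    assert!(finished.is_some(), "render did not complete within the timeout");

    // Hypothetical stand-in for a stuck render: it outlives the deadline.
    let stuck = run_with_timeout(Duration::from_millis(50), || {
        thread::sleep(Duration::from_millis(500));
    });
    assert!(stuck.is_none());
}
```

In an async test the same role would typically be played by the runtime's own timeout facility rather than a channel, with the worker-thread count pinned to the value that triggers the bug.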