fix(jsonata): prevent StackOverflowError on small JVM stacks via eval thread isolation + lower default maxDepth#80
Open
fdelbrayelle wants to merge 10 commits intomainfrom
Open
Conversation
Reproduces dashjoin/jsonata-java#107. Frame.lookup() recurses once per scope frame — on 256 KB (Windows) or 512 KB (Linux default) thread stacks, a 999/1999-deep recursive JSONata expression overflows before maxDepth fires. Tests use explicit Thread stack sizes so the scenario is reproducible on any platform. Passes once jsonata-java ships the iterative Frame.lookup() fix.
…ack helper Both Windows and Linux tests now use SMALL_STACK_BYTES = 256 KB via Thread(stackSize) so no -Xss JVM flag is needed in build config. Extract runOnSmallStack() helper to avoid duplication.
Contributor
Tests now assert isInstanceOf(StackOverflowError.class) — current behavior without the upstream fix. Rename constant to XSS_256K and add Javadoc linking it to -Xss256k so the stack constraint is self-documenting. Comment marks where assertions flip once dashjoin/jsonata-java#107 ships.
Thread(stackSize) is advisory — JVM ignores it on Linux. Add -Xss256k to test JVM args in plugin-transform-json/build.gradle via afterEvaluate so it appends after the root build's jvmArgs= Allure assignment. Simplify tests to call task.run() directly and assert StackOverflowError with assertThatThrownBy.
…eplace)
afterEvaluate in the subproject was silently wiped because the root Allure
subprojects{} block used jvmArgs=[...] (full assignment). Change root to
jvmArgs append; subproject test{jvmArgs '-Xss256k'} then safely extends it
without ordering ambiguity.
…tack -Xss256k is silently ignored by OpenJDK on Linux 64-bit (enforces a higher minimum). Instead use depth=4999/9999 which overflows the default 512 KB-1 MB thread stack without any JVM flag (~150 bytes/frame × 9999 ≈ 1.5 MB needed).
… in CI Linux default thread stack is ~8 MB (OS RLIMIT_STACK), far too large to reproduce the crash without -Xss. Pin test JVM to -Xss512k (above HotSpot minimum, safe for Kestra framework). Use depth=49999 so 49999 JVM frames × minimum 16 bytes/frame = 800 KB > 512k — overflow guaranteed regardless of JIT optimization level.
…t TCO jsonata-java TCO-optimises tail-recursive calls into a loop (O(1) stack), so $f($n-1) never caused a stack overflow. Adding + 0 after the recursive call puts it in non-tail position, forcing each frame to stay live and producing the expected StackOverflowError on -Xss512k.
Tests document the current bug state: StackOverflowError is expected. Will flip to assertThatNoException once dashjoin/jsonata-java#107 ships.
Each JSONata recursion level pushes ~8 JVM frames. On 256 KB worker stacks (~300 usable frames), the old default maxDepth=200 allowed 200 × 8 = 1600 frames before the bounds check fired — far past overflow. Two-layer fix: 1. Lower default maxDepth 200 → 50 (50 × 8 = 400 frames, safe on 256 KB). setRuntimeBounds fires at depth 50, throwing JException cleanly. 2. Run evaluate() on a dedicated thread with an explicit 4 MB stack. Worker thread stack size (e.g. 256 KB on Windows) can no longer constrain the evaluator. If a user sets a very high maxDepth and triggers StackOverflowError anyway, it is caught as Throwable inside the throwaway eval thread; the worker thread gets a clean RuntimeException instead of crashing. Update regression tests: - Parametrized test: maxDepth=50/200/500/1000 all produce JException, never StackOverflowError, even on -Xss512k JVM. - Isolation test: maxDepth=50000 with $f(49999) overflows the eval thread but worker receives RuntimeException(cause=StackOverflowError). Closes dashjoin/jsonata-java#107 dependency — fix is now self-contained in plugin-transform without requiring upstream changes.
| // 2. Edge case (user sets very high maxDepth): if a StackOverflowError occurs in the eval | ||
| // thread, it is contained there. The worker thread reads the stored error and throws a | ||
| // clean RuntimeException — the worker never crashes. | ||
| var thread = new Thread(null, () -> { |
Member
There was a problem hiding this comment.
so each record allocates a new Thread with a 4 MB stack, for batch workloads this could be a regression?
| try { | ||
| var result = this.parsedExpression.evaluate(data, frame); | ||
| resultRef.set(result != null ? MAPPER.valueToTree(result) : NullNode.getInstance()); | ||
| } catch (Throwable t) { |
Member
There was a problem hiding this comment.
Should we narrow it down?
Malaydewangan09
approved these changes
May 4, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Jsonata$Frame.lookup()indashjoin/jsonata-javawalks the parent scope chain via tail recursion. Java does not optimize tail calls, so every JSONata recursive function call pushes ~8 JVM frames onto the thread stack. With the old defaultmaxDepth=200, the evaluator could push 200 × 8 = 1,600 JVM frames before the bounds check fired — far past what a 256 KB worker stack (~300 usable frames) can hold.StackOverflowErroris a JVMError, not anException— it escapes all normaltry/catchblocks and crashes the worker thread.Why upstream fix wasn't enough: dashjoin/jsonata-java#107 (iterative
lookup()) was reviewed by the maintainer who confirmed the iterative change removes only D≈4 frames from the top of the stack. Since overflow is driven by N × frames-per-eval-level (not D), the upstream PR was not merged. The fix must live here.Fix
Two-layer defense:
1. Lower default
maxDepth200 → 5050 × ~8 JVM frames = ~400 frames — safe on a 256 KB stack (~300 usable frames).
setRuntimeBoundsnow fires and throwsJExceptionbefore any stack risk.Users with legitimate deep recursion can override
maxDepth, but should also increase the JVM stack size (-Xss).2. Dedicated eval thread with 4 MB stack
evaluate()runs onnew Thread(null, runnable, "jsonata-eval", 4 * 1024 * 1024). This means:StackOverflowErroroccurs in the eval thread, it is caught asThrowablein that throwaway thread. The worker thread reads the stored error and throws a cleanRuntimeException— the worker never crashes.This is safe because the SOE is contained inside a thread we immediately discard. No JVM state is shared between the eval thread's stack and the worker thread.
Regression tests
shouldNeverThrowStackOverflowForCommonMaxDepthValues(parametrized: 50, 200, 500, 1000)JException→RuntimeException, neverStackOverflowError, even on-Xss512kJVMshouldIsolateStackOverflowInEvalThreadWhenMaxDepthExceedsStackCapacity$f(49999)withmaxDepth=50000overflows eval thread → worker getsRuntimeException(cause=StackOverflowError)Test plan
RuntimeException(cause=JException)on-Xss512kJVMRuntimeException(cause=StackOverflowError), test JVM does not crash