Skip to content

Commit 807535d

Browse files
duncdrumclaude
andcommitted
[bugfix] Fix cancellation test: GC-free query and wider CI timeout
The previous query (for $i in 1 to 999999999 return string($i)) allocated 999999999 String objects, triggering stop-the-world GC pauses several seconds long on CI. During a GC pause the query thread cannot call proceed(), so the volatile terminate flag goes unobserved for the duration of the pause — causing the 5s cancel window to expire before the flag is seen, even though the volatile fix itself is correct. Two fixes: 1. Change query to 'for $i in 1 to 999999999 return ()': no per-iteration allocation, no GC pressure. The query thread stays schedulable and calls proceed() millions of times per second, so the volatile flag is observed within microseconds of kill(). 2. Widen cancelledLatch timeout from 5s to 10s and raise max-execution-time to 30s (genuine fallback only). The max-execution-time was previously set to 5000ms — equal to our latch timeout — so its error message was racing the latch expiry and arriving after the test had already given up. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 4f859bd commit 807535d

1 file changed

Lines changed: 13 additions & 7 deletions

File tree

exist-core/src/test/java/org/exist/http/ws/EvalWebSocketEndpointTest.java

Lines changed: 13 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -475,11 +475,16 @@ public void onMessage(final String message) {
475475
}, createAdminConfig(), getWsUri());
476476

477477
try {
478-
// Start a long-running query
478+
// GC-free query: return () produces no objects per iteration, so the JVM
479+
// stays out of stop-the-world GC and the query thread calls proceed() on
480+
// every iteration — letting the volatile terminate flag be observed within
481+
// microseconds of kill(). String-producing variants (string($i)) cause
482+
// heavy GC that can stall the query thread for several seconds on CI.
483+
// max-execution-time is a safety net only; the cancel should fire first.
479484
session.getBasicRemote().sendText(
480485
"{\"action\":\"eval\",\"id\":\"q-cancel\"," +
481-
"\"query\":\"let $x := for $i in 1 to 999999999 return string($i) return $x\"," +
482-
"\"max-execution-time\":5000}");
486+
"\"query\":\"for $i in 1 to 999999999 return ()\"," +
487+
"\"max-execution-time\":30000}");
483488

484489
// Wait for the server to confirm the query is executing before cancelling.
485490
// Without this, the cancel may arrive before the watchdog is registered and
@@ -490,10 +495,11 @@ public void onMessage(final String message) {
490495
session.getBasicRemote().sendText(
491496
"{\"action\":\"cancel\",\"id\":\"q-cancel\"}");
492497

493-
// With terminate declared volatile in XQueryWatchDog, cancellation is
494-
// near-instant once proceed() is next called; 5s gives ample headroom.
495-
assertTrue("Should receive cancelled/error within 5s",
496-
cancelledLatch.await(5, TimeUnit.SECONDS));
498+
// With terminate declared volatile in XQueryWatchDog and no GC pressure
499+
// from the query, cancellation is visible at the very next proceed() call
500+
// — microseconds after kill(). 10s gives ample CI headroom.
501+
assertTrue("Should receive cancelled/error within 10s",
502+
cancelledLatch.await(10, TimeUnit.SECONDS));
497503
assertEquals("q-cancel", cancelledMsg.get().get("id"));
498504
} finally {
499505
session.close();

0 commit comments

Comments
 (0)