Skip to content

Commit be531d9

Browse files
fix: move safety timeout to connect event (#111)
The ndt7-js client sets a 12s safety timeout when the worker is created (`runNDT7Worker`, line 206). This timeout covers the entire worker lifecycle including WebSocket connection setup (TCP + TLS + WS upgrade ≈ 3 RTTs). For a 10s test with a 2s margin, clients with RTT > ~667ms have their tests terminated early by the client before the server's 10s duration expires. This is the same bug fixed in msak-js (m-lab/msak-js#23). BQ data confirms the effect (ndt7 downloads, one week sample Feb 2025): | RTT bucket | Median elapsed (of 10s) | Median shortfall | % tests < 9s | |---|---|---|---| | <20ms | 9.79s | 0.21s | 38% | | 200-500ms | 9.20s | 0.80s | 47% | | 500ms+ | **5.84s** | **4.17s** | **78%** | The fix splits the single 12s timeout into two: - **Connection timeout (10s)**: set at worker creation, protects against WebSocket connections that never open. Cleared when the worker reports `start` (i.e. `sock.onopen`). - **Test duration timeout (12s)**: set when the `start` message arrives, so handshake time does not eat into the test duration. <!-- Reviewable:start --> - - - This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/m-lab/ndt7-js/111) <!-- Reviewable:end -->
1 parent 6c98f16 commit be531d9

File tree

1 file changed

+17
-7
lines changed

1 file changed

+17
-7
lines changed

src/ndt7.js

Lines changed: 17 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -198,23 +198,32 @@
198198
};
199199
});
200200

201-
// If the worker takes 12 seconds, kill it and return an error code.
202-
// Most clients take longer than 10 seconds to complete the upload and
203-
// finish sending the buffer's content, sometimes hitting the socket's
204-
// timeout of 15 seconds. This makes sure uploads terminate on time and
205-
// get a chance to send one last measurement after 10s.
206-
const workerTimeout = setTimeout(() => worker.resolve(0), 12000);
201+
// Connection timeout: if the WebSocket never opens (and never
202+
// fires onerror/onclose), resolve after 10s to prevent the
203+
// promise from hanging indefinitely.
204+
const connTimeout = setTimeout(() => worker.resolve(0), 10000);
205+
206+
// The test duration timeout is set when the worker reports
207+
// 'start' (i.e. sock.onopen), so that WS+TLS handshake time
208+
// does not eat into the test duration.
209+
let workerTimeout;
207210

208211
// This is how the worker communicates back to the main thread of
209212
// execution. The MsgTpe of `ev` determines which callback the message
210213
// gets forwarded to.
211214
worker.onmessage = function(ev) {
212215
if (!ev.data || !ev.data.MsgType || ev.data.MsgType === 'error') {
216+
clearTimeout(connTimeout);
213217
clearTimeout(workerTimeout);
214218
worker.resolve(1);
215219
const msg = (!ev.data) ? `${testType} error` : ev.data.Error;
216220
callbacks.error(msg);
217221
} else if (ev.data.MsgType === 'start') {
222+
clearTimeout(connTimeout);
223+
// Safety timeout anchored to connect, not worker creation.
224+
// 12s gives the server's 10s duration time to finish plus
225+
// margin for the upload buffer to drain.
226+
workerTimeout = setTimeout(() => worker.resolve(0), 12000);
218227
callbacks.start(ev.data.Data);
219228
} else if (ev.data.MsgType == 'measurement') {
220229
// For performance reasons, we parse the JSON outside of the thread
@@ -241,7 +250,8 @@
241250
// We can't start the worker until we know the right server, so we wait
242251
// here to find that out.
243252
const urls = await urlPromise.catch((err) => {
244-
// Clear timer, terminate the worker and rethrow the error.
253+
// Clear timers, terminate the worker and rethrow the error.
254+
clearTimeout(connTimeout);
245255
clearTimeout(workerTimeout);
246256
worker.resolve(2);
247257
throw err;

0 commit comments

Comments
 (0)