Description
Context
I have handler logic that looks roughly like this:
// Call the runner service
const res = await tryCatch(runner.run(jobData));
// Handle RPC-level errors (e.g., runner service unavailable)
// ...
if (res.error instanceof restate.CancelledError) {
  throw res.error;
}
// Notify the runner of the RPC error
await tryCatch(
  runner.onError({
    job: jobData,
    error: {
      name: res.error instanceof Error ? res.error.name : "RPCError",
      message: res.error instanceof Error ? res.error.message : String(res.error),
      stack: res.error instanceof Error ? res.error.stack : undefined,
    },
  }),
);

The handler calls a Restate service, and if it hits any exception, it calls another service with the details of that error (name, message and stack trace). The main assumption here is that the error details are deterministic during replay.
Problem
When the main service handler fails with a terminal error (e.g. a network error, or the service being unavailable), this seems to result in a journal mismatch, caused by a difference in the stack trace produced during replay:
[570] Found a mismatch between the code paths taken during the previous execution and the paths taken during this execution.
This typically happens when some parts of the code are non-deterministic.
- The mismatch happened while executing 'call' (index '14')
- Difference:
parameter: '{"job":{"id":"inv_1bDIFRtrYMPl0DXqSC9RamS0cM5c2nXNe1","data":{"bookmarkId":"ft760zlpdg46o1n49e2cg0vz"},"priority":50,"runNumber":0,"numRetriesLeft":5,"timeoutSecs":60},"error":{"name":"TerminalError","message":"unexpected error while reading the response body: error reading a body from connection","stack":"TerminalError: unexpected error while reading the response body: error reading a body from connection\n at Failure (/app/node_modules/@restatedev/restate-sdk/dist/context_impl.js:421:15)\n at RestateSinglePromise.completer (/app/node_modules/@restatedev/restate-sdk/dist/context_impl.js:380:50)\n at runNextTicks (node:internal/process/task_queues:65:5)\n at process.processImmediate (node:internal/timers:472:9)\n at async RestateSinglePromise.tryComplete (/app/node_modules/@restatedev/restate-sdk/dist/promises.js:73:3)\n at async PromisesExecutor.doProgressInner (/app/node_modules/@restatedev/restate-sdk/dist/promises.js:197:4)\n at async <anonymous> (/app/node_modules/@restatedev/restate-sdk/dist/promises.js:216:5)"}}' != '{"job":{"id":"inv_1bDIFRtrYMPl0DXqSC9RamS0cM5c2nXNe1","data":{"bookmarkId":"ft760zlpdg46o1n49e2cg0vz"},"priority":50,"runNumber":0,"numRetriesLeft":5,"timeoutSecs":60},"error":{"name":"TerminalError","message":"unexpected error while reading the response body: error reading a body from connection","stack":"TerminalError: unexpected error while reading the response body: error reading a body from connection\n at Failure (/app/node_modules/@restatedev/restate-sdk/dist/context_impl.js:421:15)\n at RestateSinglePromise.completer (/app/node_modules/@restatedev/restate-sdk/dist/context_impl.js:380:50)\n at async RestateSinglePromise.tryComplete (/app/node_modules/@restatedev/restate-sdk/dist/promises.js:73:3)\n at async PromisesExecutor.doProgressInner (/app/node_modules/@restatedev/restate-sdk/dist/promises.js:197:4)\n at async <anonymous> (/app/node_modules/@restatedev/restate-sdk/dist/promises.js:216:5)"}}'
After beautifying the stacks, the error stacks in the two call parameters are the following:
Original execution:
TerminalError: unexpected error while reading the response body: error reading a body from connection
at Failure (/app/node_modules/@restatedev/restate-sdk/dist/context_impl.js:421:15)
at RestateSinglePromise.completer (/app/node_modules/@restatedev/restate-sdk/dist/context_impl.js:380:50)
at runNextTicks (node:internal/process/task_queues:65:5) ← EXTRA
at process.processImmediate (node:internal/timers:472:9) ← EXTRA
at async RestateSinglePromise.tryComplete (/app/node_modules/@restatedev/restate-sdk/dist/promises.js:73:3)
at async PromisesExecutor.doProgressInner (/app/node_modules/@restatedev/restate-sdk/dist/promises.js:197:4)
at async <anonymous> (/app/node_modules/@restatedev/restate-sdk/dist/promises.js:216:5)
Replay execution:
TerminalError: unexpected error while reading the response body: error reading a body from connection
at Failure (/app/node_modules/@restatedev/restate-sdk/dist/context_impl.js:421:15)
at RestateSinglePromise.completer (/app/node_modules/@restatedev/restate-sdk/dist/context_impl.js:380:50)
at async RestateSinglePromise.tryComplete (/app/node_modules/@restatedev/restate-sdk/dist/promises.js:73:3)
at async PromisesExecutor.doProgressInner (/app/node_modules/@restatedev/restate-sdk/dist/promises.js:197:4)
at async <anonymous> (/app/node_modules/@restatedev/restate-sdk/dist/promises.js:216:5)
Notice how the original execution contained two Node-internal frames (runNextTicks and process.processImmediate), while the replay didn't.
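For reference, the comparison above can be reproduced with a small script. This is only a sketch: `extraFrames` is a hypothetical helper (not part of the Restate SDK), and the two stacks are shortened copies of the ones in the error message.

```typescript
// Hypothetical helper, not an SDK API: returns the frames that appear in
// `original` but are missing from `replay`.
function extraFrames(original: string, replay: string): string[] {
  // Keep only the "at ..." frame lines, with indentation removed.
  const frames = (stack: string): string[] =>
    stack
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.startsWith("at "));

  const replayFrames = new Set(frames(replay));
  return frames(original).filter((frame) => !replayFrames.has(frame));
}

// Shortened copies of the stacks from the mismatch message.
const originalStack = [
  "TerminalError: unexpected error while reading the response body: error reading a body from connection",
  "    at Failure (/app/node_modules/@restatedev/restate-sdk/dist/context_impl.js:421:15)",
  "    at runNextTicks (node:internal/process/task_queues:65:5)",
  "    at process.processImmediate (node:internal/timers:472:9)",
  "    at async RestateSinglePromise.tryComplete (/app/node_modules/@restatedev/restate-sdk/dist/promises.js:73:3)",
].join("\n");

const replayStack = [
  "TerminalError: unexpected error while reading the response body: error reading a body from connection",
  "    at Failure (/app/node_modules/@restatedev/restate-sdk/dist/context_impl.js:421:15)",
  "    at async RestateSinglePromise.tryComplete (/app/node_modules/@restatedev/restate-sdk/dist/promises.js:73:3)",
].join("\n");

// Lists the two node:internal frames that only the original execution had.
console.log(extraFrames(originalStack, replayStack));
```

The message line and every SDK frame are identical; only the frames contributed by the Node event loop differ, which suggests the stack depends on how the promise happened to be scheduled at the time.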
It seems that the SDK doesn't store the stack traces of terminal failures in the journal (only the code and message) and instead reconstructs them during replay. This looks like it can be a constant source of non-determinism, which makes me wonder whether the SDK could nullify the stack (or do something similar) to prevent users from shooting themselves in the foot.
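In the meantime, a possible workaround on my side is to leave the stack out of anything that is passed to another service. A minimal sketch, assuming the name and message stay stable across replays (they were identical in the mismatch above); `toReplaySafeError` and `ReplaySafeError` are hypothetical names, not SDK APIs:

```typescript
// Hypothetical payload type: only the fields that matched between the
// original execution and the replay in the mismatch above.
interface ReplaySafeError {
  name: string;
  message: string;
}

// Builds an error payload without the stack trace, so the serialized call
// parameter does not depend on how the Error object was constructed.
function toReplaySafeError(err: unknown): ReplaySafeError {
  if (err instanceof Error) {
    return { name: err.name, message: err.message };
  }
  // Same fallback as in the handler above for non-Error values.
  return { name: "RPCError", message: String(err) };
}

// Two errors with the same message but created at different call sites have
// different stacks, yet serialize to the same payload.
const first = new Error("error reading a body from connection");
const second = (() => new Error("error reading a body from connection"))();

console.log(first.stack === second.stack);
console.log(
  JSON.stringify(toReplaySafeError(first)) ===
    JSON.stringify(toReplaySafeError(second)),
);
```

In the handler above, this would replace the inline object, i.e. `runner.onError({ job: jobData, error: toReplaySafeError(res.error) })`. If the stack is still useful for debugging, it could be logged locally rather than sent as a call parameter.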
Thanks!
Env
Node: 24
SDK version: 1.10.3
Server version: 1.6.1