Description
- Version: 10.7.0
- Platform: OSX
Note: Sorry this got so long but I think it does help to illustrate the flow and process that brought this issue up! Please bare with me as I do feel this may be part of a larger issue at this point?
Update: Added a repro async_hooks_repro
Hey Guys!
So I am playing with async_hooks
and definitely some powerful stuff here! It's a bit hard to grasp at first and seems fairly prone to potential problems for people using it to cause memory leaks and such.
I think one source of that is that the destroy
is called when the garbage collector actually collects a given resource while the after
doesn't fire for many things. This seems like it will lead towards confusion and people likely causing larger problems in their applications that may be difficult to see (with the garbage collector which is notoriously difficult to troubleshoot - no complaints there though, understand fully).
I was wondering if it makes sense to have another hook for complete
which is similar to after
but is guaranteed to fire once the asynchronous task itself has finished its job. This may not be possible for whatever reason but it does seem to fit better with how it feels it should operate.
To illustrate, just want to show what my experience as I was trying to grasp what was going on was as it may help to hear from avg user that doesn't necessarily understand the intimate details of the gc may approach this.
The Project
So in order to learn this, I wanted to build what most others already have built for the most part and make sure I understood the overall flow of everything. Basically a replacement for the node-continuation library.
Essentially a way to create an asynchronous state that is available for the length of a scope regardless of asynchronous calls made in-between.
I have a couple general goals:
- Allow "contexts" (namespace) to be created which are used to capture the state throughout a lifecycle.
- Enable the created hooks only when we have a running asynchronous context
- Disable the created hooks as soon as we are no longer executing
Test Code
I ended up adding logs (using fs.writeSync) to each hook to see where and when things occur so I could grasp the exact flow. My test code for a simple example here looks like this:
const ctx = require('./dist').default;
const log = require('./dist/log').default;
function fn() {
// outputs using fs.writeSync(1, message)
log(`
Context State: ${JSON.stringify(ctx.get('foo').state())}
`);
}
ctx.run('foo', () => {
ctx.get('foo').set('myvar', 'value3');
setTimeout(fn);
});
In this simple example it works exactly as I'd expect, outputting the following:
[ENABLING HOOKS]
[START]: Context Base EID: 5
1533096660930 | [INIT] | Timeout | asyncId 6 | trigger: 5
1533096660930 | [INIT] | TIMERWRAP | asyncId 7 | trigger: 5
Context State: {"root":5,"myvar":"value3"}
1533096660931 | [AFTER] | asyncId 6
1533096660931 | [AFTER] | asyncId 7
1533096660931 | [DESTROY] | asyncId 6 | remaining 1 | ids left: 7
1533096660931 | [DESTROY] | asyncId 7 | remaining 0 | ids left:
[DISABLING HOOKS]
In case there is confusion, the "state" value includes whatever is set on it and a key "root" which saves the root eid that started the context.
Where it gets tricky
This may be less of an issue than I'm worried about, but where I started to run into confusion was when other things started going on within the example. For example, if I add a timeout outside of the entire context/hooks process, now we run into a larger issue:
Note: This is where I actually solved the problem but the solution is so subtle and it is very hard to understand exactly what was happening here. I think it likely still should have better handling in some way so that leaks are not so simple to occur.
So my issue came in when I then wanted to embed a call to run within another call to run. When I would do this, each time I ended up with a memory leak situation where a single value was never calling after
, promiseResolve
, or destroy
.
Script
function fn() {
// outputs using fs.writeSync(1, message)
log(`
Context State: ${JSON.stringify(ctx.get('foo').state())}
`);
}
ctx.run('foo', () => {
ctx.get('foo').set('myvar', 'value1');
setTimeout(fn);
ctx.run('foo', () => {
ctx.get('foo').set('myvar', 'value2');
setTimeout(fn);
});
});
setTimeout(() => {
console.log(`${String(Date.now())} | EVALUATION COMPLETED`);
}, 8000);
Outputs
[ENABLING HOOKS]
[START]: Context Base EID: 5
1533098775911 | [INIT] | Timeout | asyncId 6 | trigger: 5
1533098775911 | [INIT] | TIMERWRAP | asyncId 7 | trigger: 5
1533098775911 | [INIT] | PROMISE | asyncId 8 | trigger: 5
1533098775911 | [INIT] | PROMISE | asyncId 9 | trigger: 5
[START]: Context Base EID: 10
1533098775912 | [INIT] | Timeout | asyncId 11 | trigger: 10
1533098775912 | [AFTER] | asyncId 10
1533098775912 | [DESTROY] | asyncId 10 | remaining 6 | ids left: 5, 6, 7, 8, 9, 11
1533098775912 | [RESOLVE] | asyncId 9
1533098775912 | [DESTROY] | asyncId 9 | remaining 5 | ids left: 5, 6, 7, 8, 11
1533098775912 | [AFTER] | asyncId 5
1533098775912 | [DESTROY] | asyncId 5 | remaining 4 | ids left: 6, 7, 8, 11
Context State: {"root":5,"myvar":"value1"}
1533098775913 | [AFTER] | asyncId 6
1533098775913 | [DESTROY] | asyncId 6 | remaining 3 | ids left: 7, 8, 11
Context State: {"root":10,"myvar":"value2"}
1533098775913 | [AFTER] | asyncId 11
1533098775913 | [DESTROY] | asyncId 11 | remaining 2 | ids left: 7, 8
1533098775913 | [AFTER] | asyncId 7
1533098775913 | [DESTROY] | asyncId 7 | remaining 1 | ids left: 8
1533098783913 | EVALUATION COMPLETED
So here we seem to have a problem. A promise (asyncId 8) is never being handled by our hooks. In order to evaluate further, I ended up saving id 8 during init to global and later logging it so it wouldn't be caught by console.log and it ended up showing that there is a Promise that is never resolved, it remains pending forever.
What is weird and I still don't fully grasp, is that the problem was with an async function
which actually really wasn't using any async features (my original hope was that .run()
would return a promise outside of the asynchronous context - but that proved pointless.
Still, this tiny little problem would cause memory leaks that 99% of people probably would never be aware existed.
So ctx.run(...args)
looked like this:
export async function runInContext(contextID, fn, ...args) {
const context = AsyncContexts.get(contextID) || createAsyncContext(contextID);
const runner = new ContextRoot(context, fn);
runner.run(args);
}
export default Object.freeze({
create: createAsyncContext,
get: getContext,
run: runInContext,
});
This is the promise that will remain pending forever (async id 8). This is a really odd place in my mind for the problem to be happening - I would imagine that this promise would be resolved immediately after the 3 lines shown are completed. The functions that it calls do complete within a millisecond so the entire body of runInContext
absolutely is finished right away.
In Summary
Just wanted to say thanks, I know it got long but I actually feel this is either part of a larger bug or might help me understand what about the gc would cause this type of thing. We have ongoing issues with gc issues freezing up a larger application that I have been brought on to help and the problem seems largely due to Promises as well so I am wondering if there is a quirk about them and the gc that I am not fully grasping that could help in both areas here.
Anyway, by just making runInContext
a function instead of async function
it appears to work as expected for the most part:
[ENABLING HOOKS]
[START]: Context Base EID: 5
1533099546996 | [INIT] | Timeout | asyncId 6 | trigger: 5
1533099546997 | [INIT] | TIMERWRAP | asyncId 7 | trigger: 5
[START]: Context Base EID: 8
1533099546997 | [INIT] | Timeout | asyncId 9 | trigger: 8
1533099546997 | [AFTER] | asyncId 8
1533099546997 | [DESTROY] | asyncId 8 | remaining 4 | ids left: 5, 6, 7, 9
1533099546997 | [AFTER] | asyncId 5
1533099546997 | [DESTROY] | asyncId 5 | remaining 3 | ids left: 6, 7, 9
Context State: {"root":5,"myvar":"value1"}
1533099546998 | [AFTER] | asyncId 6
1533099546998 | [DESTROY] | asyncId 6 | remaining 2 | ids left: 7, 9
Context State: {"root":8,"myvar":"value2"}
1533099546998 | [AFTER] | asyncId 9
1533099546998 | [DESTROY] | asyncId 9 | remaining 1 | ids left: 7
1533099546998 | [AFTER] | asyncId 7
1533099546998 | [DESTROY] | asyncId 7 | remaining 0 | ids left:
[DISABLING HOOKS]
1533099554997 | EVALUATION COMPLETED