Deadlock with worker threads and TLS createSecureContext #52698

Open
@abrenneke

Version

  • v20.11.0
  • v21.7.3

Platform

Darwin IRONCLADHQ-NMXQRHY6C 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:55:06 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6020 arm64

Subsystem

crypto?

What steps will reproduce the bug?

I believe the following script reproduces the deadlock:

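const { Worker, isMainThread } = require('worker_threads');
const { createSecureContext } = require('tls');

if (isMainThread) {
    // Spawn 1000 workers, each of which re-runs this file and takes the else branch.
    for (let i = 0; i < 1000; i++) {
        new Worker(__filename);
    }

    let i = 0;

    // Main thread: create secure contexts in a loop, logging every 1000th iteration.
    const go = () => {
        createSecureContext();
        if (i % 1000 === 0) {
            console.log(i);
        }
        i++;
        setImmediate(go);
    };

    go();
} else {
    // Worker thread: create secure contexts in a loop, with no logging.
    const go = () => {
        createSecureContext();
        setImmediate(go);
    };

    go();
}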

Here is what I get when running the script:

% node index.js
0
1000
2000
3000
4000
5000
6000
^C
% node index.js
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
^C
% node index.js
0
1000
2000
3000
4000
5000
6000
7000

Here is what the macOS process sample shows:

image

Attached here is the full sample for the above process
deadlock.txt

How often does it reproduce? Is there a required condition?

The above script seems to reproduce reliably for me. This may be specific to Apple silicon, but I'm not sure.

What is the expected behavior? Why is that the expected behavior?

I would expect the above script to run indefinitely, printing successive multiples of 1000.

What do you see instead?

The above script freezes after roughly 6,000 to 10,000 iterations on the main thread. All threads seem to end up deadlocked.
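
To tell a freeze apart from a merely slow run, one option is a shared heartbeat counter that the main loop bumps on every iteration and that a separate watchdog polls. Below is a minimal sketch of that idea. The names `createHeartbeat`, `beat`, and `makeStallCheck` are hypothetical helpers, not part of the script above or of Node.js. It also assumes the watchdog thread itself stays responsive, which I have not verified given that all threads seem to lock up.

```javascript
// Sketch only: heartbeat counter in shared memory (all helper names are hypothetical).

function createHeartbeat() {
    // One 32-bit slot backed by a SharedArrayBuffer, so it can be passed to a Worker
    // via workerData and read there without copying.
    return new Int32Array(new SharedArrayBuffer(4));
}

function beat(heartbeat) {
    // Called once per loop iteration by the thread being monitored.
    Atomics.add(heartbeat, 0, 1);
}

function makeStallCheck(heartbeat) {
    let last = Atomics.load(heartbeat, 0);
    // Each call returns true when the counter has not moved since the previous call.
    return () => {
        const now = Atomics.load(heartbeat, 0);
        const stalled = now === last;
        last = now;
        return stalled;
    };
}
```

In the repro, the main thread would call `beat(heartbeat)` inside `go`, and a dedicated worker that never calls `createSecureContext` would run the check from a `setInterval`, logging when it returns true.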

Additional information

This is happening in one of our production applications, which has only 2 worker threads, and it can take as little as 20 minutes to occur. I'm not sure how it is actually being triggered there, but the above script forces the problem. Probably nobody would create this many secure contexts in practice.
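
Since the script only hangs while secure contexts are being created in bulk, one thing that might reduce exposure is creating each context once per thread and reusing it. This is a sketch of that idea, not a fix for the underlying deadlock, and I have not confirmed that it avoids the hang. `getSecureContext` and the `name` key are hypothetical; only `tls.createSecureContext` is the real API.

```javascript
const { createSecureContext } = require('tls');

// Per-thread cache: each worker thread loads its own copy of this module,
// so contexts are never shared across threads.
const contextCache = new Map();

// Hypothetical helper: returns the cached SecureContext for `name`,
// creating it from `options` on first use only.
function getSecureContext(name, options = {}) {
    let context = contextCache.get(name);
    if (context === undefined) {
        context = createSecureContext(options);
        contextCache.set(name, context);
    }
    return context;
}
```

The returned object can be passed as the `secureContext` option to `tls.connect`, so repeated connections reuse one context instead of building a new one each time.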

Metadata

Assignees

No one assigned

    Labels

    openssl (Issues and PRs related to the OpenSSL dependency), tls (Issues and PRs related to the tls subsystem), worker (Issues and PRs related to Worker support)
