Skip to content

esm: syncify default path of ModuleLoader.load #57419

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

JakobJingleheimer
Copy link
Member

@JakobJingleheimer JakobJingleheimer commented Mar 12, 2025

In #57390 I somehow inadvertently created a branch on the node repo instead of my fork, and then got stuck.

Parent issue: #55782

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/loaders

@nodejs-github-bot nodejs-github-bot added esm Issues and PRs related to the ECMAScript Modules implementation. needs-ci PRs that need a full CI run. labels Mar 12, 2025
Copy link

codecov bot commented Mar 12, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.27%. Comparing base (5a2614f) to head (7c5787a).
Report is 164 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #57419      +/-   ##
==========================================
+ Coverage   90.23%   90.27%   +0.04%     
==========================================
  Files         630      630              
  Lines      185074   186139    +1065     
  Branches    36221    36467     +246     
==========================================
+ Hits       166994   168030    +1036     
+ Misses      11036    10980      -56     
- Partials     7044     7129      +85     
Files with missing lines Coverage Δ
lib/internal/modules/esm/load.js 94.31% <100.00%> (+2.36%) ⬆️
lib/internal/modules/esm/loader.js 95.04% <100.00%> (ø)
lib/internal/modules/helpers.js 99.08% <100.00%> (+0.24%) ⬆️

... and 134 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@marco-ippolito
Copy link
Member

I investigated a bit on the weird failure and apparently there is some scheduling issue.
it can be patched with marco-ippolito@e0d702b#diff-b4c24f634e3741e5ad9e8c29864a48f2bd284a9d66d8fed3d077ccee0c44087bR968 but idk why it happens

@JakobJingleheimer
Copy link
Member Author

Sweet, thanks! I was gonna take a look this evening. I'll apply it as soon as I finish work 🙂

@JakobJingleheimer JakobJingleheimer force-pushed the esm/syncify-ModuleLoader-load branch from 0bcfae3 to e7941fe Compare March 19, 2025 08:57
Copy link
Member Author

@JakobJingleheimer JakobJingleheimer Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test's setup is the actual issue (I'm not sure why it was ever working 🤔):

  • global-hooks.js contains a non-awaited dynamic import at the root: import('node:test').then((test) => …), which left a dangling promise (so node can't guarantee the global file has finished executing before starting with the entry-point). It was probably done that way because TLA is not available in CJS, and the file is shared by both test cases.
  • The global hooks setup file --import/--require'd itself is indeed run before the test files (added console.log('\n==global hooks==\n') at line 1, and it prints first).

I suspect it was set up this way in order to re-use for both cjs and esm. @avivkeller it looks like you authored this test ~6 months ago. Would you perchance remember whether there was another reason? I can't think of one, but if there is, my change could inadvertently cause something to no-longer be covered.

Splitting global-hooks.js into global-hooks.cjs + require & global-hooks.mjs + static import, the sequence issue appears to be fixed.

However, a few of the global test hooks are now just not printing at all for the --require case (red lines are missing in actual vs expected):

before(): global
before one: <root>
suite one
before two: <root>
suite two
- beforeEach(): global
beforeEach one: suite one - test
beforeEach two: suite one - test
suite one - test
- afterEach(): global
afterEach one: suite one - test
afterEach two: suite one - test
before suite two: suite two
- beforeEach(): global
beforeEach one: suite two - test
beforeEach two: suite two - test
suite two - test
- afterEach(): global
afterEach one: suite two - test
afterEach two: suite two - test
- after(): global
after one: <root>
after two: <root>

Copy link
Member

@avivkeller avivkeller Mar 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I didn't see this earlier!

I suspect it was set up this way in order to re-use for both cjs and esm.

Exactly—using a single test file for both ESM and CJS helped keep things simpler, as I could simply flip-flop the --import vs --require call.


That’s definitely an odd issue you’re encountering. I have a few theories (though no concrete evidence to back them up):

  • It’s possible that an error occurs in the chain for all but before, though that seems unlikely.
  • Another possibility is a race condition. Could it be that before runs successfully simply because it’s the first one executed in the file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Race condition seems the most likely candidate since this PR is affecting a/sync behaviour. Strange that only this test is failing though—I would have expected a bunch of failures.

@avivkeller
Copy link
Member

avivkeller commented Mar 22, 2025

PTAL @nodejs/test_runner

We're encountering an issue where a global hook script loaded via --require does not execute all hooks—only before runs—whereas loading it via --import works as expected.

@JakobJingleheimer and I investigated the root cause, but since I haven’t worked on the test runner in a while, we could use some help.

We've determined that the issue stems from the test runner's loading of CJS files. When modifying the following code in runner.js (used to load ESM files):

const userImports = getOptionValue('--import').concat(getOptionValue('--require'));
for (let i = 0; i < userImports.length; i++) {
  await cascadedLoader.import(userImports[i], parentURL, kEmptyObject);
}

(This would need primordials, but we are only testing right now)

If I run a test with a --require-d before and after hook, we observe:

  • The before hook executes twice
  • The after hook executes once

This suggests that the initial CJS load (which we haven’t yet pinpointed) runs and executes only the first before hook, while the cascadedLoader.import call correctly executes all hooks.

We need help identifying where the CJS load originally happens and whether we can remove it in favor of the snippet above or modify it to use the ESM loader.

@JakobJingleheimer
Copy link
Member Author

thanks @avivkeller! The issue actually is actually present is a simpler change (merely the dangling promise fix) #57595 So that means it's unrelated to this change 😁 Let's move the discussion over to that PR.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@mcollina mcollina added the request-ci Add this label to start a Jenkins CI on a PR. label Apr 1, 2025
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Apr 1, 2025
@nodejs-github-bot
Copy link
Collaborator

@JakobJingleheimer JakobJingleheimer force-pushed the esm/syncify-ModuleLoader-load branch from a2d0fcc to f60650f Compare April 2, 2025 21:14
@JakobJingleheimer JakobJingleheimer added the request-ci Add this label to start a Jenkins CI on a PR. label Apr 3, 2025
@github-actions github-actions bot added request-ci-failed An error occurred while starting CI via request-ci label, and manual interventon is needed. and removed request-ci Add this label to start a Jenkins CI on a PR. labels Apr 3, 2025
Copy link
Contributor

github-actions bot commented Apr 3, 2025

Failed to start CI
   ⚠  Commits were pushed since the last approving review:
   ⚠  - esm: syncify default path of `ModuleLoader.load`
   ⚠  - !fixup: don't force `ModuleLoader.load` to be async (can via hooks)
   ⚠  - !fixup: remove obsolete `getSource` in favour of `getSourceSync`
   ⚠  - !fixup: account for `node:internal` in `isMain` check
   ✘  Refusing to run CI on potentially unsafe PR
https://github.com/nodejs/node/actions/runs/14236594120

@@ -965,7 +965,7 @@ function defaultResolve(specifier, context = {}) {
if (protocol === 'node:') { return { __proto__: null, url: specifier }; }


const isMain = parentURL === undefined;
const isMain = parentURL === undefined || StringPrototypeStartsWith(parentURL, 'node:internal/');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment explaining why we want isMain to be true for internal modules?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It came from here: #57419 (comment)

TLDR: we don't know.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the weird failure exactly? This addition does look strange, or could be painting over a bug...

This comment was marked as outdated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, this looks familiar.

https://github.com/nodejs/node/actions/runs/14280143011/job/40028680516?pr=57419#step:11:5939

Path: es-module/test-vm-main-context-default-loader
Error: --- stderr ---
(node:66638) ExperimentalWarning: vm.USE_MAIN_CONTEXT_DEFAULT_LOADER is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
node:internal/process/promises:394
    triggerUncaughtException(err, true /* fromPromise */);
    ^

AssertionError [ERR_ASSERTION]: function should not have been called at /Users/runner/work/node/node/node/test/es-module/test-vm-main-context-default-loader.js:146
called with arguments: AssertionError [ERR_ASSERTION]: Expected values to be strictly deep-equal:
+ actual - expected

  Comparison {
+   code: 'ERR_UNSUPPORTED_RESOLVE_REQUEST'
-   code: 'ERR_MODULE_NOT_FOUND'
  }

    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async main (/Users/runner/work/node/node/node/test/es-module/test-vm-main-context-default-loader.js:124:7) {
  generatedMessage: true,
  code: 'ERR_ASSERTION',
  actual: [TypeError],
  expected: [Object],
  operator: 'rejects'
}
    at mustNotCall (/Users/runner/work/node/node/node/test/common/index.js:493:12)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5) {
  generatedMessage: false,
  code: 'ERR_ASSERTION',
  actual: undefined,
  expected: undefined,
  operator: 'fail'
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joyeecheung does this look like anything to you? I'm not familiar with the VM module.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to look into this a bit, but the test is written for someone with deep understanding of what is happening, and I unfortunately don't. And there is no stack-trace to follow.

// createContext() with null referrer also resolves to cwd.
{
const options = {
importModuleDynamically: USE_MAIN_CONTEXT_DEFAULT_LOADER,
};
const ctx = createContext({}, options);
const s = new Script('Promise.resolve("import(\'./message.mjs\')").then(eval)', {
importModuleDynamically: common.mustNotCall(),
});
await assert.rejects(s.runInContext(ctx), { code: 'ERR_MODULE_NOT_FOUND' });
}

If there is a bug being papered over, I'm pretty sure it is not caused by this change (but is perhaps uncovered by it). In which case, it should be fixed in a different PR; and since it already exists, it shouldn't be a blocker here.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

context = { __proto__: context, source };
}

if (format == null) {
// Now that we have the source for the module, run `defaultGetFormat` to detect its format.
format = await defaultGetFormat(urlInstance, context);
format = defaultGetFormat(urlInstance, context);

if (format === 'commonjs') {
Copy link
Member

@joyeecheung joyeecheung Apr 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it might happen from the branch here and the "reset source to null and then resolve twice - the second time resolving a full path - if it's commonjs" quirk of the sync loading path. Which might mean that this sync version may not be used as-is without getting rid of that quirk or patched here somehow to not trigger that quirk.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 what makes you think it's that quirk? and why is it only exposed when going through that specific VM scenario?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because when this surfaces through the hooks it's documented as a caveat https://nodejs.org/api/module.html#caveat-in-the-asynchronous-load-hook. Internally this happens too, just less observable, I think this only happens in the sync loading, and the vm test isn't expecting this quirk to surface (now that import() switches to use sync loading that hits this quirk). Have you tried actually debugging the module loading to see what happens when the error gets thrown?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried actually debugging the module loading to see what happens when the error gets thrown?

Not yet because I didn't know where to look (there was no stack-trace and I've never touched VM before). But now that I know where it happens, I have a place to start.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably less related to vm because USE_MAIN_CONTEXT_DEFAULT_LOADER = use the default loader as if it's being imported from the main context. You could debug the loader paths where ERR_UNSUPPORTED_RESOLVE_REQUEST gets thrown. It doesn't have a stack trace because assert.rejects is unable to capture the async stack trace (I wanted to use normal assert.strictEqual but @aduh95 suggested to use assert.rejects and I accepted his suggestions, maybe it's worth reverting back #51244 (comment) )

Copy link
Member Author

@JakobJingleheimer JakobJingleheimer Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It comes from the shouldBeTreatedAsRelativeOrAbsolutePath branch within moduleResolve:

TypeError [ERR_UNSUPPORTED_RESOLVE_REQUEST]: Failed to resolve module specifier "./message.mjs" from "node:internal/process/task_queues": Invalid relative URL or base scheme is not hierarchical

It's trying to instantiate a URL with 'node:internal/process/task_queues' as the "base".

I think perhaps normalizeReferrerURL is not supposed to recognise 'node:internal/process/task_queues' as a valid URL, in which case referrerName would be voided to undefined. Does that sound right to you?

I added an extra check ( 7c5787a ) within normalizeReferrerURL to exclude node: prefixed referrers, and this test and all the es-module tests pass. If all the tests pass, and we find it acceptable, I think that commit should be separated into its own PR that lands before this one.

Copy link
Member

@joyeecheung joyeecheung Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like a bug, because internal/process/task_queues doesn't have import() inside, and the referrer should never be a node:-prefixed URL since we don't import() in any of our builtins (that would not work either because we don't implement the host dynamic import callback for builtins). It sounds like whatever that differs in the sync loading from the async one (maybe this "reset to null and then re-resolve + reload" behavior) might alter the request that gets send to resolve that ends up creating an invalid request. You might want to see what is the URL being resolved with that invalid base and debug to find where that invalid request is created under what circumstances and stop it from doing that.

I think if you have to land it with the bug it's better to make the special case completely about this specific URL and add a comment mentioning that the bug exists. Excluding node: completely for this merely enlarges the scope of the bug, because that's supposed to be unreachable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joyeecheung so for the time-being, are you saying I should switch back to d765ed3? (and update the comment since it's not a mystery anymore)

I think fixing that bug is out of scope of this PR (ex if this got reverted, that would also revert the bugfix)

Copy link
Member

@joyeecheung joyeecheung Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still a mystery why an unreachable condition is somehow reachable, and is likely a bug caused by reusing a path that's not yet ready/meant to be reused this way without fixups. Another route is to not reuse getSourceSync() and write a more comparable version of synchronous getSource without that differing branch.

@JakobJingleheimer JakobJingleheimer removed the request-ci-failed An error occurred while starting CI via request-ci label, and manual interventon is needed. label Apr 10, 2025
@JakobJingleheimer JakobJingleheimer added the request-ci Add this label to start a Jenkins CI on a PR. label Apr 11, 2025
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Apr 11, 2025
@nodejs-github-bot
Copy link
Collaborator

@aduh95 aduh95 added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. labels Apr 11, 2025
@aduh95 aduh95 removed the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Apr 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. esm Issues and PRs related to the ECMAScript Modules implementation. needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants