Some basic context
Hi! I'm working on a tool that builds untrusted user-submitted projects. Third-party developers submit pull requests to a manifest repository, and a custom tool clones their repository and builds their project on GitHub Actions. (The build output of the user-submitted project is distributed to end users, so this lets us review projects for potentially malicious code and establish a chain of trust for the built artifacts.)
Right now, this system works by building the user-submitted project in a Docker container with full network access. To minimize potential abuse, I ideally want the user-submitted project to be built without network access, and I am currently rewriting my CI to do that. To accomplish this, I'm creating a custom store and using pnpm fetch to fetch the project's dependencies. That store is then mounted into a container without network access, where the project is built in a sandbox, and the output files are copied out of the container.
I'll use the terminology "host" and "sandbox" to refer to the portions with network access and without network access respectively. Let's assume Node.js 22 and pnpm 10 are both present in the host and sandbox already, and the user-submitted repository was already downloaded on the host. The process looks something like:
1. On the host, create a custom store by making an empty directory, and set the NPM_CONFIG_STORE_DIR environment variable to point to the custom store.
   - This can also be configured with .npmrc, but I prefer the environment variable for more short-term usage.
2. On the host, run pnpm fetch in the user-submitted repository to fetch packages into the custom store.
3. Create the sandbox, mounting the custom store and user-submitted repository into the container.
   - In my project, this starts a new Docker container. For testing, you can pretend that there's no sandbox at all (just run the commands like normal), but keep in mind the container doesn't have any network access.
   - The store itself does not contain symlinks; it's just a raw index of files, so mounting it into the Docker container should be safe (assuming both sides use the same Node/pnpm version, which is always true for me).
4. In the sandbox, set the NPM_CONFIG_STORE_DIR environment variable to point to the mounted store.
5. In the sandbox, run pnpm install --frozen-lockfile --offline in the mounted user-submitted repository to install the dependencies from the mounted store.
After this, I do my own project-specific build steps, and copy the build output out of the Docker container. The details of these aren't relevant to this discussion, though.
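The steps above could be sketched roughly as follows. This is only an illustration in Python, not the actual tool: the function names, the /store and /repo container paths, and the image name are hypothetical. The pnpm flags, the NPM_CONFIG_STORE_DIR variable, and --net=none are the ones quoted in this post.

```python
from __future__ import annotations

import os
from pathlib import Path


def host_fetch_command(store_dir: Path) -> tuple[list[str], dict[str, str]]:
    """Return (argv, env) for the host-side fetch into the custom store."""
    # Copy the current environment and point pnpm at the custom store.
    env = dict(os.environ, NPM_CONFIG_STORE_DIR=str(store_dir))
    return ["pnpm", "fetch"], env


def sandbox_install_command(repo_dir: Path, store_dir: Path, image: str) -> list[str]:
    """Return argv for the offline install in a container without network access.

    Both host paths should be absolute, since they are used as bind mounts.
    /store and /repo are hypothetical mount points inside the container.
    """
    return [
        "docker", "run", "--rm",
        "--net=none",                         # no network in the sandbox
        "-v", f"{store_dir}:/store",          # mount the pre-fetched store
        "-v", f"{repo_dir}:/repo",            # mount the user-submitted repository
        "-w", "/repo",
        "-e", "NPM_CONFIG_STORE_DIR=/store",  # same variable, sandbox-side path
        image,
        "pnpm", "install", "--frozen-lockfile", "--offline",
    ]
```

Each argv can be passed to subprocess.run, for example subprocess.run(argv, cwd=repo_dir, env=env, check=True) for the host-side fetch. Returning the command instead of running it keeps the host and sandbox halves easy to log and test separately.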
I have a very basic implementation of this idea already working, and it seems to work well for simple projects. However, there are a lot of annoying bugs I'm working around, and some projects don't build at all. Below is a list of issues I've run into while working on this. I'm not sure how many of these are my mistake and how many are actual pnpm bugs, but I'm happy to open issue(s) if needed. Thanks!
pnpm fetch always creates node_modules
pnpm fetch creates the node_modules directory when run from the host. This causes issues when mounting the user-submitted repository into the sandbox, as the host's node_modules will be present, and it will contain broken symlinks. I've tried passing --lockfile-only and --resolution-only, but those don't seem to stop node_modules from being created. Deleting the host-created node_modules before running the sandbox seems to work, but I would rather tell pnpm fetch to not create a node_modules at all. It's a lot of wasted disk IO to make a node_modules folder just to instantly delete it.
If the host-created node_modules is not deleted before the sandbox starts, pnpm install in the sandbox will detect the broken node_modules and prompt to recreate it. I pass --config.confirmModulesPurge=false to automatically confirm this, but I think --force would also work.
TL;DR: Is there any way to tell pnpm fetch to not create a node_modules directory? Is this behavior intended or a pnpm bug?
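For reference, the deletion workaround described above might look something like this. It is a sketch of the workaround only, not an answer to the question; the function name is hypothetical, and it also removes nested node_modules directories on the assumption that workspace packages may have their own.

```python
from __future__ import annotations

import os
import shutil
from pathlib import Path


def remove_node_modules(repo_dir: Path) -> list[Path]:
    """Delete every node_modules directory under repo_dir and return the removed paths."""
    removed: list[Path] = []
    for dirpath, dirnames, _filenames in os.walk(repo_dir):
        if "node_modules" in dirnames:
            target = Path(dirpath) / "node_modules"
            if target.is_symlink():
                # A symlinked node_modules is unlinked, never followed.
                target.unlink()
            else:
                shutil.rmtree(target)
            removed.append(target)
            # Prune the walk so os.walk does not try to descend into it.
            dirnames.remove("node_modules")
    return removed
```

Calling remove_node_modules(repo_dir) on the host after pnpm fetch and before starting the container leaves the repository without any host-created node_modules.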
Running build scripts after pnpm fetch
pnpm fetch, by default, runs the build scripts of dependencies. This is a huge issue for me, since pnpm fetch is being run on the host, and a malicious actor could submit a hand-crafted repository to run arbitrary code on the host.
I can pass --ignore-scripts when running pnpm fetch, but pnpm outputs the warning "The git-hosted package fetched from (url) has to be built but the build scripts were ignored". When running pnpm install in the sandbox, the build scripts aren't rerun (as it's assumed they were already run when the package was placed into the store, I guess?), so any dependency that requires its build scripts to run will fail.
This is really important for me to fix, since most user-submitted projects are using a package of ours that's written in TypeScript. Referencing this package from npm works fine (as the uploaded artifact already contains the build output), but if a user references that package from a github: URI (e.g. using a prerelease version), the build will fail because that package wasn't built.
TL;DR: Is there any way to tell pnpm install to run the build scripts of dependencies after using pnpm fetch --ignore-scripts? Is this behavior intended or a pnpm bug?
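Until there is an answer, a CI step could at least flag the affected projects up front. The sketch below is an assumption-heavy illustration: the function name is hypothetical, the prefix list is a guess and not exhaustive, and it only inspects direct dependencies in package.json, so git-hosted packages pulled in transitively would be missed.

```python
from __future__ import annotations

# Assumed, non-exhaustive list of specifier prefixes that point at a git host.
GIT_SPEC_PREFIXES = ("github:", "git+", "git://")
DEPENDENCY_FIELDS = ("dependencies", "devDependencies", "optionalDependencies")


def git_hosted_dependencies(manifest: dict) -> dict[str, str]:
    """Return {name: specifier} for direct dependencies that look git-hosted.

    `manifest` is the parsed content of a package.json file.
    """
    found: dict[str, str] = {}
    for field in DEPENDENCY_FIELDS:
        for name, spec in (manifest.get(field) or {}).items():
            if isinstance(spec, str) and spec.startswith(GIT_SPEC_PREFIXES):
                found[name] = spec
    return found
```

If the returned mapping is non-empty, the build log can warn that those packages will be fetched with --ignore-scripts and may therefore arrive in the sandbox unbuilt.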
pnpm fetches packageManager even when offline
When managePackageManagerVersions is true (which is the default), and a project specifies the packageManager field in its package.json, pnpm will fetch and pin that exact version.
This still happens even when running pnpm install --offline, which seems like a bug. For my setup, this results in pnpm locking up for up to a minute with no output (until it times out fetching from npm and throws an error). I originally thought my build logging was broken because of this, which made me very confused. I worked around this by just passing --config.managePackageManagerVersions=false to pnpm install.
It also now occurs to me that I ran into this issue while developing this tool myself! The tool I'm developing is built using pnpm, and it also specifies packageManager in the package.json. The example Docker images that I used as a base have the entrypoint pnpm start, which will cause pnpm to fetch that specific package manager version. This is usually fine since Corepack already pinned that exact version, but Corepack is being removed from Node.js soon, so I replaced it with just npm i -g pnpm@10.
When I ran my container with --net=none, it would execute pnpm start and try (and fail) to download that exact pnpm version, because I wasn't pinning the exact version I was using in the Dockerfile. I solved this by switching to node dist as my container's entrypoint, but I should go pin that version properly, so it's not fetching pnpm on every startup. Whoops!
TL;DR: This seems like a pnpm bug that should be fixed.
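The workaround from this section could be sketched like so. The function names are hypothetical; the flags are the ones mentioned above. Reading the packageManager field first is optional, but it lets the build log say why a project might otherwise hang.

```python
from __future__ import annotations

import json
from pathlib import Path


def pinned_package_manager(repo_dir: Path) -> str | None:
    """Return the packageManager field of repo_dir/package.json, or None if absent."""
    manifest = json.loads((repo_dir / "package.json").read_text(encoding="utf-8"))
    value = manifest.get("packageManager")
    return value if isinstance(value, str) else None


def offline_install_argv() -> list[str]:
    """argv for the sandboxed install with package manager version management disabled."""
    return [
        "pnpm", "install",
        "--frozen-lockfile",
        "--offline",
        # Workaround from this post: stop pnpm from fetching the pinned version.
        "--config.managePackageManagerVersions=false",
    ]
```

Logging the value of pinned_package_manager(repo_dir) before the install makes a mismatch with the sandbox's own pnpm version visible in the build output.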
pnpm fetch and pnpm install using mismatched store versions
I had this happen when building a project that pinned pnpm 9 in the packageManager field. Since I skipped pinning the exact version (see the previous section), I was still using pnpm 10, and I presume pnpm chose to use store version v3 for compatibility. (Yes, I know mismatching major pnpm versions is a bad idea, and I will go improve this later, but I'm focused on getting my code working first.) You can spot this by reading Content-addressable store is at: (store path)/v3 in the pnpm fetch logs.
When running pnpm install --offline, pnpm will not detect the existing packages, and it will throw ERR_PNPM_NO_OFFLINE_TARBALL ("A package is missing from the store but cannot download it in offline mode. The missing package may be downloaded from (url)"). My assumption is that pnpm install is looking under store version v10, which doesn't have any dependencies in it, so pnpm thinks that the dependencies are completely missing from the store. (I didn't confirm this, and I don't know much about the pnpm store, so this is just an educated guess.)
Removing the packageManager field from the package.json makes pnpm fetch use store version v10, but I don't want to edit the user's repository, so this isn't a viable solution for me. I tried looking for a flag to ignore the packageManager field, or to force a specific store version, but I couldn't find anything.
TL;DR: Is there any way to tell pnpm fetch to use a specific store version? Or, is there any way to tell pnpm to completely ignore the packageManager field? Is this behavior intended or a pnpm bug?
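One way to catch this mismatch early is to look at which version directories exist in the custom store after pnpm fetch. The sketch below assumes the layout is (store path)/v3, as in the log line quoted above; the v10 name is the unconfirmed guess from this post, and the general v-plus-number pattern and the function name are assumptions as well.

```python
from __future__ import annotations

import re
from pathlib import Path


def store_versions(store_dir: Path) -> list[str]:
    """Return the version subdirectories (e.g. "v3", "v10") present in the store.

    Results are sorted numerically; an empty list means nothing was found.
    """
    if not store_dir.is_dir():
        return []
    names = [
        entry.name
        for entry in store_dir.iterdir()
        if entry.is_dir() and re.fullmatch(r"v\d+", entry.name)
    ]
    return sorted(names, key=lambda name: int(name[1:]))
```

If the host-side result is not the version the sandbox's pnpm is expected to read, the build can fail right there with a clear message instead of surfacing later as ERR_PNPM_NO_OFFLINE_TARBALL.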
pnpm fetch doesn't handle patchedDependencies
If a project uses patchedDependencies, pnpm install will throw ERR_PNPM_LOCKFILE_CONFIG_MISMATCH ("Cannot proceed with the frozen installation. The current 'patchedDependencies' configuration doesn't match the value found in the lockfile").
In my specific example, patchedDependencies is synchronized with the lockfile, and the lockfile seems up to date (pnpm install --lockfile-only doesn't change anything). The error tells me to pass --no-frozen-lockfile, but this doesn't work for my use case, since I'm relying on the lockfile to fetch specific packages.
Removing patchedDependencies from the project's package.json makes the error go away, but obviously the dependencies won't be patched, and as I said earlier I want to avoid modifying the provided package.json.
Very basic reproduction steps (using a real repository, though) are available here. (Keep in mind this project also sets packageManager, so you have to work around that too.)
TL;DR: This seems like a pnpm bug that should be fixed.