Skip to content

Commit ea8e2a6

Browse files
committed
fix: address review findings from PR #1719
- Remove issue number refs (#1717, #1651) from comments per CLAUDE.md - Trim multi-line JSDoc on extractNodeOutputEnvVars to 2-line comment - Remove regex comment that explained WHAT not WHY - Trim escapedForBash trailing comment - Add collision guard warn log when two node IDs map to the same env var - Spread config.envVars into until_bash subprocess for parity with executeBashNode - Update variables.md: replace stale "auto shell-quoted" description with env-var mechanism - Update script-nodes.md: fix stale contrast with bash node auto-quoting - Add CHANGELOG entry for #1717 fix - Add integration tests: executeBashNode and until_bash env var wiring via spy - Add empty-string output unit test for extractNodeOutputEnvVars
1 parent 3301463 commit ea8e2a6

5 files changed

Lines changed: 129 additions & 17 deletions

File tree

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
1111

1212
- MCP server support for Codex workflow nodes via the shared `loadMcpConfig` module — pass `mcp: <path>` on a Codex node and the config is translated to Codex's `mcp_servers` overrides at runtime. MCP client errors are surfaced to the workflow author as `system` chunks when MCP is explicitly configured for the node (#1459).
1313

14+
### Fixed
15+
16+
- **Bash node large-output corruption**: whole-output `$nodeId.output` refs in `bash:` and `until_bash` loop conditions now travel via environment variables instead of inline shell-quoting. Inputs of any size (including 40 KB+ LLM outputs) are handled correctly. Field-access refs (`$nodeId.output.field`) are unaffected. Closes #1717.
17+
1418
## [0.3.12] - 2026-05-14
1519

1620
Orchestrator prompt-cache fix, SDK termination edge cases, marketplace expansion, and broad workflow fixes.

packages/docs-web/src/content/docs/guides/script-nodes.md

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -199,9 +199,10 @@ for field access in `when:` conditions and prompt substitution.
199199
### Variable Substitution in Scripts
200200

201201
Variables are substituted into the `script` text **as raw strings, without
202-
shell quoting** — unlike `bash:` nodes, where `$nodeId.output` values are
203-
auto-quoted. Treat substituted values as untrusted input and parse them with
204-
language features, not by interpolating into shell syntax.
202+
shell quoting** — unlike `bash:` nodes, where `$nodeId.output` whole-output
203+
values are passed safely via environment variables. Treat substituted values
204+
as untrusted input and parse them with language features, not by interpolating
205+
into shell syntax.
205206

206207
:::caution[Avoid String.raw with `$nodeId.output`]
207208
The pattern `` String.raw`$nodeId.output` `` looks safe but fails silently when

packages/docs-web/src/content/docs/reference/variables.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -67,7 +67,7 @@ In DAG workflows, nodes can reference the output of any completed upstream node.
6767

6868
### Shell Quoting in `bash:` vs `script:`
6969

70-
`$nodeId.output` values are **auto shell-quoted** (single-quoted, with embedded `'` escaped) when substituted into `bash:` scripts, so the value is always safe to embed in a shell command. They are **not** shell-quoted when substituted into `script:` bodies — the raw value is embedded as-is. For script nodes, treat substituted values as untrusted input and parse them with language features (e.g. `JSON.parse`), not by interpolating into shell syntax.
70+
In `bash:` nodes, `$nodeId.output` (whole-output) is passed into the script via a dedicated environment variable (`_ARCHON_NODE_<ID>_OUTPUT`) and referenced as `"${_ARCHON_NODE_<ID>_OUTPUT}"`. This avoids size limits that inline argument quoting imposes on large LLM outputs. Field-access refs (`$nodeId.output.field`) are still substituted inline as shell-quoted strings. Values are **not** shell-quoted when substituted into `script:` bodies — the raw value is embedded as-is. For script nodes, treat substituted values as untrusted input and parse them with language features (e.g. `JSON.parse`), not by interpolating into shell syntax.
7171

7272
### Example
7373

packages/workflows/src/dag-executor.test.ts

Lines changed: 111 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -891,6 +891,13 @@ describe('extractNodeOutputEnvVars', () => {
891891
expect(script).toBe('echo $a.output.field_name');
892892
expect(envVars).toEqual({});
893893
});
894+
895+
it('handles empty string node output', () => {
896+
const outputs = new Map([['a', makeOutput('completed', '')]]);
897+
const { script, envVars } = extractNodeOutputEnvVars('echo $a.output', outputs);
898+
expect(script).toBe('echo "${_ARCHON_NODE_A_OUTPUT}"');
899+
expect(envVars).toEqual({ _ARCHON_NODE_A_OUTPUT: '' });
900+
});
894901
});
895902

896903
describe('substituteNodeOutputRefs -- structuredOutput preference', () => {
@@ -1612,6 +1619,110 @@ describe('executeDagWorkflow -- bash nodes', () => {
16121619
execSpy.mockRestore();
16131620
}
16141621
});
1622+
1623+
it('passes $nodeId.output value via _ARCHON_NODE_* env var in executeBashNode', async () => {
1624+
const execSpy = spyOn(git, 'execFileAsync').mockResolvedValue({ stdout: 'ok\n', stderr: '' });
1625+
try {
1626+
const mockDeps = createMockDeps();
1627+
const platform = createMockPlatform();
1628+
const workflowRun = makeWorkflowRun('bash-env-var-run-id');
1629+
const upstreamOutput = 'upstream value with $special "chars"';
1630+
1631+
const nodes: DagNode[] = [
1632+
{ id: 'upstream', bash: 'echo upstream' },
1633+
{ id: 'consumer', bash: 'echo $upstream.output', depends_on: ['upstream'] },
1634+
];
1635+
1636+
await executeDagWorkflow(
1637+
mockDeps,
1638+
platform,
1639+
'conv-env-var',
1640+
testDir,
1641+
{ name: 'env-var-test', nodes },
1642+
workflowRun,
1643+
'claude',
1644+
undefined,
1645+
join(testDir, 'artifacts'),
1646+
join(testDir, 'logs'),
1647+
'main',
1648+
'docs/',
1649+
minimalConfig,
1650+
undefined,
1651+
undefined,
1652+
new Map([['upstream', upstreamOutput]])
1653+
);
1654+
1655+
const consumerCall = execSpy.mock.calls.find(call => {
1656+
const script = (call[1] as string[])[1] ?? '';
1657+
return script.includes('_ARCHON_NODE_UPSTREAM_OUTPUT');
1658+
});
1659+
expect(consumerCall).toBeDefined();
1660+
const envArg = (consumerCall![2] as { env: NodeJS.ProcessEnv }).env;
1661+
expect(envArg?._ARCHON_NODE_UPSTREAM_OUTPUT).toBe(upstreamOutput);
1662+
} finally {
1663+
execSpy.mockRestore();
1664+
}
1665+
});
1666+
1667+
it('passes $nodeId.output via _ARCHON_NODE_* env var in until_bash', async () => {
1668+
// until_bash returning exit 0 signals loop completion; mock it to exit 0
1669+
const execSpy = spyOn(git, 'execFileAsync').mockResolvedValue({ stdout: '', stderr: '' });
1670+
try {
1671+
mockSendQueryDag.mockImplementation(function* () {
1672+
yield { type: 'assistant', content: 'Done. <promise>COMPLETE</promise>' };
1673+
yield { type: 'result', sessionId: 'loop-until-bash-session' };
1674+
});
1675+
1676+
const mockDeps = createMockDeps();
1677+
const platform = createMockPlatform();
1678+
const workflowRun = makeWorkflowRun('until-bash-env-run-id');
1679+
const upstreamOutput = 'done';
1680+
1681+
// upstream is a bash node whose output is pre-seeded via priorCompletedNodes
1682+
const nodes: DagNode[] = [
1683+
{ id: 'upstream', bash: 'echo upstream' },
1684+
{
1685+
id: 'check',
1686+
loop: {
1687+
prompt: 'Do work.',
1688+
until_bash: 'test $upstream.output = "done"',
1689+
until: 'COMPLETE',
1690+
max_iterations: 1,
1691+
},
1692+
depends_on: ['upstream'],
1693+
},
1694+
];
1695+
1696+
await executeDagWorkflow(
1697+
mockDeps,
1698+
platform,
1699+
'conv-until-bash-env',
1700+
testDir,
1701+
{ name: 'until-bash-env-test', nodes },
1702+
workflowRun,
1703+
'claude',
1704+
undefined,
1705+
join(testDir, 'artifacts'),
1706+
join(testDir, 'logs'),
1707+
'main',
1708+
'docs/',
1709+
minimalConfig,
1710+
undefined,
1711+
undefined,
1712+
new Map([['upstream', upstreamOutput]])
1713+
);
1714+
1715+
const bashCall = execSpy.mock.calls.find(call => {
1716+
const script = (call[1] as string[])[1] ?? '';
1717+
return script.includes('_ARCHON_NODE_UPSTREAM_OUTPUT');
1718+
});
1719+
expect(bashCall).toBeDefined();
1720+
const envArg = (bashCall![2] as { env: NodeJS.ProcessEnv }).env;
1721+
expect(envArg?._ARCHON_NODE_UPSTREAM_OUTPUT).toBe(upstreamOutput);
1722+
} finally {
1723+
execSpy.mockRestore();
1724+
}
1725+
});
16151726
});
16161727

16171728
describe('executeDagWorkflow -- output_format structured output', () => {

packages/workflows/src/dag-executor.ts

Lines changed: 9 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -233,22 +233,14 @@ function shellQuote(value: string): string {
233233
return `'${value.replaceAll("'", "'\\''")}'`;
234234
}
235235

236-
/**
237-
* For bash node scripts: replaces $nodeId.output references (whole-output, no .field)
238-
* with env-var expansions ("${_ARCHON_NODE_<ID>_OUTPUT}") and returns the env vars
239-
* to inject into the subprocess environment.
240-
*
241-
* This avoids inline shell-quoting of arbitrarily large LLM outputs — the same
242-
* pattern used for $USER_MESSAGE / $ARGUMENTS / $LOOP_* since #1651.
243-
* Field-access refs ($nodeId.output.field) are left for substituteNodeOutputRefs.
244-
*/
236+
// Replaces whole-output $nodeId.output refs with env-var expansions to avoid
237+
// inline shell-quoting of large LLM outputs. Field refs left for substituteNodeOutputRefs.
245238
export function extractNodeOutputEnvVars(
246239
script: string,
247240
nodeOutputs: Map<string, NodeOutput>
248241
): { script: string; envVars: Record<string, string> } {
249242
const envVars: Record<string, string> = {};
250243
const modifiedScript = script.replace(
251-
// Match $nodeId.output NOT followed by .fieldname (negative lookahead)
252244
/\$([a-zA-Z_][a-zA-Z0-9_-]*)\.output(?!\.[a-zA-Z_])/g,
253245
(match, nodeId: string) => {
254246
const nodeOutput = nodeOutputs.get(nodeId);
@@ -257,6 +249,9 @@ export function extractNodeOutputEnvVars(
257249
return match;
258250
}
259251
const envVarName = `_ARCHON_NODE_${nodeId.replace(/-/g, '_').toUpperCase()}_OUTPUT`;
252+
if (envVarName in envVars) {
253+
getLog().warn({ nodeId, envVarName }, 'dag.node_output_env_var_collision');
254+
}
260255
envVars[envVarName] = nodeOutput.output;
261256
// Double-quoted expansion: prevents word-splitting without inline quoting
262257
return `"\${${envVarName}}"`;
@@ -1335,7 +1330,7 @@ async function executeBashNode(
13351330
{ shellSafe: true }
13361331
);
13371332
// Pass $nodeId.output (whole output) via env vars to avoid inline shell-quoting of
1338-
// large values — see #1717. Field-access refs ($nodeId.output.field) remain inline.
1333+
// large values. Field-access refs ($nodeId.output.field) remain inline.
13391334
const { script: scriptWithEnvVarRefs, envVars: nodeOutputEnvVars } = extractNodeOutputEnvVars(
13401335
substitutedScript,
13411336
nodeOutputs
@@ -2178,13 +2173,13 @@ async function executeLoopNode(
21782173
undefined,
21792174
{ shellSafe: true }
21802175
);
2181-
// Pass $nodeId.output via env vars to avoid inline shell-quoting of large values — see #1717.
2176+
// Pass $nodeId.output via env vars to avoid inline shell-quoting of large values.
21822177
const { script: bashWithEnvVarRefs, envVars: untilBashNodeEnvVars } =
21832178
extractNodeOutputEnvVars(bashPrompt, nodeOutputs);
21842179
const substitutedBash = substituteNodeOutputRefs(
21852180
bashWithEnvVarRefs,
21862181
nodeOutputs,
2187-
true // escapedForBash — field refs only remain
2182+
true // escapedForBash
21882183
);
21892184
await execFileAsync('bash', ['-c', substitutedBash], {
21902185
cwd,
@@ -2200,6 +2195,7 @@ async function executeLoopNode(
22002195
EXTERNAL_CONTEXT: issueContext ?? '',
22012196
ISSUE_CONTEXT: issueContext ?? '',
22022197
...untilBashNodeEnvVars,
2198+
...(config.envVars ?? {}),
22032199
},
22042200
});
22052201
bashComplete = true; // exit 0 = complete

0 commit comments

Comments
 (0)