Commit 31c61a4

curl (#2)
1 parent eed9a58 commit 31c61a4

42 files changed, +5691 -12 lines changed

AGENTS.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ echo 'false; echo $?' | pnpm dev:exec
 
 - Install packages via pnpm rather than editing package.json directly
 - Bias towards making new test files that are roughly logically grouped rather than letting test files get too large. Try to stay below 300 lines. Prefer making a new file when you want to add a `describe()`
-- Prefer asserting the full STDOUT/STDERR output rather than using to.contain or to.not.contain
+- Prefer asserting the full STDOUT/STDERR output rather than using toContain or not.toContain
 - Always also add `comparison-tests` for major command functionality, but edge cases should always be covered in unit tests which are much faster (`pnpm test:comparison`)
 - When you are unsure about bash/command behavior, create a `comparison-tests` test file to ensure compat.
 - `--help` does not need to pass comparison tests and should reflect actual capability

PROJECT.md

Lines changed: 22 additions & 0 deletions
@@ -217,3 +217,25 @@ xargs — build argument lists
 ## All before this is done
 
 Woohoo
+
+## Implementation phase 16: curl
+
+- make a new non-standard command called html-to-markdown which uses the turndown service (npm package) to turn HTML on STDIN into markdown
+- Let's implement curl as a wrapper around `fetch`
+- Start with the most common options: method, headers, etc.
+- `curl` should not be available by default. It should require explicit opt-in via argument to BashEnv or Sandbox.create
+- The opt-in requires an allow-list of allowed origin + path (optional) prefixes. Only those may be accessible via `fetch`
+- Must also be checked on redirects, so implement redirects in user land rather than relying on automatic following
+- This allow-list must be enforced at the fetch layer, not subject to parsing
+- Implement extensive unit tests for the allow-list matching specifically
+- add `dangerouslyAllowFullInternetAccess` option to bypass allow-list
+
+## Implementation phase 16.1: curl part 2
+
+- Make the usage statement for html-to-markdown more docs-like since the caller will not be aware of this
+- Write more adversarial tests against the allow-list enforcement
+- Write tests that check the allow-list is enforced e2e (via bash execution)
+- Make sure the tests have a really good mock of the underlying fetch and never actually go to the network
+- allow `pnpm shell` to access the internet and document it
+
+## Implementation phase 17: AI SDK Tool
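
Reviewer note: the phase 16 bullets describe an allow-list of origin plus optional path prefixes. The rendered part of this commit does not show the matcher itself, so the following is only a minimal sketch of how such a check could work. The name `isUrlAllowed` and its signature are hypothetical, and the real code under `src/network/` may differ.

```typescript
// Hypothetical helper, not the commit's actual implementation.
// Returns true only when rawUrl has the same origin as one of the
// prefixes and its path starts with that prefix's path.
function isUrlAllowed(rawUrl: string, allowedUrlPrefixes: string[]): boolean {
  let target: URL;
  try {
    target = new URL(rawUrl);
  } catch {
    return false; // unparseable input is never allowed
  }
  return allowedUrlPrefixes.some((prefix) => {
    let allowed: URL;
    try {
      allowed = new URL(prefix);
    } catch {
      return false; // ignore malformed allow-list entries
    }
    // Origin (scheme + host + port) must match exactly.
    if (target.origin !== allowed.origin) return false;
    // A prefix without a path parses to "/", which matches every path.
    return target.pathname.startsWith(allowed.pathname);
  });
}
```

Comparing parsed `URL` fields instead of raw strings means inputs such as `https://api.example.com@evil.com/` or `https://api.example.com.evil.com/` fail the origin check, and `..` segments are resolved before the path comparison. With plain `startsWith`, a prefix path without a trailing slash (for example `/repos/myorg`) would also match `/repos/myorg-evil`, so this sketch assumes prefixes end in `/` when a directory boundary is intended.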

README.md

Lines changed: 78 additions & 0 deletions
@@ -103,6 +103,12 @@ console.log(finished.exitCode); // 0
 pnpm shell
 ```
 
+The interactive shell has full internet access enabled by default, allowing you to use `curl` to fetch data from any URL. Use `--no-network` to disable this:
+
+```bash
+pnpm shell --no-network
+```
+
 ## Supported Commands
 
 ### File Operations
@@ -121,6 +127,10 @@ pnpm shell
 
 `alias`, `bash`, `chmod`, `clear`, `false`, `history`, `sh`, `true`, `unalias`
 
+### Network Commands
+
+`curl`, `html-to-markdown`
+
 All commands support `--help` for usage information.
 
 ## Shell Features
@@ -149,6 +159,74 @@ When created without options, BashEnv provides a Unix-like directory structure:
 
 Commands can be invoked by path (e.g., `/bin/ls`) or by name.
 
+## Network Access
+
+Network access (and the `curl` command) is disabled by default for security. To enable it, configure the `network` option:
+
+```typescript
+// Allow specific URLs with GET/HEAD only (safest)
+const env = new BashEnv({
+  network: {
+    allowedUrlPrefixes: [
+      "https://api.github.com/repos/myorg/",
+      "https://api.example.com",
+    ],
+  },
+});
+
+// Allow specific URLs with additional methods
+const env = new BashEnv({
+  network: {
+    allowedUrlPrefixes: ["https://api.example.com"],
+    allowedMethods: ["GET", "HEAD", "POST"], // Default: ["GET", "HEAD"]
+  },
+});
+
+// Allow all URLs and methods (use with extreme caution)
+const env = new BashEnv({
+  network: { dangerouslyAllowFullInternetAccess: true },
+});
+```
+
+**Note:** The `curl` command only exists when network is configured. Without network configuration, `curl` returns "command not found".
+
+### Allow-List Security
+
+The allow-list enforces:
+- **Origin matching**: URLs must match the exact origin (scheme + host + port)
+- **Path prefix**: Only paths starting with the specified prefix are allowed
+- **HTTP method restrictions**: Only GET and HEAD by default (configure `allowedMethods` for more)
+- **Redirect protection**: Redirects to non-allowed URLs are blocked
+
+```typescript
+// Only allow GitHub repos in myorg
+const env = new BashEnv({
+  network: { allowedUrlPrefixes: ["https://api.github.com/repos/myorg/"] },
+});
+
+// These work:
+// curl https://api.github.com/repos/myorg/repo1/issues
+// curl https://api.github.com/repos/myorg/repo2/pulls
+
+// These are blocked:
+// curl https://api.github.com/repos/otherorg/repo
+// curl https://api.github.com/users/user
+```
+
+### Using curl
+
+```bash
+# Fetch and process data
+curl -s https://api.example.com/data | grep pattern
+
+# Download and convert HTML to Markdown
+curl -s https://example.com | html-to-markdown
+
+# POST JSON data
+curl -X POST -H "Content-Type: application/json" \
+  -d '{"key":"value"}' https://api.example.com/endpoint
+```
+
 ## Execution Protection
 
 BashEnv includes protection against infinite loops and deep recursion:
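
Reviewer note: the README hunk above lists method restrictions and redirect protection, and PROJECT.md asks for redirects to be implemented in user land. The sketch below shows one way those two checks might be combined. It is not the commit's `createSecureFetch`. Every name here (`MinimalResponse`, `FetchLike`, `isMethodAllowed`, `nextRedirectUrl`, `fetchCheckingEveryHop`) is hypothetical, the loop handles GET only, and the fetch function is injected so tests never touch the network.

```typescript
// Hypothetical sketch; names are illustrative and not taken from the commit.
interface MinimalResponse {
  status: number;
  headers: { get(name: string): string | null };
}

type FetchLike = (
  url: string,
  init: { method: string; redirect: "manual" },
) => Promise<MinimalResponse>;

const REDIRECT_STATUSES = [301, 302, 303, 307, 308];

// Method check with the README's documented default of GET and HEAD.
function isMethodAllowed(
  method: string,
  allowedMethods: string[] = ["GET", "HEAD"],
): boolean {
  return allowedMethods.includes(method.toUpperCase());
}

// Returns the absolute redirect target, or null when res is not a redirect.
function nextRedirectUrl(currentUrl: string, res: MinimalResponse): string | null {
  const location = res.headers.get("location");
  if (!REDIRECT_STATUSES.includes(res.status) || location === null) return null;
  // Relative Location values are resolved against the current URL.
  return new URL(location, currentUrl).toString();
}

// Follows redirects manually so that every hop is checked before it is fetched.
async function fetchCheckingEveryHop(
  url: string,
  isAllowed: (url: string) => boolean,
  fetchImpl: FetchLike,
  maxRedirects = 10,
): Promise<MinimalResponse> {
  let current = url;
  for (let hop = 0; hop <= maxRedirects; hop++) {
    if (!isAllowed(current)) {
      throw new Error(`URL not in allow-list: ${current}`);
    }
    const res = await fetchImpl(current, { method: "GET", redirect: "manual" });
    const next = nextRedirectUrl(current, res);
    if (next === null) return res;
    current = next;
  }
  throw new Error("Too many redirects");
}
```

Passing `redirect: "manual"` is what keeps the decision in user land: the underlying fetch hands back the 3xx response instead of following it, so a redirect from an allowed prefix to `https://api.github.com/users/user` would be rejected on the next loop iteration before any request is made.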

package.json

Lines changed: 3 additions & 1 deletion
@@ -37,12 +37,14 @@
     "@biomejs/biome": "^2.3.10",
     "@types/node": "^25.0.3",
     "@types/sprintf-js": "^1.1.4",
+    "@types/turndown": "^5.0.6",
     "typescript": "^5.9.3",
     "vitest": "^4.0.16"
   },
   "dependencies": {
     "diff": "^8.0.2",
     "minimatch": "^10.1.1",
-    "sprintf-js": "^1.1.3"
+    "sprintf-js": "^1.1.3",
+    "turndown": "^7.2.2"
   }
 }

pnpm-lock.yaml

Lines changed: 20 additions & 0 deletions
Generated file; diff not rendered by default.

src/BashEnv.ts

Lines changed: 28 additions & 1 deletion
@@ -9,14 +9,22 @@
  */
 
 import type { FunctionDefNode } from "./ast/types.js";
-import { createLazyCommands } from "./commands/registry.js";
+import {
+  createLazyCommands,
+  createNetworkCommands,
+} from "./commands/registry.js";
 import { type IFileSystem, VirtualFs } from "./fs.js";
 import type { InitialFiles } from "./fs-interface.js";
 import {
   Interpreter,
   type InterpreterOptions,
   type InterpreterState,
 } from "./interpreter/index.js";
+import {
+  createSecureFetch,
+  type NetworkConfig,
+  type SecureFetch,
+} from "./network/index.js";
 import { type ParseException, parse } from "./parser/parser.js";
 import type { Command, CommandRegistry, ExecResult } from "./types.js";
 
@@ -33,6 +41,11 @@ export interface BashEnvOptions {
   maxCallDepth?: number;
   maxCommandCount?: number;
   maxLoopIterations?: number;
+  /**
+   * Network configuration for commands like curl.
+   * Network access is disabled by default - you must explicitly configure allowed URLs.
+   */
+  network?: NetworkConfig;
 }
 
 export class BashEnv {
@@ -42,6 +55,7 @@ export class BashEnv {
   private maxCallDepth: number;
   private maxCommandCount: number;
   private maxLoopIterations: number;
+  private secureFetch?: SecureFetch;
 
   // Interpreter state (shared with interpreter instances)
   private state: InterpreterState;
@@ -64,6 +78,11 @@ export class BashEnv {
     this.maxLoopIterations =
       options.maxLoopIterations ?? DEFAULT_MAX_LOOP_ITERATIONS;
 
+    // Create secure fetch if network is configured
+    if (options.network) {
+      this.secureFetch = createSecureFetch(options.network);
+    }
+
     // Initialize interpreter state
     this.state = {
       env,
@@ -105,6 +124,13 @@ export class BashEnv {
     for (const cmd of createLazyCommands()) {
       this.registerCommand(cmd);
     }
+
+    // Register network commands only when network is configured
+    if (options.network) {
+      for (const cmd of createNetworkCommands()) {
+        this.registerCommand(cmd);
+      }
+    }
   }
 
   registerCommand(command: Command): void {
@@ -156,6 +182,7 @@ export class BashEnv {
       maxCommandCount: this.maxCommandCount,
       maxLoopIterations: this.maxLoopIterations,
       exec: this.exec.bind(this),
+      fetch: this.secureFetch,
     };
 
     const interpreter = new Interpreter(interpreterOptions, this.state);
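
Reviewer note: the `src/BashEnv.ts` hunks gate both `createSecureFetch` and the network command registration on `if (options.network)`. The toy class below isolates that pattern to show why `curl` reports "command not found" when no `network` option is passed. `MiniEnv`, `SketchNetworkConfig`, and the error string format are assumptions made for illustration, not the project's real types or output.

```typescript
// Hypothetical stand-ins for the real Command and NetworkConfig types.
type CommandHandler = (args: string[]) => string;

interface SketchNetworkConfig {
  allowedUrlPrefixes?: string[];
  dangerouslyAllowFullInternetAccess?: boolean;
}

class MiniEnv {
  private commands = new Map<string, CommandHandler>();

  constructor(options: { network?: SketchNetworkConfig } = {}) {
    // Always-available commands.
    this.commands.set("echo", (args) => args.join(" "));

    // Network commands exist only when a network config object is passed,
    // mirroring the truthiness check in the diff.
    if (options.network) {
      this.commands.set("curl", () => "(a checked network request would run here)");
    }
  }

  run(name: string, args: string[] = []): string {
    const handler = this.commands.get(name);
    if (!handler) return `bash: ${name}: command not found`;
    return handler(args);
  }
}
```

Because the check is a truthiness test, any config object registers `curl` in this sketch, including one with an empty `allowedUrlPrefixes` array. In that case the command exists but every request would presumably be rejected at the fetch layer, which is the behavior the allow-list tests would need to confirm.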

src/cli/shell.ts

Lines changed: 21 additions & 1 deletion
@@ -29,6 +29,7 @@ interface ShellOptions {
   cwd?: string;
   files?: Record<string, string>;
   env?: Record<string, string>;
+  network?: boolean;
 }
 
 class VirtualShell {
class VirtualShell {
@@ -57,6 +58,11 @@ class VirtualShell {
5758
TERM: "xterm-256color",
5859
...options.env,
5960
},
61+
// Enable network access if requested (default: enabled for interactive shell)
62+
network:
63+
options.network !== false
64+
? { dangerouslyAllowFullInternetAccess: true }
65+
: undefined,
6066
});
6167

6268
// Check if stdin is a TTY (interactive mode)
@@ -172,6 +178,10 @@ ${colors.bold}Navigation & environment:${colors.reset}
 ${colors.bold}Utilities:${colors.reset}
   find, tee, basename, dirname, chmod, clear, history, alias, unalias
 
+${colors.bold}Network commands:${colors.reset}
+  curl              Fetch data from URLs (with full internet access)
+  html-to-markdown  Convert HTML to Markdown (use with curl)
+
 ${colors.bold}Supported features:${colors.reset}
   - Pipes: cmd1 | cmd2
   - Redirections: >, >>, 2>, 2>&1, <
@@ -189,6 +199,7 @@ ${colors.bold}Example commands:${colors.reset}
   find . -name "*.txt"
   ln -s target.txt link.txt
   awk '{print $1}' file.txt
+  curl -s https://example.com | html-to-markdown
 `);
   }
 
@@ -242,7 +253,7 @@ All operations run on a virtual in-memory filesystem.
 // CLI argument parsing
 function parseArgs(): ShellOptions {
   const args = process.argv.slice(2);
-  const options: ShellOptions = {};
+  const options: ShellOptions = { network: true }; // Network enabled by default
 
   for (let i = 0; i < args.length; i++) {
     if (args[i] === "--cwd" && args[i + 1]) {
@@ -256,17 +267,26 @@ function parseArgs(): ShellOptions {
         console.error(`Error reading files from ${filePath}:`, error);
         process.exit(1);
       }
+    } else if (args[i] === "--no-network") {
+      options.network = false;
     } else if (args[i] === "--help" || args[i] === "-h") {
       console.log(`
 Usage: npx tsx src/cli/shell.ts [options]
 
 Options:
   --cwd <dir>     Set initial working directory (default: /home/user)
   --files <json>  Load initial files from JSON file
+  --no-network    Disable network access (curl commands disabled)
   --help, -h      Show this help message
 
+Network Access:
+  By default, the interactive shell has full internet access enabled,
+  allowing curl commands to fetch data from any URL. Use --no-network
+  to disable this for sandboxed execution.
+
 Example:
   npx tsx src/cli/shell.ts --cwd /app --files ./my-files.json
+  pnpm shell --no-network
 `);
       process.exit(0);
     }
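
Reviewer note: the `src/cli/shell.ts` hunks default `network` to `true` in `parseArgs` and map it to `dangerouslyAllowFullInternetAccess` when constructing the environment. The sketch below restates that flow with `argv` passed in as a parameter so it can be unit tested without touching `process.argv`. `parseShellArgs`, `toNetworkConfig`, and `SketchShellOptions` are hypothetical names, and the `--files` and `--help` branches are omitted.

```typescript
// Hypothetical, trimmed-down version of ShellOptions.
interface SketchShellOptions {
  cwd?: string;
  network?: boolean;
}

// Same default as the diff: network is on unless --no-network is given.
function parseShellArgs(argv: string[]): SketchShellOptions {
  const options: SketchShellOptions = { network: true };
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] === "--cwd" && argv[i + 1]) {
      options.cwd = argv[++i];
    } else if (argv[i] === "--no-network") {
      options.network = false;
    }
  }
  return options;
}

// Mirrors the ternary in the VirtualShell constructor: anything other than
// an explicit false enables full internet access.
function toNetworkConfig(
  options: SketchShellOptions,
): { dangerouslyAllowFullInternetAccess: true } | undefined {
  return options.network !== false
    ? { dangerouslyAllowFullInternetAccess: true }
    : undefined;
}
```

Note that `options.network !== false` treats `undefined` as enabled, so a caller that builds the options object without going through the argument parser also gets full internet access. That matches the comment in the diff but is worth keeping in mind when the shell class is reused programmatically.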
