|
1 | 1 | # pi-web-minimal |
2 | 2 |
|
3 | | -Web, code, docs, and URL fetch tools for Pi with a context firewall. |
| 3 | +Web research for Pi agents without trashing the context window. |
4 | 4 |
|
5 | | -The goal: give the agent useful evidence, not a landfill. Tools retrieve sources, store raw evidence out of context, then return a compact source-cited brief. Tiny results are compacted without a model call; larger results are distilled with Pi's model. Raw content stays available by `responseId`. |
| 5 | +LLMs research badly: a single `fetch` or search dump can blow 50k tokens of HTML, ads, and nav chrome into context, evicting the actual work. This package wraps Exa + Context7 behind tools that **store raw results out-of-band and return a short, source-cited brief**. The agent gets evidence; you keep your context budget. |
6 | 6 |
|
7 | | -No browser session. No curator UI. No video/PDF pipeline. No broad provider stack. |
| 7 | +Suckless by design. No browser session, no curator UI, no PDF/video pipeline, no provider zoo. |
8 | 8 |
|
9 | | -## Install |
| 9 | +## How |
10 | 10 |
|
11 | | -```bash |
12 | | -pi install npm:pi-web-minimal |
13 | | -``` |
| 11 | +Two-stage pipeline per call: |
14 | 12 |
|
15 | | -## Configure |
| 13 | +1. **Retrieve** via Exa / Context7 / git clone. Full payload written to disk under a `responseId`. |
| 14 | +2. **Distill** before returning: |
| 15 | + - Small payloads → deterministic compaction (no model call). |
| 16 | + - Larger payloads → your active Pi model runs as a context firewall: fixed sections, every claim cites `[S#]`, retrieved text treated as untrusted data. Output is validated; runs that fail validation fall back to the raw content. |
| 17 | + |
| 18 | +You pay one small model call to avoid pasting 50k tokens of HTML into the main context. Override the distiller with `PI_WEB_MINIMAL_DISTILL_MODEL=provider/model-id`. Set `PI_OFFLINE=1` to skip distillation entirely. |
| 19 | + |
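The two-stage flow in the new "How" section can be sketched roughly as follows. This is a minimal illustration, not the package's code: every name here (`respond`, `storeRaw`, `isValidBrief`, `SMALL_PAYLOAD_CHARS`) is hypothetical, the size threshold is an assumed value the README does not state, and an in-memory object stands in for the on-disk store.

```typescript
// Hypothetical sketch: none of these names come from pi-web-minimal itself.
type Source = { url: string; text: string };
type Brief = { responseId: string; mode: "compact" | "distilled" | "raw"; body: string };
type Distiller = (sources: Source[]) => string;

// Stand-in for the on-disk store keyed by responseId.
const rawStore: { [responseId: string]: Source[] } = {};
let nextId = 0;

// Assumed cutoff; the README does not say where "small" ends.
const SMALL_PAYLOAD_CHARS = 2000;

// Stage 1: keep the full payload out of context, addressable by responseId.
function storeRaw(sources: Source[]): string {
  nextId += 1;
  const responseId = "resp-" + nextId;
  rawStore[responseId] = sources;
  return responseId;
}

// Escape hatch: look the raw evidence up again later.
function getRaw(responseId: string): Source[] | undefined {
  return rawStore[responseId];
}

// Deterministic compaction: collapse whitespace and tag each source as [S#].
function compact(sources: Source[]): string {
  return sources
    .map((s, i) => "[S" + (i + 1) + "] " + s.url + ": " + s.text.replace(/\s+/g, " ").trim())
    .join("\n");
}

// Validation: every non-empty line must cite at least one existing source.
function isValidBrief(body: string, sourceCount: number): boolean {
  const lines = body.split("\n").filter((l) => l.trim() !== "");
  if (lines.length === 0) return false;
  return lines.every((line) => {
    const cites = (line.match(/\[S\d+\]/g) || []).map((c) => Number(c.slice(2, -1)));
    return cites.length > 0 && cites.every((n) => n >= 1 && n <= sourceCount);
  });
}

// Stage 2: compact small payloads, distill large ones, fall back to raw on a bad run.
function respond(sources: Source[], distill: Distiller): Brief {
  const responseId = storeRaw(sources);
  const totalChars = sources.reduce((n, s) => n + s.text.length, 0);
  if (totalChars <= SMALL_PAYLOAD_CHARS) {
    return { responseId, mode: "compact", body: compact(sources) };
  }
  const candidate = distill(sources);
  if (isValidBrief(candidate, sources.length)) {
    return { responseId, mode: "distilled", body: candidate };
  }
  const raw = sources.map((s, i) => "[S" + (i + 1) + "] " + s.url + "\n" + s.text).join("\n\n");
  return { responseId, mode: "raw", body: raw };
}
```

The distiller is passed in as a plain function so the sketch stays independent of any model API; in the real package that role is played by the active Pi model or the one named in `PI_WEB_MINIMAL_DISTILL_MODEL`.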
| 20 | +## Install |
16 | 21 |
|
17 | 22 | ```bash |
| 23 | +pi install npm:pi-web-minimal |
18 | 24 | export EXA_API_KEY=exa-... |
19 | 25 | export CONTEXT7_API_KEY=ctx7sk-... |
20 | | -# optional: use a different Pi-registered model for distillation |
21 | | -export PI_WEB_MINIMAL_DISTILL_MODEL=provider/model-id |
22 | 26 | ``` |
23 | 27 |
|
24 | 28 | Or `~/.pi/web-search.json`: |
25 | 29 |
|
26 | 30 | ```json |
27 | | -{ |
28 | | - "exaApiKey": "exa-...", |
29 | | - "context7ApiKey": "ctx7sk-...", |
30 | | - "distillModel": "provider/model-id" |
31 | | -} |
| 31 | +{ "exaApiKey": "...", "context7ApiKey": "...", "distillModel": "provider/model-id" } |
32 | 32 | ``` |
33 | 33 |
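The README offers two configuration routes, environment variables and `~/.pi/web-search.json`, without saying which wins when both are set. A minimal sketch of merging them, assuming environment variables take precedence (that ordering is a guess, and `resolveConfig` is a hypothetical helper, not part of the package):

```typescript
// Hypothetical helper; the precedence (env over file) is an assumption, not documented behavior.
type WebSearchConfig = {
  exaApiKey?: string;
  context7ApiKey?: string;
  distillModel?: string;
};

type Env = { [name: string]: string | undefined };

// Merge environment variables with the parsed contents of ~/.pi/web-search.json.
// Empty strings are treated as unset so a blank export does not mask the file value.
function resolveConfig(env: Env, file: WebSearchConfig): WebSearchConfig {
  return {
    exaApiKey: env.EXA_API_KEY || file.exaApiKey,
    context7ApiKey: env.CONTEXT7_API_KEY || file.context7ApiKey,
    distillModel: env.PI_WEB_MINIMAL_DISTILL_MODEL || file.distillModel,
  };
}
```

Taking the environment and the parsed file as arguments keeps the function pure; a caller would pass `process.env` and the result of `JSON.parse` on the config file.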
|
34 | | -Exa powers web/code/content fallback. Context7 powers docs. Distillation uses the active Pi model unless overridden. |
35 | | - |
36 | 34 | ## Tools |
37 | 35 |
|
38 | | -| Tool | Use it for | Default output | |
39 | | -| --- | --- | --- | |
40 | | -| `web_search` | current web/source discovery | compact/distilled source-cited brief | |
41 | | -| `fetch_content` | URLs and GitHub repos | compact/distilled source-cited brief | |
42 | | -| `code_search` | API docs, examples, debugging evidence | compact/distilled source-cited brief | |
43 | | -| `documentation_search` | current library docs via Context7 | compact/distilled source-cited brief | |
44 | | -| `get_search_content` | raw stored evidence by `responseId` | bounded raw content | |
45 | | - |
46 | | -GitHub repos are shallow-cloned to `/tmp/pi-github-repos` for direct filesystem inspection. |
47 | | - |
48 | | -## Design contract |
49 | | - |
50 | | -- Tool output must earn its place in the agent context. |
51 | | -- Raw evidence is stored, not dumped. |
52 | | -- Claims in compact/distilled output cite `[S#]` sources. |
53 | | -- Retrieved content is untrusted; source instructions are not followed. |
54 | | -- `get_search_content` is the raw audit/escape hatch. |
55 | | -- Quality is measured by agent evals: task success, context reduction, citation validity, no fallbacks, injection resistance, and avoiding redundant follow-up calls. |
| 36 | +| Tool | For | |
| 37 | +| --- | --- | |
| 38 | +| `web_search` | discover current sources | |
| 39 | +| `code_search` | API/code examples | |
| 40 | +| `documentation_search` | live library docs (Context7) | |
| 41 | +| `fetch_content` | URLs + GitHub repos (shallow-cloned to `/tmp/pi-github-repos`) | |
| 42 | +| `get_search_content` | raw escape hatch by `responseId` when distillation dropped something you needed | |
56 | 43 |
|
57 | | -See `docs/agent-tool-audit.md` for details. |
58 | | - |
59 | | -## Development |
| 44 | +## Dev |
60 | 45 |
|
61 | 46 | ```bash |
62 | | -bun install |
63 | | -bun test |
64 | | -bun run typecheck |
65 | | -bun run check |
66 | | -bun pm pack --dry-run |
67 | | -PI_OFFLINE=1 bunx --bun pi --no-extensions -e . --list-models >/tmp/pi-web-minimal-pi-load.out |
| 47 | +bun test && bun run typecheck && bun run check |
68 | 48 | ``` |
69 | 49 |
|
70 | | -Live checks: |
71 | | - |
72 | | -```bash |
73 | | -RUN_LIVE_TESTS=1 bun test live.test.ts |
74 | | -RUN_AGENT_EVAL=1 PI_EVAL_MODEL=<provider/model> bun test agent-eval.test.ts |
75 | | -``` |
| 50 | +See `AGENTS.md` for the validation gauntlet and `docs/agent-tool-audit.md` for design notes.