Commit 2927328

Improve context-safe web evidence retrieval
1 parent 93360ca commit 2927328

12 files changed

Lines changed: 941 additions & 340 deletions

README.md

Lines changed: 7 additions & 6 deletions
@@ -10,12 +10,13 @@ Suckless by design. No browser session, no curator UI, no PDF/video pipeline, no
 
 Two-stage pipeline per call:
 
-1. **Retrieve** via Exa / Context7 / git clone. Full payload written to disk under a `responseId`.
-2. **Distill** before returning:
-   - Small payloads → deterministic compaction (no model call).
-   - Larger payloads → your active Pi model runs as a context firewall: fixed sections, every claim cites `[S#]`, retrieved text treated as untrusted data. Output is validated; bad runs fall back to raw.
+1. **Retrieve** via Exa / Context7 / git clone. Raw evidence is stored out-of-band under a `responseId`; session persistence is bounded so long runs do not bloat context/history.
+2. **Distill/extract** before returning:
+   - Small payloads → deterministic extractive compaction (no model call).
+   - Larger payloads → your active Pi model runs as a context firewall over ranked snippets: fixed sections, every finding cites `[S#]`, retrieved text treated as untrusted data. Output is validated.
+   - If model distillation is unavailable, the fallback is a bounded extractive report, not a first-N raw dump.
 
-You pay one small model call to avoid pasting 50k tokens of HTML into the main context. Override the distiller with `PI_WEB_MINIMAL_DISTILL_MODEL=provider/model-id`. Set `PI_OFFLINE=1` to skip distillation entirely.
+You pay one small model call to avoid pasting 50k tokens of HTML into the main context. Override the distiller with `PI_WEB_MINIMAL_DISTILL_MODEL=provider/model-id`. Set `PI_OFFLINE=1` to skip model distillation and use deterministic extraction.
 
 ## Install
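
The routing described in this README hunk (deterministic compaction for small payloads, model distillation for larger ones, bounded extractive fallback otherwise) can be sketched roughly as follows. This is an illustration only, not the package's code: `distill`, `rankedSnippets`, `extractiveReport`, the `Distiller` type, both size constants, and the term-overlap ranking are assumptions made for the example. Only the three outcomes, the `[S#]` citation requirement, and the offline behavior come from the diff.

```typescript
// Hypothetical sketch; names, thresholds, and ranking are assumptions.
type Source = { ref: string; text: string };
type Distiller = (snippets: string[]) => string | null;
type Result = { mode: "compact" | "distilled" | "extractive-fallback"; output: string };

const SMALL_PAYLOAD_CHARS = 2000; // assumed cutoff for "small payloads"
const REPORT_BUDGET_CHARS = 400;  // assumed cap on the extractive report

// Rank sentences by how many query terms they contain, keeping the source ref.
function rankedSnippets(query: string, sources: Source[]): string[] {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  const scored: { line: string; score: number }[] = [];
  for (const s of sources) {
    for (const sentence of s.text.split(/[.!?]\s+|\n+/)) {
      const trimmed = sentence.trim();
      if (trimmed.length === 0) continue;
      const lower = trimmed.toLowerCase();
      const score = terms.filter((t) => lower.includes(t)).length;
      if (score > 0) scored.push({ line: `${trimmed} [${s.ref}]`, score });
    }
  }
  return scored.sort((a, b) => b.score - a.score).map((x) => x.line);
}

// Bounded extractive report: best-ranked snippets first, never more than `budget` chars.
function extractiveReport(query: string, sources: Source[], budget: number): string {
  const out: string[] = [];
  let used = 0;
  for (const line of rankedSnippets(query, sources)) {
    if (used + line.length + 3 > budget) break; // 3 = "- " prefix plus newline
    out.push(`- ${line}`);
    used += line.length + 3;
  }
  return out.join("\n");
}

function distill(
  query: string,
  sources: Source[],
  model: Distiller | null,
  offline: boolean,
): Result {
  const total = sources.reduce((n, s) => n + s.text.length, 0);
  if (total <= SMALL_PAYLOAD_CHARS) {
    // Small payload: no model call.
    return { mode: "compact", output: extractiveReport(query, sources, REPORT_BUDGET_CHARS) };
  }
  if (!offline && model !== null) {
    const out = model(rankedSnippets(query, sources));
    // Simplified validation: accept only output that carries at least one [S#] citation.
    if (out !== null && /\[S\d+\]/.test(out)) return { mode: "distilled", output: out };
  }
  // Model unavailable, offline, or invalid output: bounded extractive report.
  return { mode: "extractive-fallback", output: extractiveReport(query, sources, REPORT_BUDGET_CHARS) };
}
```

In this sketch the fallback reuses the same ranked extraction as compact mode, so a failed model run still returns cited, budgeted findings instead of the first N characters of raw text.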

@@ -39,7 +40,7 @@ Or `~/.pi/web-search.json`:
 | `code_search` | API/code examples |
 | `documentation_search` | live library docs (Context7) |
 | `fetch_content` | URLs + GitHub repos (shallow-cloned to `/tmp/pi-github-repos`) |
-| `get_search_content` | raw escape hatch by `responseId` when distillation dropped something you needed |
+| `get_search_content` | raw escape hatch by `responseId` with `sourceIndex`/`urlIndex`, `offset`, `section`, or `textSearch` selectors when distillation dropped something you needed |
 
 ## Dev
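
To make the new `get_search_content` selectors more concrete, here is a minimal sketch of how `sourceIndex`, `offset`, `section`, `textSearch`, and `maxCharacters` might narrow a stored payload. The selector names come from the diff; their exact semantics, the `selectContent` and `sectionRange` helpers, the 0-based index, the Markdown-heading interpretation of `section`, and the default budget are assumptions made for illustration.

```typescript
// Hypothetical sketch; the real tool's semantics may differ.
interface Selector {
  sourceIndex?: number;   // which stored source to read (assumed 0-based)
  offset?: number;        // characters to skip within the selected range
  section?: string;       // heading title to extract (assumed Markdown headings)
  textSearch?: string;    // start at the first case-insensitive match
  maxCharacters?: number; // explicit opt-in to a larger slice
}

const DEFAULT_MAX_CHARACTERS = 500; // assumed default budget

// Find [start, end) of the section under a heading, ending at the next
// heading of the same or a higher level.
function sectionRange(text: string, title: string): [number, number] | null {
  const heading = /^(#{1,6})[ \t]+(.*)$/gm;
  const wanted = title.trim().toLowerCase();
  let start = -1;
  let level = 0;
  let m: RegExpExecArray | null;
  while ((m = heading.exec(text)) !== null) {
    const hashes = m[1] ?? "";
    const name = (m[2] ?? "").trim().toLowerCase();
    if (start === -1) {
      if (name === wanted) {
        start = m.index;
        level = hashes.length;
      }
    } else if (hashes.length <= level) {
      return [start, m.index];
    }
  }
  return start === -1 ? null : [start, text.length];
}

function selectContent(sources: string[], sel: Selector): string {
  const text = sources[sel.sourceIndex ?? 0] ?? "";
  let start = 0;
  let end = text.length;
  if (sel.section !== undefined) {
    const range = sectionRange(text, sel.section);
    if (range === null) return "";
    start = range[0];
    end = range[1];
  } else if (sel.textSearch !== undefined) {
    const hit = text.toLowerCase().indexOf(sel.textSearch.toLowerCase());
    if (hit === -1) return "";
    start = hit;
  }
  start = Math.min(end, start + (sel.offset ?? 0));
  const max = sel.maxCharacters ?? DEFAULT_MAX_CHARACTERS;
  // The slice never exceeds the budget or leaves the selected range.
  return text.slice(start, Math.min(end, start + max));
}
```

The point of the design, as the table row suggests, is that a caller who needs something the distiller dropped can ask for one section or one search hit rather than rehydrating the whole stored payload.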

docs/agent-tool-audit.md

Lines changed: 5 additions & 5 deletions
@@ -14,7 +14,7 @@ pi-web-minimal is intentionally a retrieval + distillation package, not a browse
 
 ## Current verdict
 
-The package follows the new core pattern: five retrieval tools, no UI workflow, adaptive context-firewall output, raw evidence stored out of context, and no broad provider fallback stack. Tiny evidence is compacted deterministically; larger evidence is model-distilled with source refs. The main context-risk remains `get_search_content`: it can rehydrate large stored content into the conversation. It returns a bounded slice by default and requires `maxCharacters` to opt into more context.
+The package follows the core pattern: five retrieval tools, no UI workflow, adaptive context-firewall output, raw evidence stored out of context, and no broad provider fallback stack. Visible output is extractive-first: ranked findings plus a source manifest. Tiny evidence is compacted deterministically; larger evidence is model-distilled with source refs; model failures fall back to bounded extractive findings rather than raw previews. The main context-risk remains `get_search_content`, now mitigated by a tighter default budget plus `sourceIndex`/`urlIndex`, `offset`, `section`, and `textSearch` selectors.
 
 ## Tool surface

@@ -24,17 +24,17 @@ The package follows the new core pattern: five retrieval tools, no UI workflow,
 | `code_search` | Code/API examples and references | Compact or distilled source-cited brief; stores raw result by query index. |
 | `documentation_search` | Current library/framework docs via Context7 | Compact or distilled source-cited brief; stores raw documentation context by query index. |
 | `fetch_content` | Specific URL/GitHub retrieval | Compact or distilled source-cited brief for fetched URL batches; stores raw per-URL content. |
-| `get_search_content` | Explicit raw stored-content retrieval | Bounded raw content by default; caller chooses selector and `maxCharacters`. |
+| `get_search_content` | Explicit raw stored-content retrieval | Tighter bounded raw content by default; caller chooses source/url/query plus offset, heading section, text search, and `maxCharacters`. |
 
 Keeping these as separate tools is deliberate: the names map to distinct agent intents and avoid a large router schema.
 
 ## Context-pollution notes
 
 - `appendEntry()` custom entries are not sent to the LLM, so stored raw results avoid immediate context pollution.
-- Session files can still grow because stored content and distilled briefs are persisted for branch/reload recovery. This is acceptable for now because retrieval continuity and auditability are more valuable than an external cache, but it should be watched in eval reports.
+- Session persistence is deliberately bounded: full raw evidence stays in memory for the active session, while persisted custom entries are capped to preserve reload continuity without unbounded session growth.
 - Tool output must remain compact. Future additions should improve evidence selection, citations, selectors, filtering, pagination, or explicit retrieval rather than larger default outputs.
-- Tiny raw evidence should not be expanded through a model. Use deterministic compact mode unless the source is large enough to need synthesis.
-- Retrieved web/docs/code content is untrusted. Compact mode strips obvious instruction-like lines. Distillation prompts must separate source blocks from instructions and require source refs for substantive claims.
+- Tiny raw evidence should not be expanded through a model. Use deterministic extractive compact mode unless the source is large enough to need synthesis.
+- Retrieved web/docs/code content is untrusted. Extractive output filters obvious instruction-like lines. Distillation prompts must separate source blocks from instructions and require source refs for substantive claims.
 
 ## Autodiscovery notes
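
Two of the context-pollution notes above lend themselves to a small sketch: filtering instruction-like lines out of extractive output, and capping what gets persisted to the session. The code below is only an illustration under stated assumptions. The pattern list, `filterInstructionLike`, `capForPersistence`, and `PERSIST_CAP_CHARS` are hypothetical and do not reflect the package's actual filter or cap.

```typescript
// Hypothetical patterns; a real filter would need a broader, tested list.
const INSTRUCTION_PATTERNS: RegExp[] = [
  /^\s*ignore (all )?(previous|prior|above) instructions/i,
  /^\s*(system|assistant)\s*:/i,
  /^\s*you (are|must) now\b/i,
];

// Drop lines that look like instructions aimed at the model rather than evidence.
function filterInstructionLike(text: string): string {
  return text
    .split("\n")
    .filter((line) => !INSTRUCTION_PATTERNS.some((p) => p.test(line)))
    .join("\n");
}

const PERSIST_CAP_CHARS = 1000; // assumed cap for persisted custom entries

// Keep the full raw text in memory elsewhere; persist only a capped copy
// and record whether it was truncated.
function capForPersistence(
  raw: string,
  cap: number = PERSIST_CAP_CHARS,
): { text: string; truncated: boolean } {
  if (raw.length <= cap) return { text: raw, truncated: false };
  return { text: raw.slice(0, cap), truncated: true };
}
```

A line-level filter like this only catches obvious cases, which matches the wording of the note ("obvious instruction-like lines"); the distillation prompt still has to treat every source block as untrusted data.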
