Tool Selection and Timeouts — Guardrails and Fix Patterns

You are in a sub-page of Safety_PromptIntegrity.

A practical guide to choosing the right tools, bounding their behavior, and preventing loops or silent stalls.
Use this page when the model calls the wrong tool, produces prose instead of JSON, or keeps retrying a dead endpoint.

When to use this page

Tool calls loop or never return useful output.
The wrong tool is picked even when inputs match another tool better.
JSON mode breaks and the model replies with natural language.
Latency spikes after deploy or under bursty traffic.
Multi-agent plans hang on a blocked tool or long queue.

Open these first

Threat model and defenses: prompt_injection.md
Role hygiene and separation: role_confusion.md
JSON mode and schema locks: json_mode_and_tool_calls.md
Memory isolation: memory_fences_and_state_keys.md
Cite then explain discipline: citation_first.md
RAG traceability and contracts: retrieval-traceability.md · data-contracts.md
Live ops and debugging: ops/live_monitoring_rag.md · ops/debug_playbook.md

Core acceptance

Tool selection accuracy ≥ 0.98 on a 50-case gold set.
P95 tool latency within budget for each class: HTTP, search, code-run, vector.
Zero unbounded calls. Every tool has a timeout, retry policy, and idempotency key.
Invalid JSON rate < 0.5 percent with strict schema validation.
ΔS(question, cited snippet) ≤ 0.45 after tool orchestration. λ remains convergent on two seeds.
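
The first four targets above can be checked mechanically. The following is a minimal sketch, not the project's own harness: the function names, the nearest-rank percentile method, and the shape of the inputs are all assumptions made for illustration. ΔS and λ are left out because this page does not define how they are computed.

```python
def selection_accuracy(expected, predicted):
    """Fraction of gold cases where the chosen tool matches the expected tool."""
    if not expected or len(expected) != len(predicted):
        raise ValueError("expected and predicted must be non-empty and equal length")
    hits = sum(e == p for e, p in zip(expected, predicted))
    return hits / len(expected)


def p95(samples_ms):
    """Nearest-rank 95th percentile of latency samples (assumed method)."""
    if not samples_ms:
        raise ValueError("no latency samples")
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n), 1-based
    return ordered[rank - 1]


def acceptance_report(expected, predicted, latencies_ms_by_class,
                      budget_ms_by_class, invalid_json_count, total_calls):
    """Return a dict of pass/fail flags, one per acceptance target."""
    report = {
        "selection_accuracy": selection_accuracy(expected, predicted) >= 0.98,
        "invalid_json_rate": invalid_json_count / total_calls < 0.005,
    }
    for tool_class, samples in latencies_ms_by_class.items():
        report[f"p95_{tool_class}"] = p95(samples) <= budget_ms_by_class[tool_class]
    return report
```

On a 50-case gold set, the 0.98 threshold allows at most one wrong selection, since 49/50 = 0.98.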

Fix in 60 seconds

1. Lock the allowlist: expose only the tools the task needs. Everything else is unavailable.
2. Set hard time budgets: set a per-tool timeout and a total orchestration budget. Expose both to the model.
3. Validate I/O: enforce a JSON schema on inputs and outputs. Reject and re-ask on failure.
4. Apply backoff and caps: retry with capped attempts and jitter. Never retry indefinitely.
5. Observe ΔS and λ: if ΔS stays high while tool usage changes, prefer a reranker or a different retriever before trying new tools.
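
The "reject and re-ask" step can be sketched as follows. This is an illustrative, standard-library-only check, not a full JSON Schema validator: `SEARCH_ARGS_SCHEMA`, `validate_tool_call`, `call_with_reask`, and the `ask_model` callback are hypothetical names, and the cap of three asks is an assumption.

```python
import json

# Hypothetical schema: required field names mapped to expected Python types.
SEARCH_ARGS_SCHEMA = {"query": str, "k": int}


def validate_tool_call(raw, schema):
    """Return (args, None) if raw is a JSON object matching schema, else (None, reason)."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"not valid JSON: {exc.msg}"
    if not isinstance(args, dict):
        return None, "top-level value must be a JSON object"
    for name, expected_type in schema.items():
        if name not in args:
            return None, f"missing field: {name}"
        value = args[name]
        # bool is a subclass of int in Python, so reject it for non-bool fields.
        if isinstance(value, bool) and expected_type is not bool:
            return None, f"field {name} must be {expected_type.__name__}"
        if not isinstance(value, expected_type):
            return None, f"field {name} must be {expected_type.__name__}"
    extra = sorted(set(args) - set(schema))
    if extra:
        return None, f"unexpected fields: {extra}"
    return args, None


def call_with_reask(ask_model, schema, max_asks=3):
    """Ask for a tool call; on validation failure, re-ask with the rejection reason."""
    feedback = None
    for _ in range(max_asks):
        raw = ask_model(feedback)
        args, error = validate_tool_call(raw, schema)
        if error is None:
            return args
        feedback = f"Previous output rejected ({error}). Reply with one JSON object only."
    raise ValueError("model did not produce a schema-valid tool call")
```

The cap on asks keeps the re-ask loop bounded. In production a full schema validator would likely replace the type-map check.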

Typical symptoms → exact fix

Symptom	Likely cause	Open this
The model picks a browser tool for local facts	Tool palette too broad, weak routing	json_mode_and_tool_calls.md, role_confusion.md
Tool loops after a 429	Missing backoff and idempotency	ops/debug_playbook.md
RAG tool returns wrong snippet	Metric or index mismatch	retrieval-playbook.md, embedding-vs-semantic.md
JSON mode breaks and prose appears	Schema not enforced	json_mode_and_tool_calls.md
Multi-agent stalls at a tool step	Memory overwrite or missing fence	memory_fences_and_state_keys.md, Multi-Agent_Problems.md

Minimal policy you can paste

Use this inside your system prompt or orchestrator config.

Tool policy:
- Only use tools from this allowlist and only for their stated purpose.
- Every tool call must be a single JSON object that validates the schema shown with the tool.
- If a tool times out or returns an error, retry at most 2 times with exponential backoff (base 1.7) and jitter.
- Respect the total time budget: {total_budget_ms} for all tool usage in this request.
- Do not chain tools unless the previous tool returned a schema-valid result.
- If no tool is suitable, answer without a tool and say which tool would have been required.
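
On the orchestrator side, the retry and budget rules in this policy might be enforced roughly as below. This is a sketch under stated assumptions: `ToolError`, `backoff_delays`, and `call_with_retries` are hypothetical names, the 0.5 s first delay is an assumed value (the policy fixes only the base of 1.7 and the cap of 2 retries), and "full jitter" (scaling each delay by a random factor in [0, 1)) is one common jitter choice, not something the policy specifies.

```python
import random
import time


class ToolError(Exception):
    """Raised by a tool wrapper on timeout or a retryable error (hypothetical)."""


def backoff_delays(max_retries=2, base=1.7, first_delay_s=0.5, rng=random.random):
    """Yield one delay per retry: first_delay_s * base**attempt, scaled by jitter."""
    for attempt in range(max_retries):
        yield first_delay_s * (base ** attempt) * rng()


def call_with_retries(tool, args, total_budget_s, max_retries=2,
                      sleep=time.sleep, clock=time.monotonic):
    """Call tool(args), retrying on ToolError within the retry cap and time budget."""
    deadline = clock() + total_budget_s
    delays = backoff_delays(max_retries)
    while True:
        try:
            return tool(args)
        except ToolError:
            delay = next(delays, None)
            # Give up when retries are exhausted or the wait would exceed the budget.
            if delay is None or clock() + delay >= deadline:
                raise
            sleep(delay)
```

With `max_retries=2` a tool is called at most three times. The `sleep` and `clock` parameters exist so that tests can replace them.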

Orchestrator defaults

Set these once. Keep them consistent across environments.

Timeouts: HTTP 8–12 s per call. Vector search 2–4 s. Browser or scraping 10–20 s with a hard cap. Code-run or sandbox 20–40 s.
Retries: retry on 429, 503, and connection reset. Maximum 2 retries with jitter. No retries for 4xx other than 429.
Idempotency: compute idempotency_key = sha256(tool_name + args_hash + mem_state_hash) before any side effect.
Budgets: set a per-tool budget and a global budget. When less than 15 percent of the global budget remains, stop calling tools and return the best answer with citations.
Cancellation: cancel slower duplicates. Keep the fastest successful call for a given tool class.
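
The idempotency key and the 15 percent budget rule can be sketched as follows. The helper names (`stable_hash`, `idempotency_key`, `may_call_tools`) are hypothetical. Hashing arguments as sorted-key JSON and joining the three parts with a `|` separator are assumptions added here so that equal inputs always produce equal keys; the formula above only says the three parts are combined and hashed with SHA-256.

```python
import hashlib
import json


def stable_hash(value):
    """SHA-256 hex digest of a JSON-serializable value, independent of key order."""
    payload = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def idempotency_key(tool_name, args, mem_state):
    """sha256 over tool name, args hash, and memory-state hash."""
    args_hash = stable_hash(args)
    mem_state_hash = stable_hash(mem_state)
    # Assumed separator: keeps the boundary between parts unambiguous.
    material = "|".join([tool_name, args_hash, mem_state_hash])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def may_call_tools(spent_ms, total_budget_ms, reserve_percent=15):
    """False once less than reserve_percent of the global budget remains."""
    remaining_ms = total_budget_ms - spent_ms
    # Integer arithmetic avoids floating-point edge cases at the threshold.
    return remaining_ms * 100 >= reserve_percent * total_budget_ms
```

Computing the key before any side effect lets a retried call be recognized as a duplicate by whichever component stores the keys.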

Routing hints for model

Give the model a short rubric so it can choose tools correctly.

Routing rubric:
- Retrieval or citation needed → call retriever tool first. Then cite, then reason.
- Need ordering control for a long candidate list → use reranker instead of asking the LLM to sort.
- When the input already contains the answer text → do not search, answer with citations.
- Use browser only when the answer depends on a fresh webpage and the site is in the allowlist.
- If tool returns non-JSON or missing fields → request a retry with the same schema.

See also: rerankers.md · citation_first.md
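
The rubric is written for the model, but an orchestrator can apply the same rules as a deterministic pre-check. The sketch below is one possible reading of it: the `Request` fields, `ALLOWED_SITES`, the `RERANK_THRESHOLD` value, and the order in which the rules are tested are all assumptions made for illustration.

```python
from dataclasses import dataclass

ALLOWED_SITES = {"docs.example.com"}  # hypothetical site allowlist
RERANK_THRESHOLD = 20  # hypothetical cutoff for a "long" candidate list


@dataclass
class Request:
    """Signals an orchestrator might extract before routing (hypothetical)."""
    answer_in_input: bool = False
    needs_fresh_page: bool = False
    site: str = ""
    candidate_count: int = 0
    needs_citation: bool = False


def route(req):
    """Return the tool to call first, or "none" to answer without a tool."""
    if req.answer_in_input:
        return "none"  # answer from the input, with citations
    if req.needs_fresh_page:
        # Browser only for fresh pages on allowlisted sites.
        return "browser" if req.site in ALLOWED_SITES else "none"
    if req.candidate_count > RERANK_THRESHOLD:
        return "reranker"  # do not ask the LLM to sort a long list
    if req.needs_citation:
        return "retriever"  # retrieve first, then cite, then reason
    return "none"
```

A pre-check like this can also be used to score the model's own choices against the gold set.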

Red team probes

Run these with three paraphrases. Expect identical safe behavior.

429 storm on the primary retriever.
Browser returns HTML with script tags and meta refresh.
Vector store latency spikes to 6 s P95.
Tool returns prose inside a JSON field.
Agent handoff where the second agent tries to change the tool palette.

If any probe flips λ or breaks JSON, open: json_mode_and_tool_calls.md · role_confusion.md

Runbook snippet

Use this during incidents.

Check live metrics: error rate by tool, P95 latency, timeout count, retry count.
Triage the worst tool. Reduce k, switch to reranker, or skip non-critical tools.
Apply a tighter timeout to the failing tool and raise the backoff base.
Flip to a warm standby retriever or cache layer.
Re-run the gold probes. Ship only after acceptance targets pass.

Related ops pages: ops/live_monitoring_rag.md · ops/debug_playbook.md

