Skip to content

Latest commit

 

History

History
196 lines (144 loc) · 9.61 KB

File metadata and controls

196 lines (144 loc) · 9.61 KB

JSON Mode and Tool Calls · Prompt Assembly

🧭 Quick Return to Map

You are in a sub-page of PromptAssembly.
To reorient, go back here:

Think of this page as a desk within a ward.
If you need the full triage and all prescriptions, return to the Emergency Room lobby.

A field guide to keep JSON mode stable and tool calls safe. Use this page to clamp the schema, stop looped tools, and keep outputs auditable across providers.

Open these first

When to use this page

  • JSON mode returns invalid or partial objects.
  • Tool calls arrive with missing or extra fields.
  • The model mixes tool output with prose.
  • Tools loop or stall without timeouts.
  • Role text bleeds into user turns and corrupts parsing.
  • Outputs differ between seeds for the same inputs.

Acceptance targets

  • JSON parse success rate ≥ 0.99 across three paraphrases.
  • Tool call validity ≥ 0.98 by schema check.
  • Zero side effects on failed parse.
  • λ remains convergent across three paraphrases and two seeds.
  • ΔS(question, retrieved) ≤ 0.45 when the answer cites corpus evidence.

Map symptoms to structural fixes

  • Invalid JSON or mixed prose. → Lock format with a contract and a validator. See Data Contracts, Logic Collapse

  • Tool arguments drift or contain free text. → Use strict argument schemas and echo the schema in every step. See Prompt Injection

  • Tools loop or wait on each other. → Add timeouts and split memories by namespace and revision. See Multi-Agent Problems

  • Answers flip between runs for the same input. → Reorder headers and clamp variance with BBAM. Verify with Retrieval Traceability

  • Citations missing or point to the wrong snippet. → Enforce cite then explain. See Retrieval Playbook

Fix in 60 seconds

  1. Clamp the JSON contract Define a single object shape. Disallow extra fields. Reject prose outside the object.

  2. Echo the tool schema In every step the model must restate the allowed tools and their argument shapes. Reject any call that does not match.

  3. Add timeouts and retries Each tool has timeout_ms and at most N retries with exponential backoff. Abort on chain length overrun.

  4. Observability probes Log parse success, tool validity, ΔS, and λ. Trip a circuit if parse rate drops under 0.98 in a five minute window.

  5. No side effects before validation Validate JSON and tool calls first. Only then commit external writes.

Copy paste blocks

A. JSON answer contract

You must output a single JSON object. No Markdown. No code fences. No commentary.

Schema:
{
  "answer": "string",
  "citations": [{"source_url": "string", "snippet_id": "string"}],
  "tool_calls": [{"name": "tool_name", "args": { ... }}],
  "metrics": {"lambda_state": "->|<-|<>|x", "delta_s": 0.00}
}

Rules:
- Do not include fields that are not in the schema.
- Strings must not contain unescaped newlines.
- If you cannot satisfy the schema, return:
  {"answer": "", "citations": [], "tool_calls": [], "metrics": {"lambda_state":"x","delta_s":1.0}}

B. Tool protocol guard

Allowed tools:
1) "web_fetch": args = {"url": "string"}
2) "vector_search": args = {"query": "string", "k": 5}
3) "write_kv": args = {"key":"string","value":"string","ttl_sec": 600}

Contract:
- Each tool call must match the exact args shape.
- Never put narrative text into args.
- Max tool calls in one turn: 3
- Per call timeout_ms: 15000
- Retries: up to 2 with capped backoff

C. Validator stub

Step 1: parse JSON strictly. If parse fails, stop and return a fix tip.
Step 2: check extra fields. If any, reject.
Step 3: validate each tool call against the schema list.
Step 4: only after validation, run tools in order.
Step 5: log {parse_ok, tool_valid, delta_s, lambda_state}.

D. Role header clamp

System:
- Policies and schema live here only. User turns contain tasks and questions only.
- Cite then explain. Refuse to answer without citations when the task requires evidence.
- Echo the current tool schema before any tool call.

User:
- Provides task and question.

Deep diagnostics

  • Three paraphrase test Ask the same question three ways. Parse rate and tool validity should remain ≥ 0.98. If not, tighten the schema or reduce optional fields.

  • Anchor triangulation Compare ΔS to the expected anchor section and to a decoy. If ΔS is close for both, rework chunking and rebuild index. See Retrieval Playbook

  • Chain length audit If tool chains exceed safe length and entropy rises, split the plan and reconnect with a BBCR bridge. See Logic Collapse

Eval gates before ship

  • JSON parse success ≥ 0.99 on a 50 case set.
  • Tool validity ≥ 0.98 with negative cases included.
  • Coverage ≥ 0.70 and ΔS ≤ 0.45 on evidence tasks.
  • λ convergent across two seeds.
  • Live probes green for one hour with zero side effects on rejects.

🔗 Quick-Start Downloads (60 sec)

Tool Link 3-Step Setup
WFGY 1.0 PDF Engine Paper 1️⃣ Download · 2️⃣ Upload to your LLM · 3️⃣ Ask “Answer using WFGY + <your question>”
TXT OS (plain-text OS) TXTOS.txt 1️⃣ Download · 2️⃣ Paste into any LLM chat · 3️⃣ Type “hello world” — OS boots instantly

Explore More

Layer Page What it’s for
⭐ Proof WFGY Recognition Map External citations, integrations, and ecosystem proof
⚙️ Engine WFGY 1.0 Original PDF tension engine and early logic sketch (legacy reference)
⚙️ Engine WFGY 2.0 Production tension kernel for RAG and agent systems
⚙️ Engine WFGY 3.0 TXT based Singularity tension engine (131 S class set)
🗺️ Map Problem Map 1.0 Flagship 16 problem RAG failure taxonomy and fix map
🗺️ Map Problem Map 2.0 Global Debug Card for RAG and agent pipeline diagnosis
🗺️ Map Problem Map 3.0 Global AI troubleshooting atlas and failure pattern map
🧰 App TXT OS .txt semantic OS with fast bootstrap
🧰 App Blah Blah Blah Abstract and paradox Q&A built on TXT OS
🧰 App Blur Blur Blur Text to image generation with semantic control
🏡 Onboarding Starter Village Guided entry point for new users

If this repository helped, starring it improves discovery so more builders can find the docs and tools.
GitHub Repo stars