| 1 | +# Web Search |
| 2 | + |
| 3 | +Search the web for: $ARGUMENTS |
| 4 | + |
| 5 | +## Command |
| 6 | + |
| 7 | +Choose a short, descriptive filename based on the query (e.g., `ai-chip-news`, `crispr-off-target`). Use lowercase with hyphens, no spaces. |
| 8 | + |
| 9 | +```bash |
| 10 | +uv run --with exa-py python "$SKILL_PATH/scripts/exa_search.py" "$ARGUMENTS" \ |
| 11 | + --text --highlights \ |
| 12 | + -o "$FILENAME.json" |
| 13 | +``` |
| 14 | + |
| 15 | +`$SKILL_PATH` is the path to this skill directory, and `$FILENAME` is the name you chose above. The `-o` flag saves the full results to a JSON file so follow-up questions can reuse them without re-querying. |
| 16 | + |
| 17 | +**Search type selection** — `--type` controls retrieval mode: |
| 18 | + |
| 19 | +| Mode | When to use | |
| 20 | +|---|---| |
| 21 | +| `auto` (default) | Exa's general-purpose search. Use this unless you have a reason not to. | |
| 22 | +| `fast` | Lowest latency. Use for simple lookups where speed matters more than nuance. | |
| 23 | +| `deep` | Slowest but highest quality. Use for hard, conceptual, or exhaustive research queries where recall matters more than latency. | |
| 24 | + |
| 25 | +**Content modes** — add any combination: |
| 26 | + |
| 27 | +- `--text` returns full-text content per result |
| 28 | +- `--highlights` returns the most relevant passages (good signal-to-noise, lower token cost than full text) |
| 29 | + |
| 30 | +Default to `--highlights` for broad searches (cheaper, more skimmable). Add `--text` only when you need to quote or extract in detail. |
| 31 | + |
| 32 | +**Filtering options** — Exa supports rich filtering via the SDK: |
| 33 | + |
| 34 | +- `--start-published-date YYYY-MM-DD` / `--end-published-date YYYY-MM-DD` for time-sensitive queries |
| 35 | +- `--include-domains domain1.com,domain2.com` to restrict to an allowlist |
| 36 | +- `--exclude-domains spam.com,low-quality.com` to drop a blocklist |
| 37 | +- `--category "research paper"` to bias toward scholarly content (also: `company`, `news`, `github`, `personal site`, `financial report`, `people`) |
| 38 | +- `--user-location US` for locale-specific results |
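A minimal sketch of assembling the command from the options listed above, in Python. The helper name `build_search_command` and its keyword arguments are hypothetical; the flags are the ones this document lists, and whether `exa_search.py` accepts them exactly as written is an assumption to check against the script.

```python
def build_search_command(skill_path, query, filename, *, search_type=None,
                         category=None, include_domains=None,
                         start_published_date=None, text=False,
                         highlights=True):
    """Return the argv list for one search (hypothetical helper).

    Only flags named in this document are emitted. Highlights are on by
    default and full text is opt-in, following the guidance above.
    """
    cmd = ["uv", "run", "--with", "exa-py", "python",
           f"{skill_path}/scripts/exa_search.py", query]
    if search_type:
        cmd += ["--type", search_type]
    if category:
        cmd += ["--category", category]
    if include_domains:
        # The script is documented as taking a comma-separated list.
        cmd += ["--include-domains", ",".join(include_domains)]
    if start_published_date:
        cmd += ["--start-published-date", start_published_date]
    if text:
        cmd.append("--text")
    if highlights:
        cmd.append("--highlights")
    cmd += ["-o", f"{filename}.json"]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; building a list rather than a shell string avoids quoting problems when the query contains spaces or quotes.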
| 39 | + |
| 40 | +## Academic source strategy |
| 41 | + |
| 42 | +For scientific or technical queries, Exa has two strong levers: |
| 43 | + |
| 44 | +### 1. Use `--category "research paper"` |
| 45 | + |
| 46 | +```bash |
| 47 | +uv run --with exa-py python "$SKILL_PATH/scripts/exa_search.py" "$ARGUMENTS" \ |
| 48 | + --category "research paper" \ |
| 49 | + --text --highlights \ |
| 50 | + -o "$FILENAME-academic.json" |
| 51 | +``` |
| 52 | + |
| 53 | +This biases retrieval toward papers indexed as scholarly content (journals, preprint servers, conference proceedings) rather than blogs or news coverage. |
| 54 | + |
| 55 | +### 2. Restrict to scholarly domains |
| 56 | + |
| 57 | +For stricter academic filtering, combine the category with an explicit domain allowlist: |
| 58 | + |
| 59 | +```bash |
| 60 | +uv run --with exa-py python "$SKILL_PATH/scripts/exa_search.py" "$ARGUMENTS" \ |
| 61 | + --category "research paper" \ |
| 62 | + --include-domains "arxiv.org,biorxiv.org,medrxiv.org,pubmed.ncbi.nlm.nih.gov,nature.com,science.org" \ |
| 63 | + --text --highlights \ |
| 64 | + -o "$FILENAME-academic.json" |
| 65 | +``` |
| 66 | + |
| 67 | +### Two-pass pattern for comprehensive coverage |
| 68 | + |
| 69 | +Run **both** an academic-focused search and an unrestricted one, then merge with academic sources first: |
| 70 | + |
| 71 | +1. Academic pass: `--category "research paper"` with the scholarly domain allowlist above. |
| 72 | +2. General pass: the standard command without `--category` or `--include-domains`, to catch relevant non-academic sources (news coverage, lab blogs, institutional pages). |
| 73 | + |
| 74 | +If the query is clearly non-scientific, skip the academic pass and run only the general one. |
| 75 | + |
| 76 | +**When to use the two-pass pattern:** Any query involving scientific claims, medical information, research findings, technical mechanisms, statistical data, or anything where primary literature would be more reliable than secondary reporting. |
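The merge step could be sketched as follows. This assumes each saved file is either a bare JSON list of results or an object with a top-level `results` list, which is a guess about what `exa_search.py` writes; the `source_type` tag and both function names are hypothetical additions for illustration.

```python
import json
from pathlib import Path


def load_results(path):
    """Load a saved search file and return its list of result dicts.

    Assumption: the file is a bare JSON list or an object with a
    top-level "results" list. Adjust to the script's real output.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return data.get("results", [])
    return data


def merge_passes(academic, general):
    """Merge two result lists, academic first, dropping duplicate URLs."""
    merged, seen = [], set()
    for source_type, results in (("academic", academic), ("other", general)):
        for result in results:
            url = result.get("url")
            if not url or url in seen:
                continue  # skip entries without a URL and repeats
            seen.add(url)
            # "source_type" is a hypothetical tag used later for grouping.
            merged.append({**result, "source_type": source_type})
    return merged
```

Because the academic list is processed first, a paper returned by both passes keeps its academic tag and appears only once.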
| 77 | + |
| 78 | +## Parsing results |
| 79 | + |
| 80 | +Parse the JSON output. Each result includes: |
| 81 | + |
| 82 | +- `title`, `url`, `published_date`, `author` |
| 83 | +- `score` — Exa's relevance score for the query |
| 84 | +- `text` (if `--text`), `highlights` + `highlight_scores` (if `--highlights`) |
| 85 | + |
| 86 | +**Snippet fallback** — any combination of content fields may be present. Cascade through them: prefer `highlights` (tight, pre-selected passages), fall back to a truncated slice of `text`. Never assume exactly one is present. |
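The cascade described above might look like this in Python. It is a sketch that assumes `highlights` is a list of strings and `text` is a string, as the field list suggests; the function name `pick_snippet` and the 500-character cutoff are arbitrary choices.

```python
def pick_snippet(result, max_chars=500):
    """Return the best available snippet for one result dict.

    Prefers highlights; falls back to a truncated slice of text; returns
    an empty string when neither field is present.
    """
    highlights = [h.strip() for h in (result.get("highlights") or [])
                  if h and h.strip()]
    if highlights:
        return " ... ".join(highlights)
    text = (result.get("text") or "").strip()
    if len(text) > max_chars:
        return text[:max_chars].rstrip() + "..."
    return text
```

Using `result.get(...) or ...` covers both a missing key and an explicit `null`, so the function never assumes a particular field is present.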
| 87 | + |
| 88 | +## Response format |
| 89 | + |
| 90 | +**CRITICAL: Every claim must have an inline citation.** Use markdown links whose URLs come only from the JSON output. Never invent or guess URLs. |
| 91 | + |
| 92 | +For academic sources, use author-year citation style where metadata is available: |
| 93 | +- Academic: [Smith et al., 2025](url) or [Smith & Jones, 2024](url) |
| 94 | +- Non-academic: [Source Title](url) |
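As a rough illustration of the two citation styles, the sketch below builds the link label from a result's metadata. It assumes `author` is a single string of full names separated by commas, semicolons, `&`, or "and", and that `published_date` begins with a four-digit year; neither format is confirmed by this document, so treat `format_citation` as hypothetical and fall back to the title whenever the metadata looks off.

```python
import re


def format_citation(result):
    """Return a markdown link in author-year style, else title style."""
    url = result["url"]
    year = (result.get("published_date") or "")[:4]
    # Assumption: authors are full names in one delimited string.
    names = [n.strip()
             for n in re.split(r",|;|&|\band\b", result.get("author") or "")
             if n.strip()]
    surnames = [n.split()[-1] for n in names]
    if surnames and len(year) == 4 and year.isdigit():
        if len(surnames) == 1:
            label = surnames[0]
        elif len(surnames) == 2:
            label = f"{surnames[0]} & {surnames[1]}"
        else:
            label = f"{surnames[0]} et al."
        return f"[{label}, {year}]({url})"
    # Missing or unusable metadata: use the non-academic style.
    return f"[{result.get('title') or url}]({url})"
```

The URL is always taken from the result itself, never constructed, in keeping with the rule against guessed links.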
| 95 | + |
| 96 | +Synthesize a response that: |
| 97 | +- Leads with findings from peer-reviewed or preprint sources when available |
| 98 | +- Clearly distinguishes between claims backed by primary research vs. secondary reporting |
| 99 | +- Includes specific facts, names, numbers, dates |
| 100 | +- Cites every fact inline — do not leave any claim uncited |
| 101 | +- Organizes by theme if multiple topics |
| 102 | +- Notes the evidence quality (e.g., "a randomized controlled trial found..." vs. "a blog post reports...") |
| 103 | + |
| 104 | +**End with a Sources section** listing every URL referenced, grouped by type: |
| 105 | + |
| 106 | +``` |
| 107 | +Sources: |
| 108 | + |
| 109 | +Academic / Peer-reviewed: |
| 110 | +- [Smith et al., 2025 — Title of Paper](https://doi.org/...) (Nature, 2025) |
| 111 | +- [Jones & Lee, 2024 — Title of Paper](https://arxiv.org/...) (arXiv preprint) |
| 112 | + |
| 113 | +Other: |
| 114 | +- [Source Title](https://example.com/article) (Feb 2026) |
| 115 | +``` |
| 116 | + |
| 117 | +This Sources section is mandatory. Do not omit it. If no academic sources were found, note that and explain why (e.g., the topic is too recent, not yet studied, or inherently non-academic). |
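One way to assemble the grouped Sources block is sketched below. The input shape (a list of dicts with `label`, `url`, and a boolean `academic`) and the function name `render_sources` are assumptions made for this example, not something the search script produces.

```python
def render_sources(entries):
    """Render the grouped Sources section as plain text.

    entries: dicts with 'label', 'url', and 'academic' (bool); this
    shape is hypothetical and supplied by the caller.
    """
    academic = [e for e in entries if e["academic"]]
    other = [e for e in entries if not e["academic"]]
    lines = ["Sources:", "", "Academic / Peer-reviewed:"]
    if academic:
        lines += [f"- [{e['label']}]({e['url']})" for e in academic]
    else:
        # The section stays in place; the writer should explain the gap.
        lines.append("- None found.")
    if other:
        lines += ["", "Other:"]
        lines += [f"- [{e['label']}]({e['url']})" for e in other]
    return "\n".join(lines)
```

Keeping the academic heading even when the list is empty leaves a visible place to explain why no academic sources turned up.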
| 118 | + |
| 119 | +After the Sources section, mention the output file path (`$FILENAME.json`, plus `$FILENAME-academic.json` if you ran the academic pass) so the user knows the results are available for follow-up questions. |