locoholy/searxng-tor-skill

searxng-tor-skill

This skill routes agent web searches through Tor to reduce IP and query linkability. It does not make your agent anonymous.

A direct response to the gap described in Vitalik Buterin's April 2026 post about secure local LLM setups: a search skill that routes traffic through Tor, so that the websites you visit and the search engine you use can't easily link your queries into a behavioral profile.

What It Does

| Tool | Input | Output |
| --- | --- | --- |
| tor-search | a search query | JSON array of results from SearXNG |
| tor-content | a URL | JSON with readable plain text, extracted via Readability |

With the bundled setup, all internet-facing traffic stays Tor-routed: content.js goes through a local SOCKS proxy published by Docker on 127.0.0.1:9050, and the self-hosted SearXNG instance sends its upstream search traffic through the bundled Tor container. The local hop from search.js to SEARXNG_URL=http://localhost:8080 stays on loopback.

What It Does Not Do

  • No promise of anonymity stronger than "better metadata hygiene than direct web access"
  • No JavaScript rendering
  • No browser automation
  • No crawling (depth > 1)
  • No PDF support
  • No file downloads
  • Does not route .onion addresses (use a Tor browser for that)

Quick Start

1. Install runtime dependencies

# macOS
brew install node torsocks   # torsocks is optional unless you want the Linux/BSD-style wrappers

# Debian/Ubuntu
sudo apt install nodejs npm torsocks

Also install Docker / Docker Compose. The bundled docker-compose.yml starts both SearXNG and a Tor SOCKS proxy.

./bin/tor-search and ./bin/tor-content are Linux/BSD-first wrappers around torsocks --isolate. On macOS or Windows, prefer the documented SOCKS_URL=socks5h://... path unless you have already confirmed that torsocks works correctly in your environment.

2. Start a local SearXNG instance

docker compose up -d

The bundled Tor proxy may need a short bootstrap window on first start. If an early request times out, wait 15-60 seconds and try again.

The bundled docker-compose.yml and searxng-config/settings.yml:

  • enable format=json for the local SearXNG API;
  • start a local Tor SOCKS proxy on 127.0.0.1:9050;
  • force SearXNG's outbound search traffic through the bundled Tor service at tor:9050.

If port 9050 is already in use on your machine, stop the conflicting local Tor service or change the published port and adjust SOCKS_URL / torsocks config accordingly.

Verify:

curl "http://localhost:8080/search?q=test&format=json" | head -c 200

Topology smoke-check:

docker compose logs searxng | tail -n 50

With the bundled config, SearXNG is configured with outgoing.using_tor_proxy: true. If the bundled Tor container is unavailable, queries should fail closed instead of silently searching directly.

3. Install Node dependencies

npm install

4. Search

For the bundled local self-hosted topology, the preferred search path is a direct local hop to SearXNG:

SEARXNG_URL=http://localhost:8080 node search.js "ethereum consensus"

5. Fetch a page

Linux/BSD wrapper path:

./bin/tor-content "https://ethereum.org/en/what-is-ethereum/"

macOS/Windows or no working torsocks:

SEARXNG_URL=http://localhost:8080 SOCKS_URL=socks5h://127.0.0.1:9050 node content.js "https://ethereum.org/en/what-is-ethereum/"

Development mode (macOS/Windows or no working torsocks)

If you want to exercise the explicit SOCKS transport path directly (or point at a remote SearXNG instance), set SOCKS_URL instead of using torsocks:

SEARXNG_URL=http://localhost:8080 SOCKS_URL=socks5h://127.0.0.1:9050 node search.js "ethereum"

This skips torsocks but still resolves DNS through the proxy (the h in socks5h). Stream isolation is maintained by sending a unique SOCKS username/password pair with each request. It is a development fallback, not the primary production path for remote fetches.
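
As a rough illustration of per-request credentials, the sketch below attaches a fresh random username and password to a SOCKS URL. The helper name `withIsolationCredentials` is hypothetical and is not taken from this repository. The sketch relies on Tor's `IsolateSOCKSAuth` behavior (on by default for a SocksPort), under which streams opened with different SOCKS credentials do not share a circuit.

```javascript
const { randomBytes } = require('node:crypto');

// Hypothetical helper (not from this repo): returns a copy of the SOCKS URL
// carrying fresh random credentials. With Tor's IsolateSOCKSAuth (default on),
// streams opened under different credentials are kept on separate circuits.
function withIsolationCredentials(socksUrl) {
  const url = new URL(socksUrl);
  url.username = randomBytes(8).toString('hex'); // 16 hex characters
  url.password = randomBytes(8).toString('hex'); // 16 hex characters
  return url.toString();
}

// Each call yields a different credential pair for the same proxy address.
console.log(withIsolationCredentials('socks5h://127.0.0.1:9050'));
```

The credentials are never checked by Tor; they only act as an isolation key, so random values are sufficient.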

Dry-run (no network at all)

node search.js "anything" --dry-run
node content.js "https://example.com" --dry-run

Reads from test/fixtures/ for parsing and output format validation.

CLI Reference

tor-search / search.js

search.js "query" [-n N] [--category C] [--time-range T] [--lang L] [--dry-run] [--no-wait]

  -n N          number of results (default: 5, max: 10)
  --category    general | news | science | it (default: general)
  --time-range  day | month | year
  --lang        language code (default: en-US)
  --dry-run     use fixture, no network
  --no-wait     exit 7 instead of sleeping on rate limit

Output: JSON array of { title, url, snippet, engine, category }.
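
A caller may want to check that shape before trusting the output. The following is a minimal sketch, assuming the tool's stdout has been captured as a string; `parseSearchOutput` is a hypothetical name, and the check only verifies that the five documented fields are present, not their types.

```javascript
// The five fields documented for each search result.
const RESULT_FIELDS = ['title', 'url', 'snippet', 'engine', 'category'];

// Hypothetical consumer-side helper (not part of the skill): parses captured
// stdout and keeps only the documented fields of each result.
function parseSearchOutput(stdout) {
  const parsed = JSON.parse(stdout);
  if (!Array.isArray(parsed)) {
    throw new TypeError('expected a JSON array of results');
  }
  return parsed.map((item, index) => {
    const picked = {};
    for (const field of RESULT_FIELDS) {
      // Object(item) keeps the `in` check safe for null or primitive entries.
      if (!(field in Object(item))) {
        throw new TypeError(`result ${index} is missing "${field}"`);
      }
      picked[field] = item[field];
    }
    return picked;
  });
}
```

Dropping undocumented keys keeps anything unexpected in the upstream response from flowing into the agent's context.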

tor-content / content.js

content.js "https://..." [--dry-run] [--no-wait]

Output: JSON { title, url, text } — plain text only, never raw HTML.

Transport Modes

| Mode | How to invoke | DNS leak | Per-request isolation | Recommended for |
| --- | --- | --- | --- | --- |
| local self-hosted hop | SEARXNG_URL=http://localhost:8080 node search.js "query" | None (loopback only) | N/A | Preferred search path for bundled SearXNG |
| torsocks --isolate | ./bin/tor-content | None (LD_PRELOAD hook) | Per-process (one action per call) | Production page-fetch path (Linux/BSD) |
| socks5h:// | SOCKS_URL=... node content.js | None (remote DNS) | Per-request via unique user:pass | Development (macOS/Windows) |

Important:

  • The bundled self-hosted SearXNG setup routes its own outbound search traffic through the bundled Tor container; search.js talks to that local instance over loopback.
  • torsocks --isolate gives process-level isolation. One invocation = one Tor circuit.
  • Keep unrelated research topics in separate invocations.

Hard Limits (fixed in code, not configurable)

| Limit | Value |
| --- | --- |
| Request timeout | 15 seconds |
| Response body cap (search) | 512 KiB |
| Response body cap (content) | 2 MiB |
| Max redirects | 5 |
| Extracted text cap | 8000 characters |
| Max results returned | 10 |
| Min gap between invocations | 1000 ms |
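
To make the limits concrete, here is a small sketch of how such caps could be applied. The values come from the table above and the default of 5 results from the CLI reference; the constant and function names are hypothetical and do not reflect how the skill is actually implemented. The lower bound of 1 result and the use of UTF-16 code units for "characters" are assumptions.

```javascript
// Values copied from the limits table; all names here are illustrative only.
const LIMITS = Object.freeze({
  requestTimeoutMs: 15_000,
  searchBodyCapBytes: 512 * 1024,
  contentBodyCapBytes: 2 * 1024 * 1024,
  maxRedirects: 5,
  extractedTextCapChars: 8000,
  maxResults: 10,
  minGapMs: 1000,
});
const DEFAULT_RESULTS = 5; // documented default for -n

// Clamp a requested result count into [1, maxResults]; fall back to the default
// when the input is not a finite number. The lower bound of 1 is an assumption.
function clampResultCount(requested) {
  if (!Number.isFinite(requested)) return DEFAULT_RESULTS;
  return Math.min(Math.max(1, Math.trunc(requested)), LIMITS.maxResults);
}

// Cap extracted text. Assumption: "characters" is counted in UTF-16 code units.
function truncateText(text) {
  return text.slice(0, LIMITS.extractedTextCapChars);
}

// Milliseconds still to wait before the next invocation is allowed.
function remainingGapMs(lastInvocationMs, nowMs) {
  return Math.max(0, LIMITS.minGapMs - (nowMs - lastInvocationMs));
}
```

With `--no-wait`, a non-zero remaining gap would correspond to exiting with code 7 rather than sleeping.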

URL Policy

content.js enforces a strict allow/deny policy before and after every redirect:

  • Allowed: http:// and https:// to public addresses
  • Blocked: localhost, 127.0.0.0/8, ::1, RFC1918 (10/8, 172.16/12, 192.168/16), link-local, ULA, multicast, .local, .internal, .onion
  • Adversarial encodings are caught: decimal IP (2130706433), hex, IPv4-mapped IPv6 (::ffff:127.0.0.1), trailing dots
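
The policy above can be approximated with Node's built-in URL parser and `net.BlockList`. This is a simplified sketch, not the repository's actual implementation: `isAllowedUrl` is a hypothetical name, and the check looks only at the literal hostname. The WHATWG URL parser already normalizes decimal and hex IPv4 hosts for http(s) URLs, so the sketch only has to unwrap IPv4-mapped IPv6 and strip trailing dots itself.

```javascript
const net = require('node:net');

// Address ranges named in the policy: loopback, RFC1918, link-local, ULA, multicast.
const blocked = new net.BlockList();
for (const [prefix, bits] of [
  ['127.0.0.0', 8], ['10.0.0.0', 8], ['172.16.0.0', 12],
  ['192.168.0.0', 16], ['169.254.0.0', 16], ['224.0.0.0', 4],
]) blocked.addSubnet(prefix, bits, 'ipv4');
for (const [prefix, bits] of [
  ['::1', 128], ['fe80::', 10], ['fc00::', 7], ['ff00::', 8],
]) blocked.addSubnet(prefix, bits, 'ipv6');

const BLOCKED_SUFFIXES = ['.local', '.internal', '.onion'];

// Hypothetical policy check: true only for http(s) URLs whose literal host is
// not a blocked address or blocked name.
function isAllowedUrl(input) {
  let url;
  try { url = new URL(input); } catch { return false; }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return false;

  // Drop IPv6 brackets and trailing dots so "localhost." matches "localhost".
  let host = url.hostname.toLowerCase().replace(/^\[|\]$/g, '').replace(/\.+$/, '');
  if (host === '') return false;

  // Unwrap IPv4-mapped IPv6, which the parser serializes as ::ffff:7f00:1.
  const mapped = /^::ffff:([0-9a-f]{1,4}):([0-9a-f]{1,4})$/.exec(host);
  if (mapped) {
    const hi = parseInt(mapped[1], 16);
    const lo = parseInt(mapped[2], 16);
    host = `${hi >> 8}.${hi & 255}.${lo >> 8}.${lo & 255}`;
  }

  const family = net.isIP(host);
  if (family === 4) return !blocked.check(host, 'ipv4');
  if (family === 6) return !blocked.check(host, 'ipv6');
  if (host === 'localhost') return false;
  return !BLOCKED_SUFFIXES.some((suffix) => host.endsWith(suffix));
}
```

To match the "before and after every redirect" rule, a check like this would have to run on the initial URL and again on each redirect target rather than once up front.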

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Success |
| 2 | Bad arguments or missing required input |
| 3 | URL rejected by policy (SSRF attempt, disallowed scheme, blocked redirect) |
| 4 | Transport unavailable (Tor not running, SOCKS refused) |
| 5 | Upstream error (non-2xx, timeout, body too large, invalid JSON) |
| 6 | Content not parseable (unsupported charset, unreadable body) |
| 7 | Rate limit gap not elapsed (with --no-wait) |
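
An agent wrapper might translate these codes into a structured result. The sketch below is one possible mapping: `classifyExit` is a hypothetical name, and treating codes 4, 5, and 7 as possibly transient is an assumption made for illustration, not something the skill specifies.

```javascript
// Meanings copied from the exit code table.
const EXIT_CODES = Object.freeze({
  0: 'success',
  2: 'bad arguments or missing required input',
  3: 'URL rejected by policy',
  4: 'transport unavailable',
  5: 'upstream error',
  6: 'content not parseable',
  7: 'rate limit gap not elapsed',
});

// Assumption: these failures may clear up on a later attempt (Tor still
// bootstrapping, upstream timeout, rate-limit gap not yet elapsed).
const POSSIBLY_TRANSIENT = new Set([4, 5, 7]);

// Hypothetical helper: turns a process exit code into a structured summary.
function classifyExit(code) {
  return {
    code,
    ok: code === 0,
    meaning: EXIT_CODES[code] ?? 'unknown exit code',
    retryable: POSSIBLY_TRANSIENT.has(code),
  };
}
```

Codes 2, 3, and 6 are treated as non-retryable here because repeating the same arguments, URL, or page would be expected to fail the same way.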

SearXNG: Self-Hosted vs Public Instances

| | Self-hosted | Public instance |
| --- | --- | --- |
| Trust | Only yourself | Instance operator |
| Tor compatibility | Full | Often degraded (exit nodes banned by Google/Bing) |
| format=json | Enabled (see searxng-config/settings.yml) | Often disabled (returns 403) |
| Recommended for | Production | Dev/best-effort only |

A self-hosted instance is strongly recommended. See docker-compose.yml and searxng-config/settings.yml.

Threat Model

See docs/design-notes.md for full rationale. Summary:

What Tor transport reduces:

  • IP visibility to the search provider and destination sites
  • Query linkability across requests (each invocation = separate circuit)
  • DNS leakage (DNS resolves on the exit node, not your machine)
  • Direct egress from the self-hosted SearXNG instance, when using the bundled compose/settings

What Tor transport does not fix:

  • Prompt injection from hostile web pages (see SKILL.md security contract)
  • SSRF attacks (blocked by URL policy)
  • Behavioral fingerprinting based on request patterns or timing
  • Trust in the exit node operator for non-HTTPS content

JSDOM / Readability guarantees:

  • No JavaScript execution from fetched pages
  • No subresource loading (images, stylesheets, frames) — blocked by default JSDOM behavior plus BlockingLoader
  • Only article.textContent is returned — never raw HTML or innerHTML

Running Tests

npm test

No local LLM required. No Tor daemon required. No network access.

Compatibility

Designed for the pi-skills skill format. Compatible with any agent that reads SKILL.md and invokes CLI tools.
