
feat: add fastCRW URL parser engine #20

Open: us wants to merge 1 commit into wisupai:main from us:feat/add-fastcrw


us commented Jun 14, 2026


What

Adds fastCRW as a web scrape/search provider alongside the existing Firecrawl integration. The change is additive: it mirrors the Firecrawl wiring and leaves the Firecrawl provider untouched.

Why

fastCRW is a fully open-source web engine (AGPL, single ~8 MB Rust binary) that outperforms Firecrawl on Firecrawl's own benchmark dataset and runs 100% locally with no cloud dependency.

Runs 100% locally — anti-bot and JS rendering included in the open core.
Firecrawl's OSS self-host falls back to plain fetch/Playwright because its stealth engine (fire-engine) is gated behind a cloud-only flag — so a self-hosted Firecrawl can't reliably reach protected or JS-heavy sites. fastCRW ships Cloudflare JS-challenge handling, UA rotation, SPA rendering, BYO-proxy + rotation, and an HTTP→headless→proxy fallback ladder in the open core. One binary, no cloud, no hidden upsell.
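To make the HTTP→headless→proxy fallback ladder concrete, here is a minimal sketch of the escalation pattern. It is not fastCRW's implementation: the `Fetcher` type, `fetch_with_ladder`, and the three stand-in fetchers are hypothetical names used only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical fetcher type: takes a URL and returns page text, or raises on failure.
Fetcher = Callable[[str], str]


class AllTiersFailed(Exception):
    """Raised when every tier in the ladder fails for a URL."""


def fetch_with_ladder(url: str, tiers: Sequence[Tuple[str, Fetcher]]) -> Tuple[str, str]:
    """Try each (name, fetcher) tier in order and return (tier_name, content)
    from the first tier that succeeds."""
    errors: List[str] = []
    for name, fetch in tiers:
        try:
            return name, fetch(url)
        except Exception as exc:  # any failure escalates to the next tier
            errors.append(f"{name}: {exc}")
    raise AllTiersFailed("; ".join(errors))


# Stand-in fetchers (assumptions) that simulate a site blocking plain HTTP.
def plain_http(url: str) -> str:
    raise RuntimeError("blocked by JS challenge")


def headless(url: str) -> str:
    return f"<html>rendered {url}</html>"


def via_proxy(url: str) -> str:
    return f"<html>proxied {url}</html>"


LADDER = [("http", plain_http), ("headless", headless), ("proxy", via_proxy)]
```

With these stand-ins, `fetch_with_ladder("https://example.com", LADDER)` skips the failing plain-HTTP tier and returns the headless result; the cheapest tier is always tried first, so unprotected pages never pay the cost of a browser or proxy.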

Faster and higher recall on Firecrawl's own benchmark.
On Firecrawl's public benchmark dataset, fastCRW reaches 63.74% truth-recall vs Firecrawl's 56.04%, with lower median latency (p50 ~1.9 s vs ~2.3 s). fastCRW uses ~6 MB RAM at idle.

On web search: crw is built on top of SearXNG, not an alternative to it.
SearXNG is the metasearch aggregator underneath; crw adds a quality layer on top: query expansion (multi-variant rewrite), content-aware reranking (re-scoring by fetched content instead of SearXNG's content-blind ordering), and category routing (research queries fan out to arXiv, Semantic Scholar, and Google Scholar; code queries go to GitHub). The result is SearXNG's breadth plus a measurable accuracy layer, all open-source (AGPL) and self-hostable with configurable engines rather than a bare passthrough.
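As a rough illustration of two of these ideas, category routing and content-aware reranking, the sketch below uses deliberately naive keyword matching and term overlap. The hint lists, `route_category`, `Result`, and `rerank_by_content` are assumptions for illustration; the PR does not describe crw's actual routing rules or scoring function.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical keyword hints; the real routing rules are not described in this PR.
CATEGORY_HINTS: Dict[str, List[str]] = {
    "research": ["paper", "arxiv", "study", "survey"],
    "code": ["github", "library", "implementation", "repo"],
}


def route_category(query: str) -> str:
    """Pick the first category whose hint words appear in the query."""
    words = set(query.lower().split())
    for category, hints in CATEGORY_HINTS.items():
        if words & set(hints):
            return category
    return "general"


@dataclass
class Result:
    url: str
    content: str  # text fetched from the page, not the search-engine snippet


def rerank_by_content(query: str, results: List[Result]) -> List[Result]:
    """Re-order results by how many query terms occur in the fetched content."""
    terms = set(query.lower().split())

    def score(result: Result) -> int:
        return len(terms & set(result.content.lower().split()))

    # sorted() is stable, so ties keep the upstream ordering.
    return sorted(results, key=score, reverse=True)
```

For example, `route_category("rust http library")` returns `"code"`, and a result whose fetched content mentions the query terms is moved ahead of one that does not, regardless of the order the aggregator returned them in.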

Flat, predictable pricing. 1 credit = 1 page, with no 4× stealth surcharge and no charge for failed requests. A free tier is available at https://fastcrw.com/dashboard; a self-hosted instance is supported by setting CRW_API_KEY and CRW_BASE_URL.
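A minimal sketch of how a provider might resolve these two environment variables. Only the names `CRW_API_KEY` and `CRW_BASE_URL` come from this PR; `CrwConfig`, `load_crw_config`, and the placeholder default URL are hypothetical, and whether a self-hosted instance requires a key is an assumption left open here.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Placeholder only: the hosted API base URL is not stated in this PR.
PLACEHOLDER_HOSTED_URL = "https://hosted.example.invalid"


@dataclass(frozen=True)
class CrwConfig:
    api_key: Optional[str]
    base_url: str


def load_crw_config(env: Mapping[str, str] = os.environ) -> CrwConfig:
    """Read CRW_API_KEY and CRW_BASE_URL, falling back to the placeholder URL."""
    base_url = env.get("CRW_BASE_URL", PLACEHOLDER_HOSTED_URL).rstrip("/")
    api_key = env.get("CRW_API_KEY") or None  # treat an empty string as unset
    return CrwConfig(api_key=api_key, base_url=base_url)
```

Passing the environment as a mapping keeps the function easy to test: `load_crw_config({"CRW_BASE_URL": "http://localhost:3000/"})` yields a config pointing at the local instance with no key set.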


Because fastCRW is API-compatible with Firecrawl, the integration is a small additive diff — the Firecrawl provider is untouched. I maintain the integration and can provide free credits to evaluate. Happy to adjust to your conventions.
