@abhijayrajvansh abhijayrajvansh commented Sep 15, 2025

Summary

  • Adds comprehensive Firecrawl “before vs. after” documentation with photo proofs under docs/firecrawl/.
  • Explains how Firecrawl is wired in the app (toggle, API flow, enrichment, fallbacks) and when to enable it.
  • Demonstrates quality improvements on two real queries with side‑by‑side evidence and server logs.
  • Notes latency trade‑offs and provides practical operational guidance.

Docs Added

  • docs/firecrawl/before-and-after.md — detailed analysis with logs and inline references to screenshots

Before vs. After Highlights

  1. Apple September event 2025 highlights
  • Before (OFF): generic narrative with sparse feature details and fewer grounded sections.
    Screenshots: firecrawl-off-logs, firecrawl-off-less-info-ans
  • After (ON): structured sections with concrete facts that are available only by reading the pages (e.g., iPhone Air $999, AirPods Pro 3 live translation, preorder windows), with aligned citations.
    Screenshots: firecrawl-on-better-detailed-ans, firecrawl-on-logs
  2. Personal content discovery: “tell me about his blogs”
  • Before (OFF): claims “no specific information available about blogs.”
    Screenshot: firecrawl-off-poor-context-about-blogs
  • After (ON): correctly identifies the blog and summarizes two articles with titles, dates, and summaries.
    Screenshot: firecrawl-on-better-context-about-the-blogs

How It Works (Code Map)

  • Toggle: src/components/FirecrawlToggle.tsx persists the setting in localStorage.useFirecrawl.
  • Client → Server: src/lib/hooks/useChat.tsx includes the flag in the request body; the API routes log the useFirecrawl flag and whether FIRECRAWL_API_KEY is available in the environment.
  • Enrichment: src/lib/search/metaSearchAgent.ts calls Firecrawl for the top SearXNG URLs; on success it passes the scraped markdown to the retriever, and on failure it falls back to the raw links.
  • Retrieval: src/lib/utils/documents.ts tries Firecrawl first for each link; if that yields nothing, it falls back and annotates the logs ("no data, falling back …").
  • Firecrawl client: src/lib/firecrawl.ts wraps @mendable/firecrawl-js.
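
For reviewers who want the enrichment and fallback behaviour in one place, here is a minimal sketch of the pattern described above. It is not the code in metaSearchAgent.ts: the SearchResult and EnrichedDoc shapes, the Scraper type (a stand-in for the wrapper in src/lib/firecrawl.ts), the enrichResults name, and the topN default of 3 are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the real types in the repo may differ.
type SearchResult = { url: string; title: string; snippet: string };
type EnrichedDoc = { url: string; content: string; source: "firecrawl" | "fallback" };

// Stand-in for the Firecrawl wrapper: resolves to markdown,
// or to null when the page yielded no data.
type Scraper = (url: string) => Promise<string | null>;

async function enrichResults(
  results: SearchResult[],
  scrape: Scraper,
  topN = 3, // assumed cutoff for "top" URLs
): Promise<EnrichedDoc[]> {
  const pending = results.map(async (r, i): Promise<EnrichedDoc> => {
    // The fallback keeps the original search result untouched.
    const fallback: EnrichedDoc = { url: r.url, content: r.snippet, source: "fallback" };
    if (i >= topN) return fallback; // only the top results are crawled
    try {
      const markdown = await scrape(r.url);
      // Empty or missing content is treated the same as a failure.
      if (!markdown) return fallback;
      return { url: r.url, content: markdown, source: "firecrawl" };
    } catch {
      // Blocked or failing sites never break the overall search.
      return fallback;
    }
  });
  // Each scrape settles independently, so one bad URL cannot reject the batch.
  return Promise.all(pending);
}
```

Because every per-URL promise catches its own errors, a blocked site degrades a single result to its raw link rather than failing the whole request.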

Env requirement: set FIRECRAWL_API_KEY (kept out of VCS). The UI toggle ensures it’s opt‑in per user/session.
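
As a rough illustration of how the flag and the env requirement combine, the sketch below enables Firecrawl only when the request flag is true and a key is present. The decideFirecrawl helper and the FirecrawlDecision type are hypothetical names, not code from this PR; the log line mirrors the format quoted in the Testing section.

```typescript
// Hypothetical helper; the API routes in this PR may structure this differently.
type FirecrawlDecision = { enabled: boolean; logLine: string };

function decideFirecrawl(
  useFirecrawlFlag: unknown,
  apiKey: string | undefined,
): FirecrawlDecision {
  // Anything other than a literal true counts as "off", so the feature stays opt-in.
  const flag = useFirecrawlFlag === true;
  // A missing or blank key means Firecrawl cannot be used, regardless of the flag.
  const envAvailable = typeof apiKey === "string" && apiKey.trim().length > 0;
  return {
    enabled: flag && envAvailable,
    logLine: `useFirecrawl flag: ${flag} env: ${envAvailable}`,
  };
}
```

In an API route this might be called as decideFirecrawl(body.useFirecrawl, process.env.FIRECRAWL_API_KEY), logging logLine and branching on enabled.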


Performance & Trade‑offs

  • Quality: noticeably better answers for news, product launches, and personal/blog content.
  • Latency: live crawling increases wall‑clock time (about 1.3–1.8× on cold runs). This can be mitigated with caching and selective enablement.
  • Safety: when sites block requests or return empty content, we fall back to the original links (see the logs screenshot showing the fallback lines).
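
The caching mitigation mentioned above is not part of this PR. One possible shape, assuming an in-memory cache keyed by URL with a fixed time-to-live, is sketched below; the TtlCache class and its injectable clock are hypothetical and exist only to show the idea.

```typescript
// Hypothetical in-memory cache for scraped markdown, keyed by URL.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (this.now() >= hit.expiresAt) {
      // Expired entries are dropped so the next call re-crawls the page.
      this.entries.delete(key);
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

With something like this in front of the scraper, repeated queries that hit the same URLs within the TTL would skip the live crawl, so only cold runs pay the extra latency.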

Testing

  • Local: set FIRECRAWL_API_KEY in .env.local, enable the UI toggle, and run the example queries above.
  • Look for server logs containing:
    • OFF: "useFirecrawl flag: false env: true"
    • ON: "useFirecrawl flag: true env: true" and "enriching SearXNG URLs with Firecrawl", followed by "scraping ..." lines and, where a page yields nothing, "no data, falling back" lines.

Screenshots (inline)

  • Apple highlights — Before vs After: see images under docs/firecrawl/ referenced above.
  • Blog discovery — Before vs After: see images under docs/firecrawl/ referenced above.

Checklist

  • No secrets committed; API key remains in env only.
  • Docs co‑located under docs/firecrawl/.
  • Feature is optional (UI toggle) and disabled by default.
  • Lint/format pass locally.

Notes for Maintainers

  • This keeps Firecrawl usage strictly opt‑in to avoid surprising latency or costs.
  • Happy to iterate on the copy, structure, or placement of the docs.
  • If you prefer to land only the docs changes, we can split the PR or rebase as requested.
