Add batch scrape tools (firecrawl_batch_scrape, firecrawl_check_batch_scrape_status)#181

Open
MaxwellCalkin wants to merge 1 commit into firecrawl:main from MaxwellCalkin:add-batch-scrape-tools
@MaxwellCalkin

Fixes #113

Problem

The firecrawl_crawl tool description references "map + batch_scrape" as a recommended workflow for large-scale scraping, but there is no batch scrape tool registered in the MCP server. The JS SDK (@mendable/firecrawl-js) already exposes startBatchScrape() and getBatchScrapeStatus() methods, but the MCP server never wired them up as tools.

Solution

Add two new tools following the same patterns as the existing crawl/check_crawl_status pair:

  • firecrawl_batch_scrape — Starts a batch scrape job for a list of URLs using client.startBatchScrape(). Accepts scrapeOptions (reuses the existing scrapeParamsSchema), webhook/webhookHeaders (disabled in safe mode), ignoreInvalidURLs, and maxConcurrency. Returns the job ID and status URL.

  • firecrawl_check_batch_scrape_status — Polls a batch scrape job using client.getBatchScrapeStatus(). Returns status, progress (completed/total), and scraped data when available.
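As a rough illustration of the two handlers described above, here is a minimal sketch. Only the method names startBatchScrape() and getBatchScrapeStatus() come from this PR; the BatchClient interface, the response shapes, and the helper names (runBatchScrape, summarizeBatchStatus, checkBatchScrapeStatus) are assumptions made for this example, and the actual code in src/index.ts may be structured differently.

```typescript
// Shapes assumed for this sketch only; the SDK's real types may differ.
interface BatchStartResponse {
  id: string;
  url: string;
}

interface BatchStatus {
  status: string;
  completed: number;
  total: number;
  data?: unknown[];
}

interface BatchClient {
  startBatchScrape(
    urls: string[],
    options?: Record<string, unknown>
  ): Promise<BatchStartResponse>;
  getBatchScrapeStatus(id: string): Promise<BatchStatus>;
}

// firecrawl_batch_scrape: start the job and return its ID and status URL.
async function runBatchScrape(
  client: BatchClient,
  urls: string[],
  options: Record<string, unknown> = {}
): Promise<string> {
  const res = await client.startBatchScrape(urls, options);
  return JSON.stringify({ id: res.id, url: res.url }, null, 2);
}

// Reduce a status response to the fields the tool reports back.
function summarizeBatchStatus(s: BatchStatus): Record<string, unknown> {
  const summary: Record<string, unknown> = {
    status: s.status,
    progress: `${s.completed}/${s.total}`,
  };
  // Include scraped documents only once some are available.
  if (s.data && s.data.length > 0) {
    summary.data = s.data;
  }
  return summary;
}

// firecrawl_check_batch_scrape_status: poll once and format the result.
async function checkBatchScrapeStatus(
  client: BatchClient,
  id: string
): Promise<string> {
  const status = await client.getBatchScrapeStatus(id);
  return JSON.stringify(summarizeBatchStatus(status), null, 2);
}
```

Keeping the status formatting in a pure function makes it easy to check without a live client, and the caller is expected to invoke the status tool repeatedly until the job reports completion.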

Also updates the firecrawl_crawl description to reference the correct tool name (firecrawl_batch_scrape instead of bare batch_scrape).

Changes

  • src/index.ts: Add two new server.addTool() calls after firecrawl_check_crawl_status. Fix crawl description references.

Testing

  • TypeScript compiles cleanly (tsc --noEmit passes with zero errors)
  • Tools follow the exact same patterns as the existing crawl and agent tools
  • Safe mode support: webhooks are gated behind !SAFE_MODE just like crawl
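The safe mode gating mentioned above could look roughly like the sketch below. The option names (webhook, webhookHeaders, ignoreInvalidURLs, maxConcurrency) are taken from this PR; the BatchScrapeArgs type, the buildBatchOptions helper, and passing safe mode as a boolean parameter are assumptions for illustration. The PR may well apply the gate at the schema level instead of when building options.

```typescript
// Argument shape assumed for this sketch; field names follow the PR text.
interface BatchScrapeArgs {
  urls: string[];
  webhook?: string;
  webhookHeaders?: Record<string, string>;
  ignoreInvalidURLs?: boolean;
  maxConcurrency?: number;
}

// Build the options object passed to the batch scrape call.
// Webhook fields are forwarded only when safe mode is off.
function buildBatchOptions(
  args: BatchScrapeArgs,
  safeMode: boolean
): Record<string, unknown> {
  const options: Record<string, unknown> = {};

  if (args.ignoreInvalidURLs !== undefined) {
    options.ignoreInvalidURLs = args.ignoreInvalidURLs;
  }
  if (args.maxConcurrency !== undefined) {
    options.maxConcurrency = args.maxConcurrency;
  }

  // Gate webhooks behind !safeMode, mirroring the crawl tool's behavior.
  if (!safeMode) {
    if (args.webhook !== undefined) {
      options.webhook = args.webhook;
    }
    if (args.webhookHeaders !== undefined) {
      options.webhookHeaders = args.webhookHeaders;
    }
  }

  return options;
}
```

With this shape, a request that supplies a webhook while safe mode is on still runs; the webhook fields are silently dropped rather than causing an error.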

AI disclosure: This PR was authored by an AI (Claude Opus 4.6, Anthropic). See maxcalkin.com/ai for details about the AI authorship project.

Fixes firecrawl#113. The crawl tool description referenced "map + batch_scrape" as a
recommended workflow, but no batch scrape tool existed in the MCP server.

Add two new tools using the JS SDK's startBatchScrape() and
getBatchScrapeStatus() methods:

- firecrawl_batch_scrape: starts a batch scrape job for a list of URLs with
  support for scrapeOptions, webhook, ignoreInvalidURLs, and maxConcurrency
- firecrawl_check_batch_scrape_status: polls job status and retrieves results

Also update the crawl tool description to reference the correct tool name
(firecrawl_batch_scrape instead of bare "batch_scrape").

AI disclosure: this PR was authored by an AI (Claude Opus 4.6, Anthropic).

Development

Successfully merging this pull request may close these issues.

Docs mention firecrawl_batch_scrape tool but MCP doesn't return it as available one.