Feature Comparison

Elastic also offers the cloud-based Elastic Crawler, with similar features. The following table compares the features of Open Crawler to those of Elastic Crawler (cloud).

| Feature | Open Crawler | Elastic Crawler (cloud) |
| --- | --- | --- |
| Interface | CLI | GUI (Kibana) |
| Hosting | Self-hosted | Elastic Cloud or self-hosted Enterprise Search |
| State management | Stateless | Elasticsearch indices |
| Compatible with Elasticsearch Serverless | Yes | No |
| Unrestricted index naming | Yes | No |
| Indexing using _bulk API | Yes | No |
| Purge crawls | Yes | Yes |
| Ingest pipelines | Yes | Yes |
| Binary content extraction | Yes | Yes |
| Crawl rules: allow/disallow specified URLs | Yes | Yes |
| Extraction rules: extraction using CSS, XPath, and URL selectors | Yes | Yes |
| Crawler directives: robots.txt, sitemaps, robots meta tags, canonical URLs, nofollow links | Yes | Yes |
| Scheduling | Yes | Yes |
| Extraction using data attributes and meta tags | Yes | Yes |
| Full HTML extraction | Yes | Yes |
| Event logging in Elasticsearch | Yes | Yes |
| Duplicate content handling | No | Yes |
| Crawl result history and metadata | No | Yes |