Home
CoderHXL edited this page May 6, 2023 · 5 revisions
x-crawl is a flexible, multifunctional crawler library for Node.js. Its flexible usage and rich feature set help you crawl pages, interfaces, and files quickly, safely, and reliably.
If you like x-crawl, consider giving the x-crawl repository a star. Thank you for your support!
- 🔥 Asynchronous / Synchronous - Change the mode property to switch between asynchronous and synchronous crawling.
- ⚙️ Multiple Purposes - Crawl pages, interfaces, and files, or run polling crawls, to cover a variety of scenarios.
- 🖋️ Flexible Writing Style - The same crawling API accepts several configuration styles, each with its own strengths.
- ⏱️ Interval Crawling - Use no interval, a fixed interval, or a random interval to allow or avoid highly concurrent crawling.
- 🔄 Failed Retry - Retry failed requests so that short-term problems do not cause the crawl to fail; the number of retries is configurable.
- ➡️ Proxy Rotation - Automatically rotate proxies on failed retries, with configurable error counts and HTTP status codes.
- 👀 Device Fingerprinting - With zero configuration or custom configuration, avoid being identified and tracked across different locations through fingerprinting.
- 🚀 Priority Queue - Each crawl target can be given a priority so that it is crawled ahead of other targets.
- ☁️ Crawl SPA - Crawl SPAs (Single Page Applications) to generate pre-rendered content (also known as "SSR", Server-Side Rendering).
- ⚒️ Control Page - Submit forms, send keyboard input, trigger events, take screenshots of the page, and more.
- 🧾 Capture Record - Capture and record crawling activity, with colored output in the terminal as a reminder.
- 🦾 TypeScript - Ships with its own type definitions and provides complete typing through generics.
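
To make a few of the features above more concrete, here is a minimal conceptual sketch of how interval selection, failed retry, and priority ordering can fit together. It is not x-crawl's API or implementation: every name in it (`Interval`, `Target`, `nextDelay`, `byPriority`, `withRetry`, `crawlInOrder`, and the `fetcher` callback) is hypothetical and chosen only for illustration, and the library's real options may be named and shaped differently.

```typescript
// Conceptual sketch only. All names here are hypothetical, not x-crawl's API.

// No interval (undefined), a fixed interval (number), or a random interval (range).
type Interval = number | { min: number; max: number } | undefined

interface Target {
  url: string
  priority?: number // assumption: higher values are crawled first, default 0
}

// Pick the delay in milliseconds to wait before the next target.
function nextDelay(interval: Interval, random: () => number = Math.random): number {
  if (interval === undefined) return 0
  if (typeof interval === 'number') return interval
  return Math.floor(interval.min + random() * (interval.max - interval.min))
}

// Return a copy sorted by descending priority; equal priorities keep their order.
function byPriority(targets: Target[]): Target[] {
  return [...targets].sort((a, b) => (b.priority ?? 0) - (a.priority ?? 0))
}

// Run a task once, then retry up to maxRetry more times if it throws.
async function withRetry<T>(task: () => Promise<T>, maxRetry: number): Promise<T> {
  let lastError: unknown
  for (let attempt = 0; attempt <= maxRetry; attempt++) {
    try {
      return await task()
    } catch (error) {
      lastError = error
    }
  }
  throw lastError
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Crawl targets one after another, in priority order, pausing between them.
async function crawlInOrder<T>(
  targets: Target[],
  fetcher: (url: string) => Promise<T>,
  options: { maxRetry?: number; interval?: Interval } = {}
): Promise<T[]> {
  const results: T[] = []
  const ordered = byPriority(targets)
  for (let i = 0; i < ordered.length; i++) {
    if (i > 0) await sleep(nextDelay(options.interval))
    const url = ordered[i].url
    results.push(await withRetry(() => fetcher(url), options.maxRetry ?? 0))
  }
  return results
}
```

In this sketch, `crawlInOrder` processes one target at a time, which corresponds loosely to a synchronous mode; an asynchronous mode would start the requests without waiting for each previous one to finish. The `fetcher` callback stands in for whatever actually performs the request, so the scheduling logic can be tried out without any network access.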