
Commit 7bea771

Update: Docs

1 parent a7d0a7c commit 7bea771

File tree

5 files changed: +31 −38 lines changed

README.md

+9 −8
@@ -2,19 +2,20 @@
 
 English | [简体中文](https://github.com/coder-hxl/x-crawl/blob/main/docs/cn.md)
 
-x-crawl is a flexible nodejs crawler library. It is used to batch crawl data, network requests and download file resources. Support crawling data asynchronously or synchronously. Since it runs on nodejs, it is friendly to JS/TS developers.
+x-crawl is a flexible nodejs crawler library. It is used to crawl pages, make batch network requests, and batch-download file resources. It offers 5 ways of writing requestConfig, 3 ways of getting results, and asynchronous or synchronous crawling modes. Since it runs on nodejs, it is friendly to JS/TS developers.
 
 If you feel good, you can support [x-crawl repository](https://github.com/coder-hxl/x-crawl) with a Star.
 
 ## Features
 
-- Support asynchronous/synchronous way to crawl data.
-- Support Promise/Callback method to get the result.
-- Anthropomorphic request interval.
-- Crawl pages, JSON, file resources, etc. with simple configuration.
-- Polling function, timing crawling.
-- The built-in puppeteer crawls the page and uses the jsdom library to parse the page.
-- Written in TypeScript, has type hints, and provides generics.
+- Support asynchronous/synchronous ways to crawl data.
+- Support 3 ways to get results: Promise, Callback, and Promise + Callback.
+- requestConfig supports 5 ways of writing.
+- Anthropomorphic request interval.
+- Crawl pages, JSON, file resources, etc. with simple configuration.
+- Polling function, timed crawling.
+- The built-in puppeteer crawls the page and uses the jsdom library to parse it, or you can parse the page yourself.
+- Written in TypeScript, has type hints, and provides generics.
 
 ## Relationship with puppeteer
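For reference, the three result modes named in the new feature list combine with crawlFile as in the minimal sketch below, assembled from the crawlFile snippets later in this commit. The URLs are placeholders, and constructing the instance with no base config is an assumption:

```js
import xCrawl from 'x-crawl'

const myXCrawl = xCrawl()
const requestConfig = ['https://xxx.com/xxxx', 'https://xxx.com/xxxx']

// 1. Promise: resolves once with all results
myXCrawl
  .crawlFile({ requestConfig, fileConfig: { storeDir: './upload' } })
  .then((fileInfos) => console.log('Promise: ', fileInfos))

// 2. Callback: fires once per crawled resource
myXCrawl.crawlFile(
  { requestConfig, fileConfig: { storeDir: './upload' } },
  (fileInfo) => console.log('Callback: ', fileInfo)
)

// 3. Promise + Callback: both of the above on one call
myXCrawl
  .crawlFile(
    { requestConfig, fileConfig: { storeDir: './upload' } },
    (fileInfo) => console.log('Callback: ', fileInfo)
  )
  .then((fileInfos) => console.log('Promise: ', fileInfos))
```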

docs/cn.md

+4 −3

@@ -2,18 +2,19 @@
 
 [English](https://github.com/coder-hxl/x-crawl#x-crawl) | 简体中文
 
-x-crawl is a flexible nodejs crawler library. It is used to batch crawl data, make network requests, and download file resources. It supports crawling data asynchronously or synchronously. Since it runs on nodejs, it is friendly to JS/TS developers.
+x-crawl is a flexible nodejs crawler library. It is used to crawl pages, make batch network requests, and batch-download file resources. It offers 5 ways of writing requestConfig, 3 ways of getting results, and asynchronous or synchronous crawling modes. It runs on nodejs and is friendly to JS/TS developers.
 
 If you think it is good, you can give the [x-crawl repository](https://github.com/coder-hxl/x-crawl) a Star to show support.
 
 ## Features
 
 - Support asynchronous/synchronous ways to crawl data.
-- Support Promise/Callback ways to get results.
+- Support 3 ways to get results: Promise, Callback, and Promise + Callback.
+- requestConfig supports 5 ways of writing.
 - Anthropomorphic request interval.
 - Crawl pages, JSON, file resources, etc. with simple configuration.
 - Polling function, timed crawling.
-- The built-in puppeteer crawls the page, and the jsdom library is used to parse it.
+- The built-in puppeteer crawls the page, and the jsdom library is used to parse it; you can also parse it yourself.
 - Written in TypeScript, has type hints, and provides generics.
 
 ## Relationship with puppeteer

package.json

+1 −1

@@ -1,7 +1,7 @@
 {
   "private": true,
   "name": "x-crawl",
-  "version": "3.2.2",
+  "version": "3.2.3",
   "author": "coderHXL",
   "description": "x-crawl is a flexible nodejs crawler library. ",
   "license": "MIT",

publish/README.md

+16 −25
@@ -2,19 +2,20 @@
 
 English | [简体中文](https://github.com/coder-hxl/x-crawl/blob/main/docs/cn.md)
 
-x-crawl is a flexible nodejs crawler library. It is used to batch crawl data, network requests and download file resources. Support crawling data asynchronously or synchronously. Since it runs on nodejs, it is friendly to JS/TS developers.
+x-crawl is a flexible nodejs crawler library. It is used to crawl pages, make batch network requests, and batch-download file resources. It offers 5 ways of writing requestConfig, 3 ways of getting results, and asynchronous or synchronous crawling modes. Since it runs on nodejs, it is friendly to JS/TS developers.
 
 If you feel good, you can support [x-crawl repository](https://github.com/coder-hxl/x-crawl) with a Star.
 
 ## Features
 
-- Support asynchronous/synchronous way to crawl data.
-- Support Promise/Callback method to get the result.
-- Anthropomorphic request interval.
-- Crawl pages, JSON, file resources, etc. with simple configuration.
-- Polling function, timing crawling.
-- The built-in puppeteer crawls the page and uses the jsdom library to parse the page.
-- Written in TypeScript, has type hints, and provides generics.
+- Support asynchronous/synchronous ways to crawl data.
+- Support 3 ways to get results: Promise, Callback, and Promise + Callback.
+- requestConfig supports 5 ways of writing.
+- Anthropomorphic request interval.
+- Crawl pages, JSON, file resources, etc. with simple configuration.
+- Polling function, timed crawling.
+- The built-in puppeteer crawls the page and uses the jsdom library to parse it, or you can parse the page yourself.
+- Written in TypeScript, has type hints, and provides generics.
 
 ## Relationship with puppeteer

@@ -95,7 +96,6 @@ Regular crawling: Get the recommended pictures of the youtube homepage every oth
 
 ```js
 // 1.Import module ES/CJS
-import path from 'node:path'
 import xCrawl from 'x-crawl'
 
 // 2.Create a crawler instance
@@ -125,13 +125,7 @@ myXCrawl.startPolling({ d: 1 }, () => {
 })
 
 // Call the crawlFile API to crawl pictures
-myXCrawl.crawlFile({
-  requestConfig,
-  fileConfig: { storeDir: path.resolve(__dirname, './upload') }
-})
-
-// Close the browser
-browser.close()
+myXCrawl.crawlFile({ requestConfig, fileConfig: { storeDir: './upload' } })
 })
 })
 ```
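The hunk above swaps path.resolve(__dirname, './upload') for a bare relative path. Assuming x-crawl hands storeDir straight to Node's fs APIs, a relative path resolves against process.cwd() rather than the script's directory, so the two spellings only coincide when the process is started from the script's own folder. A hedged comparison:

```js
import path from 'node:path'

// Illustrative only; __dirname assumes a CommonJS context.
path.resolve('./upload')            // -> <current working directory>/upload
path.resolve(__dirname, './upload') // -> <directory of this script>/upload
```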
@@ -260,7 +254,6 @@ myXCrawl.crawlData({ requestConfig }).then(res => {
 Crawl file data via [crawlFile()](#crawlFile)
 
 ```js
-import path from 'node:path'
 import xCrawl from 'x-crawl'
 
 const myXCrawl = xCrawl({
@@ -274,7 +267,7 @@ myXCrawl
   .crawlFile({
     requestConfig,
     fileConfig: {
-      storeDir: path.resolve(__dirname, './upload') // storage folder
+      storeDir: './upload' // storage folder
     }
   })
   .then((fileInfos) => {
@@ -299,9 +292,7 @@ myXCrawl.startPolling({ h: 2, m: 30 }, (count, stopPolling) => {
 // crawlPage/crawlData/crawlFile
 myXCrawl.crawlPage('https://xxx.com').then(res => {
   const { jsdom, browser, page } = res
-
-  // Close the browser
-  browser.close()
+
 })
 })
 ```
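The remaining hunks all revolve around the requestConfig parameter, which the new feature list says supports 5 ways of writing. Only the single-string and string-array forms actually appear in this diff; the object-based forms below, and their url/method fields, are assumptions sketched for illustration:

```js
// 1. A single URL string (seen above with crawlPage)
const single = 'https://xxx.com/xxxx'

// 2. An array of URL strings (seen in the next hunk)
const list = ['https://xxx.com/xxxx', 'https://xxx.com/xxxx']

// 3. A config object (assumed shape)
const detailed = { url: 'https://xxx.com/xxxx', method: 'GET' }

// 4. An array of config objects (assumed shape)
const detailedList = [
  { url: 'https://xxx.com/xxxx' },
  { url: 'https://xxx.com/xxxx', method: 'POST' }
]

// 5. A mixed array of strings and objects (assumed shape)
const mixed = ['https://xxx.com/xxxx', { url: 'https://xxx.com/xxxx' }]
```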
@@ -414,7 +405,7 @@ const requestConfig = [ 'https://xxx.com/xxxx', 'https://xxx.com/xxxx', 'https:/
 myXCrawl
   .crawlFile({
     requestConfig,
-    fileConfig: { storeDir: path.resolve(__dirname, './upload') }
+    fileConfig: { storeDir: './upload' }
   })
   .then((fileInfos) => {
     console.log('Promise: ', fileInfos)
@@ -424,7 +415,7 @@ myXCrawl
 myXCrawl.crawlFile(
   {
     requestConfig,
-    fileConfig: { storeDir: path.resolve(__dirname, './upload') }
+    fileConfig: { storeDir: './upload' }
   },
   (fileInfo) => {
     console.log('Callback: ', fileInfo)
@@ -436,7 +427,7 @@ myXCrawl
   .crawlFile(
     {
       requestConfig,
-      fileConfig: { storeDir: path.resolve(__dirname, './upload') }
+      fileConfig: { storeDir: './upload' }
     },
     (fileInfo) => {
       console.log('Callback: ', fileInfo)
@@ -589,7 +580,7 @@ myXCrawl
   .crawlFile({
     requestConfig,
     fileConfig: {
-      storeDir: path.resolve(__dirname, './upload') // storage folder
+      storeDir: './upload' // storage folder
     }
   })
   .then((fileInfos) => {

publish/package.json

+1 −1

@@ -1,6 +1,6 @@
 {
   "name": "x-crawl",
-  "version": "3.2.2",
+  "version": "3.2.3",
   "author": "coderHXL",
   "description": "x-crawl is a flexible nodejs crawler library.",
   "license": "MIT",
