web-infra-dev
diff --git a/apps/site/docs/en/docs/getting-started/_meta.json b/apps/site/docs/en/docs/getting-started/_meta.json
Lines changed: 2 additions & 1 deletion
diff --git a/apps/site/docs/en/docs/getting-started/demo.md b/apps/site/docs/en/docs/getting-started/demo.md
Lines changed: 8 additions & 0 deletions
diff --git a/apps/site/docs/en/docs/getting-started/introduction.mdx b/apps/site/docs/en/docs/getting-started/introduction.mdx
Lines changed: 1 addition & 5 deletions
diff --git a/apps/site/docs/en/docs/getting-started/quick-start.md b/apps/site/docs/en/docs/getting-started/quick-start.md
Lines changed: 5 additions & 17 deletions
diff --git a/apps/site/docs/en/docs/more/faq.md b/apps/site/docs/en/docs/more/faq.md
Lines changed: 8 additions & 7 deletions
diff --git a/apps/site/docs/public/MidScene_L.mp4 b/apps/site/docs/public/MidScene_L.mp4
-9.36 MB
diff --git a/apps/site/docs/public/Visualizer.gif b/apps/site/docs/public/Visualizer.gif
-797 KB
diff --git a/apps/site/docs/public/visualizer.jpg b/apps/site/docs/public/visualizer.jpg
349 KB
diff --git a/apps/site/docs/zh/docs/getting-started/_meta.json b/apps/site/docs/zh/docs/getting-started/_meta.json
Lines changed: 2 additions & 1 deletion
diff --git a/apps/site/docs/zh/docs/getting-started/demo.md b/apps/site/docs/zh/docs/getting-started/demo.md
Lines changed: 8 additions & 0 deletions
diff --git a/apps/site/docs/zh/docs/getting-started/introduction.mdx b/apps/site/docs/zh/docs/getting-started/introduction.mdx
Lines changed: 3 additions & 7 deletions
diff --git a/apps/site/docs/zh/docs/getting-started/quick-start.md b/apps/site/docs/zh/docs/getting-started/quick-start.md
Lines changed: 5 additions & 8 deletions
diff --git a/apps/site/docs/zh/docs/more/faq.md b/apps/site/docs/zh/docs/more/faq.md
Lines changed: 8 additions & 7 deletions
diff --git a/packages/midscene/src/types.ts b/packages/midscene/src/types.ts
Lines changed: 5 additions & 0 deletions
@@ -1,4 +1,5 @@
 [
   "introduction",
-  "quick-start.md"
+  "quick-start.md",
+  "demo.md"
 ]
@@ -0,0 +1,8 @@
+# Demo Projects
+
+You can clone a complete demo project in this repo: https://github.com/web-infra-dev/midscene-example/
+
+There are different folders for different types of projects:
+
+* [Playwright-demo](https://github.com/web-infra-dev/midscene-example/blob/main/playwright-demo)
+* [Puppeteer-demo](https://github.com/web-infra-dev/midscene-example/blob/main/puppeteer-demo)
@@ -1,9 +1,5 @@
 # Introduction
 
-<video controls>
-  <source src="/MidScene_L.mp4" type="video/mp4" />
-</video>
-
 UI automation can be frustrating, often involving a maze of *#ids*, *data-test-xxx* attributes, and *.selectors* that are difficult to maintain, especially when the page undergoes a refactor.
 
 Introducing MidScene.js, an innovative SDK designed to bring joy back to programming by simplifying automation tasks.
@@ -38,7 +34,7 @@ With our visualization tool, you can easily debug the prompt and AI response. Al
 
 You may open the [Online Visualization Tool](/visualization/index.html) to see the showcase.
 
-![](/Visualizer.gif)
+![](/visualizer.jpg)
 
 ## Flow Chart
 
 
@@ -1,6 +1,6 @@
 # Quick Start
 
-In this example, we use OpenAI GPT-4o to search headphones on ebay, and then get the result items and prices in JSON format. 
+In this example, we use OpenAI GPT-4o to search headphones on eBay, and then get the result items and prices in JSON format. 
 
 Remember to prepare an API key that is eligible for accessing OpenAI's GPT-4o before running.
 
@@ -13,14 +13,6 @@ Config the API key
 export OPENAI_API_KEY="sk-abcdefghijklmnopqrstuvwxyz"
 ```
 
-Install Dependencies
-
-```bash
-npm install @midscene/webaeb --save-dev
-# for demo use
-npm install puppeteer ts-node --save-dev 
-```
-
 ## Integrate with Playwright
 
 > [Playwright.js](https://playwright.com/) is an open-source automation library developed by Microsoft, primarily designed for end-to-end testing and web scraping of web applications.
@@ -92,10 +84,11 @@ npx playwright test ./e2e/ebay-search.spec.ts
 
 ### Step 5. view test report after running
 
-Follow the instructions in the command line to server the report
+Follow the instructions in the command line to serve the report.
 
 ```bash
-
+# sample command
+npx http-server ./midscene_run/report -p 9888 -o -s
 ```
 
 ## Integrate with Puppeteer
@@ -165,7 +158,7 @@ await mid.aiQuery(
 
 ### Step 3. run
 
-Using ts-node to run, you will get the data of Headphones on ebay:
+Using ts-node to run, you will get the data of Headphones on eBay:
 
 ```bash
 # run
@@ -189,8 +182,3 @@ npx ts-node demo.ts
 After running, MidScene will generate a log dump, which is placed in `./midscene_run/report/latest.web-dump.json` by default. Then put this file into [Visualization Tool](/visualization/), and you will have a clearer understanding of the process.
 
 Click the 'Load Demo' button in the [Visualization Tool](/visualization/), you will be able to see the results of the previous code as well as some other samples.
-
-
-## Demo Projects
-
-You can clone a complete demo project in this repo: https://github.com/web-infra-dev/midscene-example/
@@ -23,16 +23,17 @@ MidScene needs a multimodal Large Language Model (LLM) to understand the UI. Cur
 
 ### About the token cost
 
-Image resolution and element numbers (i.e., a UI context size created by MidScene) form the token bill.
+Image resolution and the number of elements (i.e., the size of the UI context created by MidScene) will affect the token bill.
 
-Here are some typical data.
+Here are some typical data with GPT-4o.
 
-|Task | Resolution | Input tokens | Output tokens | GPT-4o Price |
-|-----|------------|--------------|---------------|----------------|
-|Find the download button on the VSCode website| 1920x1080| 2011|54| $0.011|
-|Split the Github status page| 1920x1080| 3609|1020| $0.034|
+|Task | Resolution | Prompt Tokens / Price | Completion Tokens / Price |
+|-----|------------|--------------|---------------|
+|Plan the steps to search on eBay homepage| 1280x800 | 6,975 / $0.034875 |150 / $0.00225|
+|Locate the search box on the eBay homepage| 1280x800 | 8,004 / $0.04002 | 92 / $0.00138|
+|Query the information about the item in the search results| 1280x800 | 13,403 / $0.067015 | 95 / $0.001425|
 
-> The price data was calculated in June 2024.
+> The price data was calculated in August 2024.
 
 ### The automation process is running more slowly than it did before
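
The GPT-4o figures in the `faq.md` table in this hunk are consistent with a flat per-token rate: each row works out to $5 per 1M prompt tokens and $15 per 1M completion tokens. The sketch below reproduces that arithmetic. The two rates are inferred from the table rather than stated in the docs, and the names `TokenUsage`, `TokenCost`, and `estimateCost` are hypothetical and not part of MidScene.

```typescript
// Assumed rates in USD per 1M tokens, inferred from the table; not read from any API.
const PROMPT_USD_PER_MILLION = 5;
const COMPLETION_USD_PER_MILLION = 15;

// Hypothetical shape for one row of the table.
interface TokenUsage {
  task: string;
  promptTokens: number;
  completionTokens: number;
}

interface TokenCost {
  promptUsd: number;
  completionUsd: number;
  totalUsd: number;
}

// Round to 6 decimal places to keep floating-point noise out of the result.
function roundUsd(value: number): number {
  return Math.round(value * 1e6) / 1e6;
}

// Price the prompt and completion tokens separately, then sum them.
function estimateCost(usage: TokenUsage): TokenCost {
  const promptUsd = roundUsd((usage.promptTokens * PROMPT_USD_PER_MILLION) / 1e6);
  const completionUsd = roundUsd(
    (usage.completionTokens * COMPLETION_USD_PER_MILLION) / 1e6,
  );
  return { promptUsd, completionUsd, totalUsd: roundUsd(promptUsd + completionUsd) };
}

// Token counts copied from the three rows of the table.
const rows: TokenUsage[] = [
  { task: 'Plan', promptTokens: 6975, completionTokens: 150 },
  { task: 'Locate', promptTokens: 8004, completionTokens: 92 },
  { task: 'Query', promptTokens: 13403, completionTokens: 95 },
];

for (const row of rows) {
  const cost = estimateCost(row);
  console.log(
    `${row.task}: prompt $${cost.promptUsd}, completion $${cost.completionUsd}, total $${cost.totalUsd}`,
  );
}
```

Prompt tokens dominate the cost in every row, which matches the note that image resolution and element count drive the bill.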
 
 
@@ -1,4 +1,5 @@
 [
   "introduction",
-  "quick-start.md"
+  "quick-start.md",
+  "demo.md"
 ]
@@ -0,0 +1,8 @@
+# Demo Projects
+
+You can clone the complete demo projects here: https://github.com/web-infra-dev/midscene-example/
+
+The repository provides integration examples for different types of projects:
+
+* [Playwright-demo](https://github.com/web-infra-dev/midscene-example/blob/main/playwright-demo)
+* [Puppeteer-demo](https://github.com/web-infra-dev/midscene-example/blob/main/puppeteer-demo)
@@ -1,12 +1,8 @@
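 # Introduction
 
-<video controls>
-  <source src="/MidScene_L.mp4" type="video/mp4" />
-</video>
+UI automation is too hard to write. Automation scripts are full of selectors such as `#ids`, `data-test-xxx`, and `.selectors`. When the page is refactored, maintaining those scripts will be even more of a disaster.
 
-UI automation is too hard to write. Automation scripts are full of selectors such as `#ids`, `data-test-xxx`, and `.selectors`. When the page is refactored, maintaining those scripts will will be even more of a disaster.
-
-Here we introduce MidScene.js. Powered by AI, it makes automation scripts simple and maintainable, helping you rediscover the joy of coding.
+Here we introduce MidScene.js, helping you rediscover the joy of coding.
 
 MidScene.js uses a multimodal large language model (LLM) that can intuitively "understand" your user interface and perform the necessary actions. You only need to describe the interaction steps or the expected data format, and the AI will complete the task for you.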
 
@@ -48,7 +44,7 @@ const dataB = await agent.aiQuery('string[], 任务列表中的任务名');
 
 You can open the [Visualization Tool](/visualization/index.html) to view the examples.
 
-![Visualization tool example](/Visualizer.gif)
+![](/visualizer.jpg)
 
 ## Flow Chart
 
 
@@ -1,6 +1,6 @@
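 # Quick Start
 
-In this example, we will use OpenAI GPT-4o to search for "headphones" on ebay and return the items and prices in JSON format.
+Let's use this requirement as an example: use OpenAI GPT-4o to search for "headphones" on eBay and return the items and prices in JSON format.
 
 Before running this example, make sure you have an API key that can call the OpenAI GPT-4o model.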
 
@@ -11,7 +11,7 @@
 Configure the API key
 
 ```bash
-# replace by your own
+# Replace with your own key
 export OPENAI_API_KEY="sk-abcdefghijklmnopqrstuvwxyz"
 ```
 
@@ -87,10 +87,11 @@ npx playwright test ./e2e/ebay-search.spec.ts
 
 ### Step 5. View the test report
 
-Follow the instructions in the command line to server the report
+Run the command shown in the command-line output to open the visual report.
 
 ```bash
-
+# Sample
+npx http-server ./midscene_run/report -p 9888 -o -s
 ```
 
 ## Integrate with Puppeteer
@@ -186,7 +187,3 @@ npx ts-node demo.ts
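 After running MidScene, a log file is generated, placed in `./midscene_run/report/latest.web-dump.json` by default. You can then import this file into the [Visualization Tool](/visualization/) to get a clearer understanding of the whole process.
 
 In the [Visualization Tool](/visualization/), click the `Load Demo` button to see the results of the code above along with some other examples.
-
-## Complete Demo Project
-
-You can clone the complete demo projects here: https://github.com/web-infra-dev/midscene-example/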
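@@ -19,21 +19,22 @@ MidScene has some limitations, and we are still working to improve it.
 
 ### About the token cost
 
-Token consumption has two parts: image resolution and the number of elements (i.e., the size of the UI context created by MidScene).
+Image resolution and the number of elements (i.e., the size of the UI context created by MidScene) significantly affect token consumption.
 
 Here is some typical data:
 
-| Task  | Resolution   | Input tokens | Output tokens | GPT-4o price |
-|-------|----------|----------|----------|--------------|
-| Find the download button on the VSCode website | 1920x1080  | 2011     | 54       | $0.011       |
-| Split the Github status page          | 1920x1080  | 3609     | 1020     | $0.034       |
+|Task | Resolution | Prompt Tokens / Price | Completion Tokens / Price |
+|-----|------------|--------------|---------------|
+|Plan the steps to perform the search| 1280x800| 6,975 / $0.034875 |150 / $0.00225|
+|Locate the search box| 1280x800 | 8,004 / $0.04002 | 92 / $0.00138 |
+|Query the item information| 1280x800| 13,403 / $0.067015 | 95 / $0.001425 |
 
-> The price data was calculated in June 2024
+> The price data was measured in August 2024
 
 ### Is the script running slowly?
 
 Because MidScene.js calls the AI every time it performs Planning and Query, its running time may be 3 to 10 times longer than a traditional Playwright case, for example going from 5 seconds to 20 seconds. For now, this is unavoidable. But as large language models (LLMs) advance, performance may improve in the future.
 
 Despite the longer running time, MidScene still performs well in practice. Its unique development experience makes the codebase easy to maintain. We believe automation scripts that integrate MidScene can significantly improve iteration efficiency, cover more scenarios, and raise overall productivity.
 
-In short, although it is slower, the time invested is certainly worth it.
+In short, although it is slower, the investment is certainly worth it.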
@@ -209,6 +209,10 @@ export interface ExecutorContext {
   element?: BaseElement | null;
 }
 
+export interface TaskCacheInfo {
+  hit: boolean;
+}
+
 export interface ExecutionTaskApply<
   Type extends ExecutionTaskType = any,
   TaskParam = any,
@@ -228,6 +232,7 @@ export interface ExecutionTaskReturn<TaskOutput = unknown, TaskLog = unknown> {
   output?: TaskOutput;
   log?: TaskLog;
   recorder?: ExecutionRecorderItem[];
+  cache?: TaskCacheInfo;
 }
 
 export type ExecutionTask<E extends ExecutionTaskApply<any, any, any> = ExecutionTaskApply<any, any, any>> =
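
The `types.ts` hunk adds an optional `cache` field to `ExecutionTaskReturn`, but the diff does not show who sets it or who reads it. As one possible reading, a consumer could tally hits and misses across task results. The sketch below assumes that usage: it redeclares trimmed local copies of the two interfaces (without `recorder`) so it runs on its own, and `summarizeCache` is a hypothetical helper, not part of MidScene.

```typescript
// Local copies of the shapes from the diff, trimmed to what this sketch needs.
interface TaskCacheInfo {
  hit: boolean;
}

interface ExecutionTaskReturn<TaskOutput = unknown, TaskLog = unknown> {
  output?: TaskOutput;
  log?: TaskLog;
  cache?: TaskCacheInfo;
}

interface CacheSummary {
  hits: number;
  misses: number;
  unknown: number;
}

// Hypothetical helper: count cache hits, misses, and results with no cache info.
function summarizeCache(results: ExecutionTaskReturn[]): CacheSummary {
  const summary: CacheSummary = { hits: 0, misses: 0, unknown: 0 };
  for (const result of results) {
    if (result.cache === undefined) {
      // `cache` is optional, so tasks that never report it are counted separately.
      summary.unknown += 1;
    } else if (result.cache.hit) {
      summary.hits += 1;
    } else {
      summary.misses += 1;
    }
  }
  return summary;
}

// Example data invented for illustration only.
const sampleResults: ExecutionTaskReturn[] = [
  { output: 'search box', cache: { hit: true } },
  { output: 'item list', cache: { hit: false } },
  { output: 'no cache info' },
];

console.log(summarizeCache(sampleResults));
```

Because `cache` is optional, the sketch keeps "not reported" apart from "missed" instead of folding both into a single miss count.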