
Commit efa4263

yuyutaotao and zhoushaw authored
feat: export yaml runner in javascript (#368)
--------- Co-authored-by: zhouxiao.shaw <[email protected]>
1 parent 01b2461 commit efa4263

8 files changed: +235 -28 lines changed


apps/site/docs/en/API.md

+32 -10
@@ -21,7 +21,7 @@ And also, puppeteer agent has an extra option:

 These are the main methods on all kinds of agents in Midscene.

-> In the following documentation, you may see functions called with the `mid.` prefix. If you use destructuring in Playwright, like `async ({ ai, aiQuery }) => { /* ... */}`, you can call the functions without this prefix. It's just a matter of syntax.
+> In the following documentation, you may see functions called with the `agent.` prefix. If you use destructuring in Playwright, like `async ({ ai, aiQuery }) => { /* ... */}`, you can call the functions without this prefix. It's just a matter of syntax.

 ### `.aiAction(steps: string)` or `.ai(steps: string)` - Interact with the page

@@ -32,11 +32,11 @@ You can use `.aiAction` to perform a series of actions. It accepts a `steps: str
 These are some good samples:

 ```typescript
-await mid.aiAction('Enter "Learn JS today" in the task box, then press Enter to create');
-await mid.aiAction('Move your mouse over the second item in the task list and click the Delete button to the right of the second task');
+await agent.aiAction('Enter "Learn JS today" in the task box, then press Enter to create');
+await agent.aiAction('Move your mouse over the second item in the task list and click the Delete button to the right of the second task');

 // use `.ai` shortcut
-await mid.ai('Click the "completed" status button below the task list');
+await agent.ai('Click the "completed" status button below the task list');
 ```

 Steps should always be clearly and thoroughly described. A very brief prompt like 'Tweet "Hello World"' will result in unstable performance and a high likelihood of failure.
@@ -62,7 +62,7 @@ You can extract customized data from the UI. Provided that the multi-modal AI ca
 For example, to parse detailed information from page:

 ```typescript
-const dataA = await mid.aiQuery({
+const dataA = await agent.aiQuery({
   time: 'date and time, string',
   userInfo: 'user info, {name: string}',
   tableFields: 'field names of table, string[]',
@@ -74,18 +74,18 @@ You can also describe the expected return value format as a plain string:

 ```typescript
 // dataB will be a string array
-const dataB = await mid.aiQuery('string[], task names in the list');
+const dataB = await agent.aiQuery('string[], task names in the list');

 // dataC will be an array with objects
-const dataC = await mid.aiQuery('{name: string, age: string}[], Data Record in the table');
+const dataC = await agent.aiQuery('{name: string, age: string}[], Data Record in the table');
 ```

 ### `.aiAssert(assertion: string, errorMsg?: string)` - do an assertion

 `.aiAssert` works just like the normal `assert` method, except that the condition is a prompt string written in natural language. Midscene will call AI to determine if the `assertion` is true. If the condition is not met, an error will be thrown containing `errorMsg` and a detailed reason generated by AI.

 ```typescript
-await mid.aiAssert('The price of "Sauce Labs Onesie" is 7.99');
+await agent.aiAssert('The price of "Sauce Labs Onesie" is 7.99');
 ```

 :::tip
@@ -94,7 +94,7 @@ Assertions are usually a very important part of your script. To prevent the poss
 For example, to replace the previous assertion,

 ```typescript
-const items = await mid.aiQuery(
+const items = await agent.aiQuery(
   '"{name: string, price: number}[], return item name and price of each item',
 );
 const onesieItem = items.find(item => item.name === 'Sauce Labs Onesie');
@@ -110,7 +110,29 @@ expect(onesieItem.price).toBe(7.99);
 When considering the time required for the AI service, `.aiWaitFor` may not be very efficient. Using a simple `sleep` method might be a useful alternative to `waitFor`.

 ```typescript
-await mid.aiWaitFor("there is at least one headphone item on page");
+await agent.aiWaitFor("there is at least one headphone item on page");
+```
+
+### `.runYaml(yamlScriptContent: string)` - run a yaml script
+
+`.runYaml` will run the `tasks` part of the yaml script and return the result of all the `.aiQuery` calls (if any). The `target` part of the yaml script will be ignored in this function.
+
+To ignore some errors while running, you can set the `continueOnError` option in the yaml script. For more details about the yaml script schema, please refer to [Automate with Scripts in YAML](./automate-with-scripts-in-yaml).
+
+```typescript
+const { result } = await agent.runYaml(`
+tasks:
+  - name: search weather
+    flow:
+      - ai: input 'weather today' in input box, click search button
+      - sleep: 3000
+
+  - name: query weather
+    flow:
+      - aiQuery: "the result shows the weather info, {description: string}"
+        name: weather
+`);
+console.log(result);
 ```

 ## Properties
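
Reader's note on the `.runYaml` section added in this file: the sketch below shows one way a caller might wrap the call with error handling. Only the `runYaml(yamlScriptContent)` signature and its `{ result }` return shape come from this commit. The `YamlRunner` interface and the `fetchWeatherDescription` helper are hypothetical names introduced for illustration. It also assumes that `continueOnError` is a task-level key and that an `aiQuery` result is stored under the step's `name`; check both against the YAML docs before relying on them.

```typescript
// Hypothetical minimal agent shape; only runYaml's signature is taken from the commit.
interface YamlRunner {
  runYaml(yamlScriptContent: string): Promise<{ result: Record<string, any> }>;
}

const weatherScript = `
tasks:
  - name: search weather
    continueOnError: true # assumption: task-level option described in the YAML docs
    flow:
      - ai: input 'weather today' in input box, click search button
      - sleep: 3000

  - name: query weather
    flow:
      - aiQuery: "the result shows the weather info, {description: string}"
        name: weather
`;

// Returns the queried description, or undefined if the script failed.
async function fetchWeatherDescription(
  agent: YamlRunner,
): Promise<string | undefined> {
  try {
    const { result } = await agent.runYaml(weatherScript);
    // Assumption: the aiQuery output is keyed by the step's `name`.
    return result.weather?.description;
  } catch (err) {
    // Per the agent.ts change in this commit, runYaml throws a single Error
    // whose message lists every failed task.
    console.error((err as Error).message);
    return undefined;
  }
}
```

Because the helper depends only on the `YamlRunner` shape, it can be exercised with a stub object in unit tests without launching a browser.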

apps/site/docs/en/automate-with-scripts-in-yaml.mdx

+22
@@ -206,10 +206,32 @@ You can use the environment variable in the `.yaml` file like this:
 ```

 ## Use bridge mode
+
 By using bridge mode, you can utilize YAML scripts to automate the web browser on your desktop. This is particularly useful if you want to reuse cookies, plugins, and page states, or if you want to manually interact with automation scripts.

 See [Bridge Mode by Chrome Extension](./bridge-mode-by-chrome-extension) for more details.

+## Run yaml script with javascript
+
+You can also run a yaml script with javascript by using the `runYaml` method of the Midscene agent. Only the `tasks` part of the yaml script will be executed.
+
+```typescript
+const { result } = await agent.runYaml(`
+tasks:
+  - name: search weather
+    flow:
+      - ai: input 'weather today' in input box, click search button
+      - sleep: 3000
+
+  - name: query weather
+    flow:
+      - aiQuery: "the result shows the weather info, {description: string}"
+        name: weather
+`);
+```
+
+For more details about the agent API, please refer to [API](./api).
+
 ## FAQ

 **How to get cookies in JSON format from Chrome?**

apps/site/docs/zh/API.md

+32 -10
@@ -21,7 +21,7 @@ Midscene 中每个 Agent 都有自己的构造函数。

 这些是 Midscene 中各类 Agent 的主要 API。

-> 在以下文档中,你可能会看到带有 `mid.` 前缀的函数调用。如果你在 Playwright 中使用了解构赋值(object destructuring),如 `async ({ ai, aiQuery }) => { /* ... */}`,你可以不带这个前缀进行调用。这只是语法的区别。
+> 在以下文档中,你可能会看到带有 `agent.` 前缀的函数调用。如果你在 Playwright 中使用了解构赋值(object destructuring),如 `async ({ ai, aiQuery }) => { /* ... */}`,你可以不带这个前缀进行调用。这只是语法的区别。

 ### `.aiAction(steps: string)``.ai(steps: string)` - 控制界面

@@ -32,11 +32,11 @@ Midscene 中每个 Agent 都有自己的构造函数。
 以下是一些正确示例:

 ```typescript
-await mid.aiAction('在任务框中输入 "Learn JS today",然后按回车键创建任务');
-await mid.aiAction('将鼠标移动到任务列表中的第二项,然后点击第二个任务右侧的删除按钮');
+await agent.aiAction('在任务框中输入 "Learn JS today",然后按回车键创建任务');
+await agent.aiAction('将鼠标移动到任务列表中的第二项,然后点击第二个任务右侧的删除按钮');

 // 使用 `.ai` 简写
-await mid.ai('点击任务列表下方的 "completed" 状态按钮');
+await agent.ai('点击任务列表下方的 "completed" 状态按钮');
 ```

 务必使用清晰、详细的步骤描述。使用非常简略的指令(如 “发一条微博” )会导致非常不稳定的执行结果或运行失败。
@@ -62,7 +62,7 @@ await mid.ai('点击任务列表下方的 "completed" 状态按钮');
 例如,从页面解析详细信息:

 ```typescript
-const dataA = await mid.aiQuery({
+const dataA = await agent.aiQuery({
   time: '左上角展示的日期和时间,string',
   userInfo: '用户信息,{name: string}',
   tableFields: '表格的字段名,string[]',
@@ -72,18 +72,18 @@ const dataA = await mid.aiQuery({
 你也可以用纯字符串描述预期的返回值格式:

 // dataB 将是一个字符串数组
-const dataB = await mid.aiQuery('string[],列表中的任务名称');
+const dataB = await agent.aiQuery('string[],列表中的任务名称');

 // dataC 将是一个包含对象的数组
-const dataC = await mid.aiQuery('{name: string, age: string}[], 表格中的数据记录');
+const dataC = await agent.aiQuery('{name: string, age: string}[], 表格中的数据记录');
 ```

 ### `.aiAssert(assertion: string, errorMsg?: string)` - 进行断言

 `.aiAssert` 的功能类似于一般的断言(assert)方法,但可以用自然语言编写条件参数 `assertion`。Midscene 会调用 AI 来判断条件是否为真。若条件不满足,SDK 会抛出一个错误并在 `errorMsg` 后附上 AI 生成的错误原因。

 ```typescript
-await mid.aiAssert('"Sauce Labs Onesie" 的价格是 7.99');
+await agent.aiAssert('"Sauce Labs Onesie" 的价格是 7.99');
 ```

 :::tip
@@ -92,7 +92,7 @@ await mid.aiAssert('"Sauce Labs Onesie" 的价格是 7.99');
 例如你可以这么替代上一个断言代码:

 ```typescript
-const items = await mid.aiQuery(
+const items = await agent.aiQuery(
   '"{name: string, price: number}[], 返回商品名称和价格列表)',
 );
 const onesieItem = items.find(item => item.name === 'Sauce Labs Onesie');
@@ -108,7 +108,29 @@ expect(onesieItem.price).toBe(7.99);
 考虑到 AI 服务的时间消耗,`.aiWaitFor` 并不是一个特别高效的方法。使用一个普通的 `sleep` 可能是替代 `waitFor` 的另一种方式。

 ```typescript
-await mid.aiWaitFor("界面上至少有一个耳机的信息");
+await agent.aiWaitFor("界面上至少有一个耳机的信息");
+```
+
+### `.runYaml(yamlScriptContent: string)` - 运行一个 yaml 脚本
+
+`.runYaml` 会运行 yaml 脚本中的 `tasks` 部分,并返回所有 `.aiQuery` 调用的结果(如果存在此类调用)。yaml 脚本中的 `target` 部分将被忽略。
+
+如果想要忽略 yaml 脚本运行中的错误,可以在 yaml 脚本中设置 `continueOnError` 选项。更多关于 yaml 脚本的信息,请参考 [Automate with Scripts in YAML](./automate-with-scripts-in-yaml)
+
+```typescript
+const { result } = await agent.runYaml(`
+tasks:
+  - name: search weather
+    flow:
+      - ai: input 'weather today' in input box, click search button
+      - sleep: 3000
+
+  - name: query weather
+    flow:
+      - aiQuery: "the result shows the weather info, {description: string}"
+        name: weather
+`);
+console.log(result);
 ```

 ## 属性

apps/site/docs/zh/automate-with-scripts-in-yaml.mdx

+21
@@ -211,6 +211,27 @@ topic=weather today

 请参阅 [通过 Chrome 扩展桥接模式](./bridge-mode-by-chrome-extension) 了解更多详细信息。

+## 使用 JavaScript 运行 YAML 脚本
+
+你也可以使用 JavaScript 运行 YAML 脚本,调用 Agent 上的 `runYaml` 方法即可。注意,这种方法只会执行 YAML 脚本中的 `tasks` 部分。
+
+```typescript
+const { result } = await agent.runYaml(`
+tasks:
+  - name: search weather
+    flow:
+      - ai: input 'weather today' in input box, click search button
+      - sleep: 3000
+
+  - name: query weather
+    flow:
+      - aiQuery: "the result shows the weather info, {description: string}"
+        name: weather
+`);
+```
+
+更多关于 agent 的 API,请参考 [API](./api)
+
 ## FAQ

 **如何从 Chrome 中获取 JSON 格式的 Cookies?**

packages/web-integration/src/common/agent.ts

+25
@@ -11,6 +11,7 @@ import {
 } from '@midscene/core';
 import { NodeType } from '@midscene/shared/constants';

+import { ScriptPlayer, parseYamlScript } from '@/yaml';
 import { MIDSCENE_USE_VLM_UI_TARS, getAIConfig } from '@midscene/core/env';
 import {
   groupedActionDumpFileExt,
@@ -252,6 +253,30 @@ export class PageAgent<PageType extends WebPage = WebPage> {
     );
   }

+  async runYaml(yamlScriptContent: string): Promise<{
+    result: Record<string, any>;
+  }> {
+    const script = parseYamlScript(yamlScriptContent, 'yaml', true);
+    const player = new ScriptPlayer(script, async (target) => {
+      return { agent: this, freeFn: [] };
+    });
+    await player.run();
+
+    if (player.status === 'error') {
+      const errors = player.taskStatusList
+        .filter((task) => task.status === 'error')
+        .map((task) => {
+          return `task - ${task.name}: ${task.error?.message}`;
+        })
+        .join('\n');
+      throw new Error(`Error(s) occurred in running yaml script:\n${errors}`);
+    }
+
+    return {
+      result: player.result,
+    };
+  }
+
   async destroy() {
     await this.page.destroy();
   }
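
Reader's note on the error path of `runYaml`: the filter/map/join step can be read in isolation. The sketch below reproduces only that step on plain data. `TaskStatus` is a simplified, hypothetical stand-in for the player's task status type (the set of states is an assumption), and `summarizeTaskErrors` is a name introduced here for illustration.

```typescript
// Simplified stand-in for the player's per-task status record (assumed shape).
interface TaskStatus {
  name: string;
  status: 'init' | 'running' | 'done' | 'error';
  error?: Error;
}

// Collects one line per failed task, in the same format the commit uses.
function summarizeTaskErrors(taskStatusList: TaskStatus[]): string {
  return taskStatusList
    .filter((task) => task.status === 'error')
    .map((task) => `task - ${task.name}: ${task.error?.message}`)
    .join('\n');
}

const summary = summarizeTaskErrors([
  { name: 'search weather', status: 'done' },
  { name: 'query weather', status: 'error', error: new Error('element not found') },
]);
console.log(summary); // prints "task - query weather: element not found"
```

One consequence of the `task.error?.message` interpolation: a task marked as failed without an attached error object shows up as `task - <name>: undefined` in the combined message.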

packages/web-integration/src/yaml/player.ts

+1 -1
@@ -35,7 +35,7 @@ export class ScriptPlayer {
     public onTaskStatusChange?: (taskStatus: ScriptPlayerTaskStatus) => void,
   ) {
     this.result = {};
-    this.output = script.target.output;
+    this.output = script.target?.output;
     this.taskStatusList = (script.tasks || []).map((task, taskIndex) => ({
       ...task,
       index: taskIndex,

packages/web-integration/src/yaml/utils.ts

+13 -7
@@ -24,19 +24,25 @@ function interpolateEnvVars(content: string): string {
 export function parseYamlScript(
   content: string,
   filePath?: string,
+  ignoreCheckingTarget?: boolean,
 ): MidsceneYamlScript {
   const interpolatedContent = interpolateEnvVars(content);
   const obj = yaml.load(interpolatedContent) as MidsceneYamlScript;
   const pathTip = filePath ? `, failed to load ${filePath}` : '';
-  assert(obj.target, `property "target" is required in yaml script${pathTip}`);
-  assert(
-    typeof obj.target === 'object',
-    `property "target" must be an object${pathTip}`,
-  );
-  assert(obj.tasks, `property "tasks" is required in yaml script${pathTip}`);
+  if (!ignoreCheckingTarget) {
+    assert(
+      obj.target,
+      `property "target" is required in yaml script${pathTip}`,
+    );
+    assert(
+      typeof obj.target === 'object',
+      `property "target" must be an object${pathTip}`,
+    );
+  }
+  assert(obj.tasks, `property "tasks" is required in yaml script ${pathTip}`);
   assert(
     Array.isArray(obj.tasks),
-    `property "tasks" must be an array${pathTip}`,
+    `property "tasks" must be an array in yaml script, but got ${obj.tasks}`,
   );
   return obj;
 }
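
Reader's note on the new `ignoreCheckingTarget` flag: the sketch below applies the same validation order to an already-parsed object, so the effect of the flag can be seen without `yaml.load` or environment interpolation. `check`, `ParsedScript`, and `validateScript` are hypothetical names introduced here, and the error messages are paraphrased rather than copied from the commit.

```typescript
// Tiny assert helper so the sketch needs no imports.
function check(condition: unknown, message: string): void {
  if (!condition) throw new Error(message);
}

// Hypothetical shape of a parsed script; the real type is MidsceneYamlScript.
interface ParsedScript {
  target?: unknown;
  tasks?: unknown;
}

function validateScript(
  obj: ParsedScript,
  filePath?: string,
  ignoreCheckingTarget?: boolean,
): ParsedScript {
  const pathTip = filePath ? `, failed to load ${filePath}` : '';
  // The target checks run only when the caller has not opted out.
  if (!ignoreCheckingTarget) {
    check(obj.target, `property "target" is required${pathTip}`);
    check(typeof obj.target === 'object', `property "target" must be an object${pathTip}`);
  }
  // The tasks checks always run.
  check(obj.tasks, `property "tasks" is required${pathTip}`);
  check(Array.isArray(obj.tasks), `property "tasks" must be an array${pathTip}`);
  return obj;
}

// A tasks-only script passes when the flag is set, as runYaml does.
const tasksOnly = validateScript({ tasks: [] }, 'yaml', true);
console.log(Array.isArray(tasksOnly.tasks)); // prints true
```

Calling `validateScript({ tasks: [] }, 'yaml')` without the flag throws on the missing `target`, which matches the pre-commit behaviour for every caller.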
