chore: merge main branch
zhoushaw committed Sep 29, 2024
2 parents c17f870 + 032b505 commit b8cc35a
Showing 48 changed files with 32,956 additions and 5,178 deletions.
22 changes: 14 additions & 8 deletions apps/site/docs/en/docs/more/prompting-tips.md
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,19 @@ Bad ❌: "[number, number], the [x, y] coords of the main button"

Use the visualization tool to debug and understand each step of Midscene. Just upload the log, and view the AI's parse results. You can find [the tool](/visualization/) on the navigation bar on this site.

### Remember to cross-check the result by assertion
### Infer or assert from the interface, not the DOM properties or browser status

All the data sent to the LLM is in the form of screenshots and element coordinates. The DOM and the browser instance are almost invisible to the LLM. Therefore, ensure everything you expect is visible on the screen.

Good ✅: The title is blue

Bad ❌: The title has a `test-id-size` property

Bad ❌: The browser has two active tabs

Bad ❌: The request has finished.
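
One way to act on this rule is to send only screenshot-visible statements to the LLM and check everything else through the automation framework. The sketch below is a minimal illustration and not Midscene's actual API: the `VisualAgent` and `DomProbe` interfaces, the `aiAssert` and `getAttribute` signatures, and the `h1` selector are all assumptions made for this example.

```typescript
// Hypothetical interfaces for illustration only; they are not taken from Midscene.
interface VisualAgent {
  // Asks the LLM to confirm a statement using the screenshot.
  aiAssert(statement: string): Promise<void>;
}

interface DomProbe {
  // Reads an attribute through the automation framework, not through the LLM.
  getAttribute(selector: string, name: string): Promise<string | null>;
}

async function verifyTitle(agent: VisualAgent, dom: DomProbe): Promise<void> {
  // Visible on the screen, so the LLM can judge it.
  await agent.aiAssert('The title is blue');

  // Not visible in a screenshot, so check it in code instead.
  const size = await dom.getAttribute('h1', 'test-id-size');
  if (size === null) {
    throw new Error('Expected the title to carry a test-id-size attribute');
  }
}
```

Browser state (open tabs) and network state (finished requests) would follow the same split: assert them with the test framework rather than asking the LLM.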

### Cross-check the result using assertion

LLMs can behave incorrectly. It is better practice to check the result after each run.

Expand All @@ -49,13 +61,7 @@ expect(taskList.length).toBe(1);
expect(taskList[0]).toBe('Learning AI the day after tomorrow');
```

### Infer from the UI, not the DOM properties

All the data sent to the LLM consists of screenshots and element coordinates. The DOM is almost invisible to the LLM, so do not expect the LLM to infer any information from the DOM (such as `test-id-*` properties).

Ensure everything you expect from the LLM is visible in the screenshot.

### non-English prompting is acceptable
### Non-English prompting is acceptable

Since most AI models can understand many languages, feel free to write the prompt in any language you prefer. It usually works even if the prompt is in a language different from the page's language.

Expand Down
4 changes: 3 additions & 1 deletion apps/site/docs/en/docs/usage/cache.md
@@ -1,6 +1,6 @@
# Cache

Midscene.js provides AI caching capabilities to improve the stability and speed of the entire AI execution process. The cache here mainly refers to caching AI's recognition of page elements. When the page elements have not changed, the AI query results will be cached.
Midscene.js provides AI caching features to improve the stability and speed of the entire AI execution process. The cache mainly refers to caching how AI recognizes page elements. Cached AI query results are used if page elements haven't changed.

## Instructions

Expand All @@ -15,6 +15,8 @@ Currently, the caching capability is supported in all scenarios, and Midscene ca

**Effect**

After enabling the cache, the execution time is significantly reduced, for example, from 1m16s to 23s.

* **before**

![](/cache/no-cache-time.png)
Expand Down
18 changes: 12 additions & 6 deletions apps/site/docs/zh/docs/more/prompting-tips.md
Expand Up @@ -33,6 +33,18 @@

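Use the visualization tool to debug and understand each step of Midscene. Just upload the log to view the AI's parse results. You can find [the visualization tool](/visualization/) on this site's navigation bar.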

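### Infer from the interface, not from DOM properties or browser status

All the data passed to the LLM consists of screenshots and element coordinates. The DOM and the browser are almost invisible to the LLM. Therefore, make sure the information you want to extract appears in the screenshot and can be "seen" by the LLM.

Good ✅: The title is blue

Bad ❌: The title has a `test-id-size` property

Bad ❌: The browser has two tabs open

Bad ❌: The asynchronous request has finished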

### Cross-check the result using assertions

The LLM may behave incorrectly. A better practice is to check its result after running an operation.
Expand All @@ -48,12 +60,6 @@ expect(taskList.length).toBe(1);
expect(taskList[0]).toBe('后天学习 AI');
```

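### Infer information from the interface, not from DOM properties

All the data passed to the LLM consists of screenshots and element coordinates. The DOM is almost invisible to the LLM. Therefore, do not expect the LLM to infer any information from the DOM (such as `test-id-*` properties).

Make sure the information you want to extract appears in the screenshot and can be "seen" by the LLM.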

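### Both Chinese and English prompts work

Since most AI models understand many languages, feel free to write prompts in whichever language you prefer. It usually works even if the prompt language differs from the page's language.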
2 changes: 2 additions & 0 deletions apps/site/docs/zh/docs/usage/cache.md
Expand Up @@ -15,6 +15,8 @@ Midscene.js 提供了 AI 缓存能力,用于提升整个 AI 执行过程的稳

**Effect**

After enabling the cache, test case execution time drops significantly, for example from 1m16s to 23s.
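
To put the quoted timings (1m16s without cache, 23s with cache) in perspective, the short calculation below converts them to seconds and derives the relative reduction and the speedup factor. The helper name `toSeconds` is an assumption for this sketch, and the two durations are the only figures taken from the docs.

```typescript
// Convert a minutes/seconds pair into seconds.
function toSeconds(minutes: number, seconds: number): number {
  return minutes * 60 + seconds;
}

const withoutCache = toSeconds(1, 16); // 1m16s
const withCache = toSeconds(0, 23); // 23s

// Fraction of the original run time that the cache saved.
const reduction = (withoutCache - withCache) / withoutCache;

// How many times faster the cached run is.
const speedup = withoutCache / withCache;

console.log(`saved ${(reduction * 100).toFixed(0)}%, ${speedup.toFixed(1)}x faster`);
```

This is a single example run, so the ratio will vary with the page and the number of AI calls that hit the cache.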

* **before**

![](/cache/no-cache-time.png)
Expand Down
2 changes: 1 addition & 1 deletion package.json
@@ -1,7 +1,7 @@
{
"name": "midscene",
"private": true,
"version": "0.4.0",
"version": "0.5.1",
"scripts": {
"build:pkg": "nx run-many --target=build --projects=@midscene/core,@midscene/shared,@midscene/visualizer,@midscene/web,@midscene/cli --verbose",
"test": "nx run-many --target=test --projects=@midscene/core,--projects=@midscene/shared,@midscene/visualizer,@midscene/web,@midscene/cli --verbose",
Expand Down
2 changes: 1 addition & 1 deletion packages/cli/package.json
@@ -1,7 +1,7 @@
{
"name": "@midscene/cli",
"description": "Cli for Midscene.js",
"version": "0.4.0",
"version": "0.5.1",
"jsnext:source": "./src/index.ts",
"main": "./dist/lib/index.js",
"bin": {
Expand Down
2 changes: 1 addition & 1 deletion packages/midscene/package.json
@@ -1,7 +1,7 @@
{
"name": "@midscene/core",
"description": "Hello, It's Midscene",
"version": "0.4.0",
"version": "0.5.1",
"jsnext:source": "./src/index.ts",
"main": "./dist/lib/index.js",
"module": "./dist/es/index.js",
Expand Down
4 changes: 3 additions & 1 deletion packages/midscene/src/ai-model/common.ts
Expand Up @@ -65,7 +65,9 @@ export async function callAiFn<T>(options: {
return parseResult;
}

throw Error('Does not contain coze and openai environment variables');
throw Error(
'Cannot find Coze or OpenAI config. You should set at least one of them.',
);
}

export function transformUserMessages(msgs: ChatCompletionContentPart[]) {
Expand Down
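
The reworded error in `callAiFn` is raised when neither backend is configured. A minimal sketch of that kind of guard is shown below. It is an illustration only: the `pickProvider` helper, the environment variable names, and the order in which the providers are checked are assumptions, since the diff does not show how the surrounding code detects each config.

```typescript
type Provider = 'openai' | 'coze';

// The variable names here are placeholders, not necessarily the ones Midscene reads.
function pickProvider(env: Record<string, string | undefined>): Provider {
  if (env['OPENAI_API_KEY']) {
    return 'openai';
  }
  if (env['COZE_BOT_ID'] && env['COZE_BOT_TOKEN']) {
    return 'coze';
  }
  // Same message as the new error in the diff above.
  throw new Error(
    'Cannot find Coze or OpenAI config. You should set at least one of them.',
  );
}
```

Passing the environment in as a parameter keeps the guard easy to test without touching the real process environment.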
2 changes: 0 additions & 2 deletions packages/midscene/src/utils.ts
Expand Up @@ -145,8 +145,6 @@ export async function sleep(ms: number) {
return new Promise((resolve) => setTimeout(resolve, ms));
}

export const commonScreenshotParam = { type: 'jpeg', quality: 75 } as any;

export function replacerForPageObject(key: string, value: any) {
if (value && value.constructor?.name === 'Page') {
return '[Page object]';
Expand Down
2 changes: 1 addition & 1 deletion packages/shared/package.json
@@ -1,6 +1,6 @@
{
"name": "@midscene/shared",
"version": "0.4.0",
"version": "0.5.1",
"types": "./src/index.ts",
"main": "./dist/lib/index.js",
"module": "./dist/es/index.js",
Expand Down
2 changes: 1 addition & 1 deletion packages/visualizer/package.json
@@ -1,6 +1,6 @@
{
"name": "@midscene/visualizer",
"version": "0.4.0",
"version": "0.5.1",
"types": "./dist/types/index.d.ts",
"main": "./dist/lib/index.js",
"module": "./dist/es/index.js",
Expand Down
11 changes: 11 additions & 0 deletions packages/visualizer/scripts/build-html.ts
Expand Up @@ -22,9 +22,11 @@ const htmlPath = join(__dirname, '../html/tpl.html');
const cssPath = join(__dirname, '../dist/report/index.css');
const jsPath = join(__dirname, '../dist/report/index.js');
const demoPath = join(__dirname, './fixture/demo-dump.json');
const demoMobilePath = join(__dirname, './fixture/demo-mobile-dump.json');
const multiEntrySegment = join(__dirname, './fixture/multi-entries.html');
const outputHTML = join(__dirname, '../dist/report/index.html');
const outputDemoHTML = join(__dirname, '../dist/report/demo.html');
const outputDemoMobileHTML = join(__dirname, '../dist/report/demo-mobile.html');
const outputMultiEntriesHTML = join(__dirname, '../dist/report/multi.html');
const outputEmptyDumpHTML = join(__dirname, '../dist/report/empty-error.html');

Expand Down Expand Up @@ -75,6 +77,15 @@ function build() {
writeFileSync(outputDemoHTML, resultWithDemo);
console.log(`HTML file generated successfully: ${outputDemoHTML}`);

const demoMobileData = readFileSync(demoMobilePath, 'utf-8');
const resultWithDemoMobile = tplReplacer(html, {
css: `<style>\n${css}\n</style>\n`,
js: `<script>\n${js}\n</script>`,
dump: `<script type="midscene_web_dump" type="application/json">${demoMobileData}</script>`,
});
writeFileSync(outputDemoMobileHTML, resultWithDemoMobile);
console.log(`HTML file generated successfully: ${outputDemoMobileHTML}`);

const multiEntriesData = readFileSync(multiEntrySegment, 'utf-8');
const resultWithMultiEntries = tplReplacer(html, {
css: `<style>\n${css}\n</style>\n`,
Expand Down
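
The build script above inlines the CSS, the JS bundle, and a dump into one HTML file through `tplReplacer`, whose body is not part of this diff. The sketch below shows one way such a replacer could work. It is not the project's implementation: the `fillTemplate` name and the `{{css}}`, `{{js}}`, `{{dump}}` placeholder syntax are assumptions made for illustration.

```typescript
interface TemplateParts {
  css: string;
  js: string;
  dump: string;
}

// Replace each {{key}} placeholder with the matching part.
function fillTemplate(html: string, parts: TemplateParts): string {
  return html.replace(
    /\{\{(css|js|dump)\}\}/g,
    // A replacer function inserts its return value literally, so sequences
    // such as "$&" inside bundled JS are not treated as replacement patterns.
    (_match: string, key: string) => parts[key as keyof TemplateParts],
  );
}
```

Using a replacer function rather than a replacement string matters here because minified JavaScript often contains `$` sequences that `String.prototype.replace` would otherwise interpret.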
9,864 changes: 4,932 additions & 4,932 deletions packages/visualizer/scripts/fixture/demo-dump.json

Large diffs are not rendered by default.

26,749 changes: 26,749 additions & 0 deletions packages/visualizer/scripts/fixture/demo-mobile-dump.json

Large diffs are not rendered by default.

78 changes: 38 additions & 40 deletions packages/visualizer/src/component/blackboard.tsx
Expand Up @@ -8,21 +8,53 @@ import { colorForName, highlightColorForType } from './color';
import './blackboard.less';
import { useBlackboardPreference, useInsightDump } from './store';

const itemFillAlpha = 0.3;
const itemFillAlpha = 0.4;
const highlightAlpha = 0.7;
const bgOnAlpha = 0.8;
const bgOffAlpha = 0.3;
const noop = () => {
// noop
};

export const rectMarkForItem = (
rect: Rect,
name: string,
ifHighlight: boolean,
onPointOver?: () => void,
onPointerOut?: () => void,
) => {
const { left, top, width, height } = rect;
const themeColor = ifHighlight
? highlightColorForType('element')
: colorForName(name);
const alpha = ifHighlight ? highlightAlpha : itemFillAlpha;
const graphics = new PIXI.Graphics();
graphics.beginFill(themeColor, alpha);
graphics.lineStyle(1, themeColor, 1);
graphics.drawRect(left, top, width, height);
graphics.endFill();
if (onPointOver && onPointerOut) {
graphics.interactive = true;
graphics.on('pointerover', onPointOver);
graphics.on('pointerout', onPointerOut);
}

const nameFontSize = 18;
const texts = new PIXI.Text(name, {
fontSize: nameFontSize,
fill: 0x0,
});
texts.x = left;
texts.y = Math.max(top - (nameFontSize + 4), 0);
return [graphics, texts];
};

const BlackBoard = (): JSX.Element => {
const dump = useInsightDump((store) => store.data);
const setHighlightElements = useInsightDump(
(store) => store.setHighlightElements,
);
const highlightSectionNames = useInsightDump(
(store) => store.highlightSectionNames,
);

const highlightElements = useInsightDump((store) => store.highlightElements);
const highlightIds = highlightElements.map((e) => e.id);

Expand Down Expand Up @@ -107,34 +139,6 @@ const BlackBoard = (): JSX.Element => {
};
}, [app.stage, appInitialed]);

const rectMarkForItem = (
rect: Rect,
name: string,
themeColor: string,
alpha: number,
onPointOver: () => void,
onPointerOut: () => void,
) => {
const { left, top, width, height } = rect;
const graphics = new PIXI.Graphics();
graphics.beginFill(themeColor, alpha);
graphics.lineStyle(1, themeColor, 1);
graphics.drawRect(left, top, width, height);
graphics.endFill();
graphics.interactive = true;
graphics.on('pointerover', onPointOver);
graphics.on('pointerout', onPointerOut);

const nameFontSize = 18;
const texts = new PIXI.Text(name, {
fontSize: nameFontSize,
fill: 0x0,
});
texts.x = left;
texts.y = Math.max(top - (nameFontSize + 4), 0);
return [graphics, texts];
};

const { highlightElementRects } = useMemo(() => {
const highlightElementRects: Rect[] = [];

Expand All @@ -149,10 +153,7 @@ const BlackBoard = (): JSX.Element => {
const [graphics] = rectMarkForItem(
rect,
content,
ifHighlight
? highlightColorForType('element')
: colorForName('element', content),
ifHighlight ? 1 : itemFillAlpha,
ifHighlight,
noop,
noop,
);
Expand All @@ -162,10 +163,7 @@ const BlackBoard = (): JSX.Element => {
const [graphics] = rectMarkForItem(
rect,
content,
ifHighlight
? highlightColorForType('element')
: colorForName('element', content),
ifHighlight ? 1 : itemFillAlpha,
ifHighlight,
() => {
setHighlightElements([element]);
},
Expand Down
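
After this change `rectMarkForItem` takes a single `ifHighlight` flag and derives the color and alpha itself, and it only makes the rectangle interactive when both pointer handlers are passed. The sketch below isolates that decision logic without PIXI so it can be read on its own. The constants are copied from the diff; the `markStyle` helper and the `MarkStyle` type are assumptions introduced for this example.

```typescript
// Values as they appear in blackboard.tsx and color.tsx after this commit.
const itemFillAlpha = 0.4;
const highlightAlpha = 0.7;
const elementColor = '#01204E';
const highlightColorForElement = '#F56824';

interface MarkStyle {
  color: string;
  alpha: number;
  interactive: boolean;
}

// Hypothetical helper mirroring the choices made inside rectMarkForItem.
function markStyle(
  ifHighlight: boolean,
  onPointerOver?: () => void,
  onPointerOut?: () => void,
): MarkStyle {
  return {
    color: ifHighlight ? highlightColorForElement : elementColor,
    alpha: ifHighlight ? highlightAlpha : itemFillAlpha,
    // Pointer events are wired only when both handlers are supplied.
    interactive: Boolean(onPointerOver && onPointerOut),
  };
}
```

Moving the choice into the function means call sites pass one boolean instead of repeating the same two ternaries.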
17 changes: 5 additions & 12 deletions packages/visualizer/src/component/color.tsx
@@ -1,12 +1,13 @@
// https://coolors.co/palettes/popular/#01204e
const sectionColor = ['#028391'];
const elementColor = ['#fb6107'];
// const elementColor = ['#fb6107'];
const elementColor = ['#01204E'];
const highlightColorForSection = '#01204E';
const highlightColorForElement = '#F56824'; // @main-orange

function djb2Hash(str?: string): number {
if (!str) {
console.warn('djb2Hash: empty string');
// console.warn('djb2Hash: empty string');
str = 'unnamed';
}
let hash = 5381;
Expand All @@ -16,20 +17,12 @@ function djb2Hash(str?: string): number {
return hash >>> 0; // Convert to unsigned 32
}

export function colorForName(
type: 'section' | 'element',
name: string,
): string {
export function colorForName(name: string): string {
const hashNumber = djb2Hash(name);
if (type === 'section') {
return sectionColor[hashNumber % sectionColor.length];
}
return elementColor[hashNumber % elementColor.length];
}

export function highlightColorForType(type: 'section' | 'element'): string {
if (type === 'section') {
return highlightColorForSection;
}
// return highlightColorForSection;
return highlightColorForElement;
}
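
`colorForName` maps a name to a palette entry by hashing it, so the same name always gets the same color. The loop body of `djb2Hash` is collapsed in this diff, so the sketch below uses the common djb2 form (`hash * 33 + charCode`) as an assumption rather than the project's exact code. The `palette` parameter is also added here for illustration.

```typescript
// Common djb2 variant; the project's loop body is not visible in the diff.
function djb2Hash(str?: string): number {
  const input = str ? str : 'unnamed'; // same fallback as in the diff
  let hash = 5381;
  for (let i = 0; i < input.length; i++) {
    // hash * 33 + char code, kept within 32 bits.
    hash = ((hash << 5) + hash + input.charCodeAt(i)) | 0;
  }
  return hash >>> 0; // convert to unsigned 32-bit
}

// Pick a stable palette entry for a name.
function colorForName(name: string, palette: string[] = ['#01204E']): string {
  return palette[djb2Hash(name) % palette.length];
}
```

With the single-entry palette from this commit every name resolves to `#01204E`; the hash only starts to matter once more colors are added.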
10 changes: 10 additions & 0 deletions packages/visualizer/src/component/detail-panel.less
Expand Up @@ -6,9 +6,15 @@
display: flex;
flex-direction: column;
height: 100%;
box-sizing: border-box;
padding: @layout-space;
background: #FFF;

.scrollable {
height: 100%;
overflow: auto;
}

.ant-segmented{
padding: 0;
}
Expand All @@ -24,6 +30,8 @@
flex-direction: column;
display: flex;
flex-grow: 1;
height: 100%;
overflow: hidden;
}

.blackboard {
Expand All @@ -48,6 +56,8 @@
img {
border: 1px solid @heavy-border-color;
max-width: 100%;
max-height: 720px;
box-sizing: border-box;
}
}

Expand Down