Commit 378e526

refactor: remove outdated browser automation guide and add comprehensive browser automation rules documentation
- Deleted the obsolete browser-automation-guide.mdc file.
- Introduced a new browser-automation.rules.mdc file containing detailed instructions and examples for using the browser-use CLI tool, including key commands, common options, and provider-model references.
- Updated the documentation to enhance clarity and usability for users performing browser automation tasks.
1 parent e720f7e commit 378e526

3 files changed, +82 -26 lines changed

.cursor/rules/browser-automation-guide.mdc

Lines changed: 0 additions & 1 deletion
This file was deleted.

browser-automation.rules.mdc

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
---
description: Guide for using the browser-use CLI tool. Request when performing browser automation tasks.
globs:
alwaysApply: false
---
# Browser Automation (browser-use) Guide
# Last Updated: 2025-03-31 10:13:02 AM

## Browser Automation with browser-use

The browser-use CLI tool allows you to control a browser using natural language instructions.

### Key Commands

1. **Start a browser session**:

```bash
browser-use start
```

2. **Run a task in an existing browser**:

```bash
browser-use run "your task instruction" --url "https://example.com"
```

3. **Close the browser session**:

```bash
browser-use close
```
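
Editorial sketch, not part of the committed file: the three commands above form a start/run/close lifecycle, and one plausible way to make sure the session is always closed is to wrap them in a small shell function. `BROWSER_USE_BIN`, `DRY_RUN`, `bu`, and `run_session` are hypothetical names introduced only for this illustration; the sketch assumes nothing about the CLI beyond the three subcommands and the `--url` flag shown in the guide.

```bash
# Sketch only. BROWSER_USE_BIN and DRY_RUN are hypothetical names for this
# example; they are not options of the browser-use CLI.
BROWSER_USE_BIN="${BROWSER_USE_BIN:-browser-use}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print each command instead of executing it

# Run (or, in dry-run mode, just print) one browser-use command.
bu() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$BROWSER_USE_BIN $*"
  else
    "$BROWSER_USE_BIN" "$@"
  fi
}

# Start a session, run one task, and close the session even if the task fails.
run_session() {
  task="$1"
  url="$2"
  rc=0
  bu start || return $?
  bu run "$task" --url "$url" || rc=$?
  bu close
  return "$rc"
}
```

With `DRY_RUN=1` (the default in this sketch), `run_session "search for OpenAI" "https://www.google.com"` only prints the three commands in order; setting `DRY_RUN=0` would execute them.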

### Common Options

- `--provider` - Choose LLM provider: `Deepseek` (default), `Google`, `OpenAI`, `Anthropic`
- `--vision` - Enable vision capabilities (automatically selects vision-capable models)
- `--headless` - Run browser in headless mode
- `--record` - Enable session recording
- `--record-path` - Path to save recordings
- `--max-steps` - Maximum number of steps per task (default: 10)
- `--max-actions` - Maximum actions per step (default: 1)
- `--add-info` - Additional context for the agent
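
Editorial sketch, not part of the committed file: the options above can be combined on a single `run` invocation. The helper below assembles such an argument list from a few settings and prints one argument per line, without executing anything. The function name `print_run_command` and the variables `HEADLESS`, `MAX_STEPS`, and `ADD_INFO` are hypothetical, and it is an assumption that `--max-steps` and `--add-info` take their value as the following argument, in the same way `--record-path` does in the guide's examples.

```bash
# Sketch only: HEADLESS, MAX_STEPS and ADD_INFO are hypothetical settings
# invented for this example; only the flags come from the guide.
print_run_command() {
  task="$1"
  url="$2"
  # Rebuild the positional parameters as the argument list.
  set -- browser-use run "$task" --url "$url"
  if [ "${HEADLESS:-0}" = "1" ]; then
    set -- "$@" --headless
  fi
  if [ -n "${MAX_STEPS:-}" ]; then
    set -- "$@" --max-steps "$MAX_STEPS"   # assumed to take a separate value
  fi
  if [ -n "${ADD_INFO:-}" ]; then
    set -- "$@" --add-info "$ADD_INFO"     # assumed to take a separate value
  fi
  # One argument per line keeps arguments containing spaces unambiguous.
  printf '%s\n' "$@"
}
```

Printing one argument per line is a deliberate choice here: it shows exactly where a multi-word task string begins and ends before anything is run.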

### Examples

1. **Basic browsing task**:

```bash
browser-use run "search for OpenAI" --url "https://www.google.com"
```

2. **Visual analysis with Google Gemini**:

```bash
browser-use run "analyze the visual layout" --url "https://www.openai.com" --provider Google --vision
```

3. **Complex analysis with OpenAI**:

```bash
browser-use run "analyze the layout and design" --url "https://www.example.com" --provider OpenAI --vision
```

4. **Debugging with recording**:

```bash
browser-use run "test the login process" --url "https://example.com" --record --record-path ./debug_session
```

### Provider-Model Reference

- **Deepseek**: `deepseek-chat`
- **Google**: `gemini-1.5-pro` (default), `gemini-2.0-flash` (model-index 1)
- **OpenAI**: `gpt-4o` (vision-capable)
- **Anthropic**: `claude-3-5-sonnet-latest` (default), `claude-3-5-sonnet-20241022` (model-index 1)
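
Editorial sketch, not part of the committed file: the reference list above can be read as a simple provider-to-default-model lookup. The function below only restates that list as a `case` statement; it does not describe how the CLI itself resolves models. `default_model` is a hypothetical name, and matching provider names exactly as spelled in the list is an assumption.

```bash
# Sketch only: a plain lookup of the first model listed for each provider.
# Provider names are matched exactly as written in the reference list.
default_model() {
  case "$1" in
    Deepseek)  echo "deepseek-chat" ;;
    Google)    echo "gemini-1.5-pro" ;;
    OpenAI)    echo "gpt-4o" ;;
    Anthropic) echo "claude-3-5-sonnet-latest" ;;
    *)
      echo "unknown provider: $1" >&2
      return 1
      ;;
  esac
}
```

For example, `default_model Google` prints `gemini-1.5-pro`, and an unlisted name returns a non-zero status with a message on stderr.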

### Tips

- Always include the full URL with protocol (https://)
- For complex tasks, increase the `--max-steps` parameter
- Use `--add-info` to provide additional context not covered in the main prompt
- Enable `--vision` when visual analysis is needed
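
Editorial sketch, not part of the committed file: the first tip can be checked mechanically before a URL is passed to `--url`. The helper below is a minimal pre-flight test; `has_protocol` is a hypothetical name, and accepting `http://` in addition to `https://` is an assumption, since the tip only names `https://`.

```bash
# Sketch only: succeed when the argument starts with an explicit protocol.
# Accepting http:// as well as https:// is an assumption of this example.
has_protocol() {
  case "$1" in
    http://*|https://*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example guard: refuse to continue when the protocol is missing.
url="https://example.com"
if has_protocol "$url"; then
  echo "ok: $url"
else
  echo "missing protocol: $url" >&2
fi
```

A value such as `example.com` or `www.example.com` fails the check, which is the case the tip warns about.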

app/projects/page.tsx

Lines changed: 0 additions & 25 deletions
This file was deleted.
