Commit 0b35e40

docs: add StagehandTool documentation and improve MDX structure (#2842)
1 parent 49bbf3f commit 0b35e40

9 files changed: +245 -14 lines changed

docs/docs.json

Lines changed: 1 addition & 0 deletions
@@ -129,6 +129,7 @@
 "tools/seleniumscrapingtool",
 "tools/snowflakesearchtool",
 "tools/spidertool",
+"tools/stagehandtool",
 "tools/txtsearchtool",
 "tools/visiontool",
 "tools/weaviatevectorsearchtool",

docs/guides/advanced/customizing-prompts.mdx

Lines changed: 0 additions & 2 deletions
@@ -4,8 +4,6 @@ description: Dive deeper into low-level prompt customization for CrewAI, enablin
 icon: message-pen
 ---
 
-# Customizing Prompts at a Low Level
-
 ## Why Customize Prompts?
 
 Although CrewAI's default prompts work well for many scenarios, low-level customization opens the door to significantly more flexible and powerful agent behavior. Here’s why you might want to take advantage of this deeper control:

docs/guides/advanced/fingerprinting.mdx

Lines changed: 0 additions & 2 deletions
@@ -4,8 +4,6 @@ description: Learn how to use CrewAI's fingerprinting system to uniquely identif
 icon: fingerprint
 ---
 
-# Fingerprinting in CrewAI
-
 ## Overview
 
 Fingerprints in CrewAI provide a way to uniquely identify and track components throughout their lifecycle. Each `Agent`, `Crew`, and `Task` automatically receives a unique fingerprint when created, which cannot be manually overridden.

docs/guides/agents/crafting-effective-agents.mdx

Lines changed: 0 additions & 2 deletions
@@ -4,8 +4,6 @@ description: Learn best practices for designing powerful, specialized AI agents
 icon: robot
 ---
 
-# Crafting Effective Agents
-
 ## The Art and Science of Agent Design
 
 At the heart of CrewAI lies the agent - a specialized AI entity designed to perform specific roles within a collaborative framework. While creating basic agents is simple, crafting truly effective agents that produce exceptional results requires understanding key design principles and best practices.

docs/guides/concepts/evaluating-use-cases.mdx

Lines changed: 0 additions & 2 deletions
@@ -4,8 +4,6 @@ description: Learn how to assess your AI application needs and choose the right
 icon: scale-balanced
 ---
 
-# Evaluating Use Cases for CrewAI
-
 ## Understanding the Decision Framework
 
 When building AI applications with CrewAI, one of the most important decisions you'll make is choosing the right approach for your specific use case. Should you use a Crew? A Flow? A combination of both? This guide will help you evaluate your requirements and make informed architectural decisions.

docs/guides/crews/first-crew.mdx

Lines changed: 0 additions & 2 deletions
@@ -4,8 +4,6 @@ description: Step-by-step tutorial to create a collaborative AI team that works
 icon: users-gear
 ---
 
-# Build Your First Crew
-
 ## Unleashing the Power of Collaborative AI
 
 Imagine having a team of specialized AI agents working together seamlessly to solve complex problems, each contributing their unique skills to achieve a common goal. This is the power of CrewAI - a framework that enables you to create collaborative AI systems that can accomplish tasks far beyond what a single AI could achieve alone.

docs/guides/flows/first-flow.mdx

Lines changed: 0 additions & 2 deletions
@@ -4,8 +4,6 @@ description: Learn how to create structured, event-driven workflows with precise
 icon: diagram-project
 ---
 
-# Build Your First Flow
-
 ## Taking Control of AI Workflows with Flows
 
 CrewAI Flows represent the next level in AI orchestration - combining the collaborative power of AI agent crews with the precision and flexibility of procedural programming. While crews excel at agent collaboration, flows give you fine-grained control over exactly how and when different components of your AI system interact.

docs/guides/flows/mastering-flow-state.mdx

Lines changed: 0 additions & 2 deletions
@@ -4,8 +4,6 @@ description: A comprehensive guide to managing, persisting, and leveraging state
 icon: diagram-project
 ---
 
-# Mastering Flow State Management
-
 ## Understanding the Power of State in Flows
 
 State management is the backbone of any sophisticated AI workflow. In CrewAI Flows, the state system allows you to maintain context, share data between steps, and build complex application logic. Mastering state management is essential for creating reliable, maintainable, and powerful AI applications.

docs/tools/stagehandtool.mdx

Lines changed: 244 additions & 0 deletions
@@ -0,0 +1,244 @@
---
title: Stagehand Tool
description: Web automation tool that integrates Stagehand with CrewAI for browser interaction and automation
icon: hand
---


# Overview

The `StagehandTool` integrates the [Stagehand](https://docs.stagehand.dev/get_started/introduction) framework with CrewAI, enabling agents to interact with websites and automate browser tasks using natural language instructions.

## Overview

Stagehand is a powerful browser automation framework built by Browserbase that allows AI agents to:

- Navigate to websites
- Click buttons, links, and other elements
- Fill in forms
- Extract data from web pages
- Observe and identify elements
- Perform complex workflows

The StagehandTool wraps the Stagehand Python SDK to provide CrewAI agents with browser control capabilities through three core primitives:

1. **Act**: Perform actions like clicking, typing, or navigating
2. **Extract**: Extract structured data from web pages
3. **Observe**: Identify and analyze elements on the page

## Prerequisites

Before using this tool, ensure you have:

1. A [Browserbase](https://www.browserbase.com/) account with API key and project ID
2. An API key for an LLM (OpenAI or Anthropic Claude)
3. The Stagehand Python SDK installed

Install the required dependency:

```bash
pip install stagehand-py
```

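The three credentials listed in the prerequisites do not have to be hardcoded. The following is a minimal sketch, not part of the tool itself, that reads them from environment variables and fails fast when one is missing. The variable names (`BROWSERBASE_API_KEY`, `BROWSERBASE_PROJECT_ID`, `MODEL_API_KEY`) and the helper `load_stagehand_credentials` are assumptions chosen for illustration.

```python
import os
from typing import Mapping

# Hypothetical variable names; use whatever your deployment defines.
REQUIRED_VARS = ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID", "MODEL_API_KEY")


def load_stagehand_credentials(env: Mapping[str, str] = os.environ) -> dict:
    """Return the credential keyword arguments, or raise if any are missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "api_key": env["BROWSERBASE_API_KEY"],
        "project_id": env["BROWSERBASE_PROJECT_ID"],
        "model_api_key": env["MODEL_API_KEY"],
    }
```

The returned dictionary uses the same keyword names as the constructor examples in this page, so it could be unpacked as `StagehandTool(**load_stagehand_credentials())`.
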
## Usage

### Basic Implementation

The StagehandTool can be implemented in two ways:

#### 1. Using Context Manager (Recommended)
<Tip>
The context manager approach is recommended as it ensures proper cleanup of resources even if exceptions occur.
</Tip>

```python
from crewai import Agent, Task, Crew
from crewai_tools import StagehandTool
from stagehand.schemas import AvailableModel

# Initialize the tool with your API keys using a context manager
with StagehandTool(
    api_key="your-browserbase-api-key",
    project_id="your-browserbase-project-id",
    model_api_key="your-llm-api-key",  # OpenAI or Anthropic API key
    model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST,  # Optional: specify which model to use
) as stagehand_tool:
    # Create an agent with the tool
    researcher = Agent(
        role="Web Researcher",
        goal="Find and summarize information from websites",
        backstory="I'm an expert at finding information online.",
        verbose=True,
        tools=[stagehand_tool],
    )

    # Create a task that uses the tool
    research_task = Task(
        description="Go to https://www.example.com and tell me what you see on the homepage.",
        agent=researcher,
    )

    # Run the crew
    crew = Crew(
        agents=[researcher],
        tasks=[research_task],
        verbose=True,
    )

    result = crew.kickoff()
    print(result)
```

#### 2. Manual Resource Management

```python
from crewai import Agent, Task, Crew
from crewai_tools import StagehandTool
from stagehand.schemas import AvailableModel

# Initialize the tool with your API keys
stagehand_tool = StagehandTool(
    api_key="your-browserbase-api-key",
    project_id="your-browserbase-project-id",
    model_api_key="your-llm-api-key",
    model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST,
)

try:
    # Create an agent with the tool
    researcher = Agent(
        role="Web Researcher",
        goal="Find and summarize information from websites",
        backstory="I'm an expert at finding information online.",
        verbose=True,
        tools=[stagehand_tool],
    )

    # Create a task that uses the tool
    research_task = Task(
        description="Go to https://www.example.com and tell me what you see on the homepage.",
        agent=researcher,
    )

    # Run the crew
    crew = Crew(
        agents=[researcher],
        tasks=[research_task],
        verbose=True,
    )

    result = crew.kickoff()
    print(result)
finally:
    # Explicitly clean up resources
    stagehand_tool.close()
```

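For any object that exposes a `close()` method, the standard library's `contextlib.closing` gives the same cleanup guarantee as the `try`/`finally` pattern with less nesting. The sketch below demonstrates the mechanism on a stand-in class; `DemoResource` and `use_and_close` are hypothetical names for illustration and do not exercise the real `StagehandTool`.

```python
from contextlib import closing


class DemoResource:
    """Stand-in for an object with close(); not the real StagehandTool."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def use_and_close(resource: DemoResource, fail: bool = False) -> None:
    """Use the resource inside closing(), which calls close() on exit."""
    with closing(resource):
        if fail:
            # Simulates an exception raised while the crew is running
            raise RuntimeError("simulated failure during the run")
```

Whether the body completes or raises, `close()` runs when the `with` block exits, which is the behavior the `finally` clause provides in the manual version.
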
## Command Types

The StagehandTool supports three different command types for specific web automation tasks:

### 1. Act Command

The `act` command type (default) enables webpage interactions like clicking buttons, filling forms, and navigation.

```python
# Perform an action (default behavior)
result = stagehand_tool.run(
    instruction="Click the login button",
    url="https://example.com",
    command_type="act"  # Default, so can be omitted
)

# Fill out a form
result = stagehand_tool.run(
    instruction="Fill the contact form with name 'John Doe', email 'john@example.com', and message 'Hello world'",
    url="https://example.com/contact"
)
```

### 2. Extract Command

The `extract` command type retrieves structured data from webpages.

```python
# Extract all product information
result = stagehand_tool.run(
    instruction="Extract all product names, prices, and descriptions",
    url="https://example.com/products",
    command_type="extract"
)

# Extract specific information with a selector
result = stagehand_tool.run(
    instruction="Extract the main article title and content",
    url="https://example.com/blog/article",
    command_type="extract",
    selector=".article-container"  # Optional CSS selector
)
```

### 3. Observe Command

The `observe` command type identifies and analyzes webpage elements.

```python
# Find interactive elements
result = stagehand_tool.run(
    instruction="Find all interactive elements in the navigation menu",
    url="https://example.com",
    command_type="observe"
)

# Identify form fields
result = stagehand_tool.run(
    instruction="Identify all the input fields in the registration form",
    url="https://example.com/register",
    command_type="observe",
    selector="#registration-form"
)
```

## Configuration Options

Customize the StagehandTool behavior with these parameters:

```python
stagehand_tool = StagehandTool(
    api_key="your-browserbase-api-key",
    project_id="your-browserbase-project-id",
    model_api_key="your-llm-api-key",
    model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST,
    dom_settle_timeout_ms=5000,  # Wait longer for DOM to settle
    headless=True,  # Run browser in headless mode
    self_heal=True,  # Attempt to recover from errors
    wait_for_captcha_solves=True,  # Wait for CAPTCHA solving
    verbose=1,  # Control logging verbosity (0-3)
)
```

## Best Practices

1. **Be Specific**: Provide detailed instructions for better results
2. **Choose Appropriate Command Type**: Select the right command type for your task
3. **Use Selectors**: Leverage CSS selectors to improve accuracy
4. **Break Down Complex Tasks**: Split complex workflows into multiple tool calls
5. **Implement Error Handling**: Add error handling for potential issues

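The error-handling advice can be sketched as a small retry wrapper around a single tool call. This is one possible approach under stated assumptions, not something the tool provides: `run_with_retry` is a hypothetical helper, it retries on any exception, and the linear backoff is an arbitrary choice.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retry(call: Callable[[], T], attempts: int = 3, delay_s: float = 1.0) -> T:
    """Invoke `call`, retrying on any exception up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == attempts:
                raise  # Out of retries: surface the original error
            time.sleep(delay_s * attempt)  # Linear backoff between attempts
    raise AssertionError("unreachable")
```

Wrapping each step separately, for example `run_with_retry(lambda: stagehand_tool.run(instruction="Click the login button", url="https://example.com"))`, also keeps complex workflows split into individual tool calls.
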
## Troubleshooting


Common issues and solutions:

- **Session Issues**: Verify API keys for both Browserbase and LLM provider
- **Element Not Found**: Increase `dom_settle_timeout_ms` for slower pages
- **Action Failures**: Use `observe` to identify correct elements first
- **Incomplete Data**: Refine instructions or provide specific selectors


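The "observe first" suggestion for action failures can be sketched as a helper that issues an `observe` command before the `act` command on the same page. It assumes only the `run(instruction=..., url=..., command_type=...)` call shape used in the examples on this page; `SupportsRun` and `observe_then_act` are hypothetical names, and how the observation is used to refine the action is left to the caller.

```python
from typing import Any, Protocol


class SupportsRun(Protocol):
    """Anything with the run(...) call shape used in this page's examples."""

    def run(self, *, instruction: str, url: str, command_type: str = "act") -> Any: ...


def observe_then_act(tool: SupportsRun, url: str, target: str, action: str) -> dict:
    """Run an observe command for `target`, then an act command on the same page."""
    observed = tool.run(
        instruction=f"Find {target}",
        url=url,
        command_type="observe",
    )
    acted = tool.run(
        instruction=f"{action} {target}",
        url=url,
        command_type="act",
    )
    return {"observed": observed, "acted": acted}
```

Returning both results lets the caller inspect what `observe` reported when the action still fails, then reword the instruction or add a selector.
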
## Additional Resources

For questions about the CrewAI integration:
- Join Stagehand's [Slack community](https://stagehand.dev/slack)
- Open an issue in the [Stagehand repository](https://github.com/browserbase/stagehand)
- Visit [Stagehand documentation](https://docs.stagehand.dev/)
