This is a project that uses Stagehand V3, a browser automation framework with AI-powered act, extract, observe, and agent methods.
The main class can be imported as Stagehand from @browserbasehq/stagehand.
Key Classes:
- Stagehand: Main orchestrator class providing act, extract, observe, and agent methods
- context: A V3Context object that manages browser contexts and pages
- page: Individual page objects accessed via stagehand.context.pages()[i] or created with stagehand.context.newPage()
import { Stagehand } from "@browserbasehq/stagehand";
const stagehand = new Stagehand({
env: "LOCAL", // or "BROWSERBASE"
verbose: 2, // 0, 1, or 2
model: "openai/gpt-4.1-mini", // or any supported model
});
await stagehand.init();
// Access the browser context and pages
const page = stagehand.context.pages()[0];
const context = stagehand.context;
// Create new pages if needed
const page2 = await stagehand.context.newPage();

Actions are called on the stagehand instance (not the page). Use atomic, specific instructions:
// Act on the current active page
await stagehand.act("click the sign in button");
// Act on a specific page (when you need to target a page that isn't currently active)
await stagehand.act("click the sign in button", { page: page2 });

Important: Act instructions should be atomic and specific:
- ✅ Good: "Click the sign in button" or "Type 'hello' into the search input"
- ❌ Bad: "Order me pizza" or "Type in the search bar and hit enter" (multi-step)
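
The multi-step case above can be caught mechanically before an instruction reaches act. The following is a rough sketch, not part of Stagehand: the name looksAtomic and its pattern list are assumptions made for illustration. It only detects chained steps (such as "and hit enter"); it cannot detect vague goals like "Order me pizza".

```typescript
// Hypothetical pre-flight check; not part of the Stagehand API.
// Each pattern is a hint that an instruction bundles more than one step.
const CHAINING_PATTERNS: RegExp[] = [
  /\band\s+(then\s+)?(click|type|press|hit|select|scroll|fill|submit|open|go)\b/i,
  /\bthen\b/i,
  /;/,
];

// Returns true when no chaining hint is found in the instruction.
function looksAtomic(instruction: string): boolean {
  const trimmed = instruction.trim();
  if (trimmed.length === 0) return false;
  return !CHAINING_PATTERNS.some((pattern) => pattern.test(trimmed));
}
```

An instruction that fails this check is a candidate for splitting into several act calls, or for handing to the agent method instead.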
Cache the results of observe to avoid unexpected DOM changes:
const instruction = "Click the sign in button";
// Get candidate actions
const actions = await stagehand.observe(instruction);
// Execute the first action
await stagehand.act(actions[0]);

To target a specific page:
const actions = await stagehand.observe("select blue as the favorite color", {
page: page2,
});
await stagehand.act(actions[0], { page: page2 });

Extract data from pages using natural language instructions. The extract method is called on the stagehand instance.
import { z } from "zod";
// Extract with explicit schema
const data = await stagehand.extract(
"extract all apartment listings with prices and addresses",
z.object({
listings: z.array(
z.object({
price: z.string(),
address: z.string(),
}),
),
}),
);
console.log(data.listings);

// Without a schema, extract returns an object with an 'extraction' field
const result = await stagehand.extract("extract the sign in button text");
console.log(result);
// Output: { extraction: "Sign in" }
// Or destructure directly
const { extraction } = await stagehand.extract(
"extract the sign in button text",
);
console.log(extraction); // "Sign in"

Extract data from a specific element using a selector:
const reason = await stagehand.extract(
"extract the reason why script injection fails",
z.string(),
{ selector: "/html/body/div[2]/div[3]/iframe/html/body/p[2]" },
);

When extracting links or URLs, use z.string().url():
const { links } = await stagehand.extract(
"extract all navigation links",
z.object({
links: z.array(z.string().url()),
}),
);

// Extract from a specific page (when you need to target a page that isn't currently active)
const data = await stagehand.extract(
"extract the placeholder text on the name field",
{ page: page2 },
);

Use observe to plan actions before executing them. It returns an array of candidate actions:
// Get candidate actions on the current active page
const [action] = await stagehand.observe("Click the sign in button");
// Execute the action
await stagehand.act(action);

Observing on a specific page:
// Target a specific page (when you need to target a page that isn't currently active)
const actions = await stagehand.observe("find the next page button", {
page: page2,
});
await stagehand.act(actions[0], { page: page2 });

Use the agent method to autonomously execute complex, multi-step tasks.
const page = stagehand.context.pages()[0];
await page.goto("https://www.google.com");
const agent = stagehand.agent({
model: "google/gemini-2.0-flash",
executionModel: "google/gemini-2.0-flash",
});
const result = await agent.execute({
instruction: "Search for the stock price of NVDA",
maxSteps: 20,
});
console.log(result.message);

For more advanced scenarios using computer-use models:
const agent = stagehand.agent({
mode: "cua", // Enable Computer Use Agent mode
model: "anthropic/claude-sonnet-4-20250514",
// or "google/gemini-2.5-computer-use-preview-10-2025"
systemPrompt: `You are a helpful assistant that can use a web browser.
Do not ask follow up questions, the user will trust your judgement.`,
});
await agent.execute({
instruction: "Apply for a library card at the San Francisco Public Library",
maxSteps: 30,
});

// Agent with a custom model configuration (explicit API key)
const agent = stagehand.agent({
model: {
modelName: "google/gemini-2.5-computer-use-preview-10-2025",
apiKey: process.env.GEMINI_API_KEY,
},
systemPrompt: `You are a helpful assistant.`,
});

// Agent with an external tool integration (Exa search)
const agent = stagehand.agent({
integrations: [`https://mcp.exa.ai/mcp?exaApiKey=${process.env.EXA_API_KEY}`],
systemPrompt: `You have access to the Exa search tool.`,
});

Hybrid mode uses both DOM-based and coordinate-based tools (act, click, type, dragAndDrop) for visual interactions. This requires experimental: true and models that support reliable coordinate-based actions.
Recommended models for hybrid mode:
- google/gemini-3-flash-preview
- anthropic/claude-sonnet-4-20250514, anthropic/claude-sonnet-4-5-20250929, anthropic/claude-haiku-4-5-20251001
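
The two hybrid-mode requirements stated above (experimental: true and a suitable model) can be checked before creating an agent. This is a minimal sketch under stated assumptions: checkHybridSetup and the HybridSetup shape are hypothetical names introduced here, not part of Stagehand, and the model list is simply copied from the recommendations above.

```typescript
// Model names copied from the recommended list above. The helper itself is a
// hypothetical convenience for this document, not part of the Stagehand API.
const RECOMMENDED_HYBRID_MODELS: readonly string[] = [
  "google/gemini-3-flash-preview",
  "anthropic/claude-sonnet-4-20250514",
  "anthropic/claude-sonnet-4-5-20250929",
  "anthropic/claude-haiku-4-5-20251001",
];

interface HybridSetup {
  experimental?: boolean; // value passed to the Stagehand constructor
  model: string; // value passed to stagehand.agent()
}

// Returns human-readable problems; an empty array means the setup matches
// the requirements described in this document.
function checkHybridSetup(setup: HybridSetup): string[] {
  const problems: string[] = [];
  if (setup.experimental !== true) {
    problems.push("hybrid mode requires experimental: true");
  }
  if (RECOMMENDED_HYBRID_MODELS.indexOf(setup.model) === -1) {
    problems.push(`${setup.model} is not on the recommended list for hybrid mode`);
  }
  return problems;
}
```

A model missing from the list is reported as a warning rather than a hard failure, since the list is a recommendation and other grounding-capable models may also work.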
const stagehand = new Stagehand({
env: "LOCAL",
experimental: true, // Required for hybrid mode
});
await stagehand.init();
const agent = stagehand.agent({
mode: "hybrid",
model: "google/gemini-3-flash-preview",
});
await agent.execute({
instruction: "Click the submit button and fill the form",
maxSteps: 20,
highlightCursor: true, // Enabled by default in hybrid mode
});

Agent modes:
- "dom" (default): Uses DOM-based tools (act, fillForm) - works with any model
- "hybrid": Uses both DOM-based and coordinate-based tools (act, click, type, dragAndDrop) - requires grounding-capable models
- "cua": Uses Computer Use Agent providers
Target specific elements across shadow DOM and iframes:
await page
.deepLocator("/html/body/div[2]/div[3]/iframe/html/body/p")
.highlight({
durationMs: 5000,
contentColor: { r: 255, g: 0, b: 0 },
});

// Working with multiple pages
const page1 = stagehand.context.pages()[0];
await page1.goto("https://example.com");
const page2 = await stagehand.context.newPage();
await page2.goto("https://example2.com");
// Act/extract/observe operate on the current active page by default
// Pass { page } option to target a specific page
await stagehand.act("click button", { page: page1 });
await stagehand.extract("get title", { page: page2 });
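
The observe-then-act pattern and the { page } option shown in this document can be combined into one small helper. The sketch below is illustrative only: StagehandLike, PageLike, ActionLike, and observeThenAct are hypothetical names, and the interface is a minimal structural assumption that mirrors the calls used above rather than Stagehand's real type definitions. It also guards against observe returning no candidates, which indexing with actions[0] would not.

```typescript
// Minimal structural types for this sketch only. They are assumptions that
// mirror the calls shown in this document, not Stagehand's real types.
type PageLike = object;
type ActionLike = object;

interface StagehandLike {
  observe(instruction: string, options?: { page?: PageLike }): Promise<ActionLike[]>;
  act(action: ActionLike, options?: { page?: PageLike }): Promise<unknown>;
}

// Observe once, then execute the first candidate on the same page.
// Returns false instead of throwing when observe finds no candidates.
async function observeThenAct(
  stagehand: StagehandLike,
  instruction: string,
  page?: PageLike,
): Promise<boolean> {
  // Reuse the same options object so observe and act target the same page.
  const options = page ? { page } : undefined;
  const [first] = await stagehand.observe(instruction, options);
  if (first === undefined) return false;
  await stagehand.act(first, options);
  return true;
}
```

Assuming the real instance is structurally compatible with StagehandLike, a call might look like `await observeThenAct(stagehand, "click the sign in button", page2)`. Omitting the page argument leaves both calls on the current active page.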