Agentic Browser is an agent-driven system that lets you control a browser using plain English. Built on the PydanticAI Python agent framework, it handles everything from filling out forms and searching e-commerce sites to pulling content, interacting with media, and managing projects across different platforms — all without writing a single line of automation code.
- Web Research and Analysis: Runs intelligent research across academic papers, travel sites, and code repositories using natural language queries.
- Data Extraction: Collects and compiles structured data — sports stats, historical records, stock prices, currency rates, and more.
- E-commerce Information: Scrapes product details like pricing, specifications, and availability across major e-commerce platforms.
- Web Traversal: Navigates across domains with context-aware logic, correlating data as it moves between sites.
Three specialized agents work together to handle every task:
- Planner Agent: The strategist. It breaks your request into clear, executable steps and adapts the plan as things progress.
- Browser Agent: The executor. It interacts directly with web pages (clicking, typing, navigating, and extracting information) using browser automation tools.
- Critique Agent: The quality controller. It reviews what happened, checks the results, and decides whether the task is done or needs another pass.
Together, they run a continuous feedback loop until the task is complete.
- The Planner Agent receives your request
  - Breaks down what needs to happen
  - Produces a step-by-step execution plan
  - Identifies the first action to take
- The Browser Agent picks up the current step
  - Carries out precise browser actions: navigation, clicks, text entry
  - Uses DOM inspection and screenshot analysis as needed
  - Reports back with results
- The Critique Agent reviews what the Browser Agent did
  - Checks screenshots and DOM changes to verify success
  - Decides the next move:
    - Return results to the user if the task is done
    - Move on to the next step in the plan
    - Ask the Planner Agent to revise the plan if something went wrong
This loop continues until the task completes or a terminal condition is hit.
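The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual implementation: `Verdict`, `LoopState`, `run_loop`, the `demo_*` stand-ins, and the `max_iterations` cap (used here as the terminal condition) are all hypothetical names chosen for this example. The real agents are built on PydanticAI and drive a live browser.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Verdict(Enum):
    """The three outcomes the critique step can choose between."""
    DONE = "done"        # return results to the user
    NEXT_STEP = "next"   # move on to the next step in the plan
    REPLAN = "replan"    # ask the planner to revise the plan


@dataclass
class LoopState:
    request: str
    plan: List[str] = field(default_factory=list)
    step_index: int = 0
    history: List[str] = field(default_factory=list)


def run_loop(
    request: str,
    planner: Callable[[LoopState], List[str]],
    browser: Callable[[str], str],
    critic: Callable[[LoopState, str], Verdict],
    max_iterations: int = 10,  # hypothetical terminal condition
) -> LoopState:
    state = LoopState(request=request)
    state.plan = planner(state)              # initial plan
    for _ in range(max_iterations):
        if state.step_index >= len(state.plan):
            break                            # plan exhausted
        step = state.plan[state.step_index]
        result = browser(step)               # execute the current step
        state.history.append(result)
        verdict = critic(state, result)      # review the outcome
        if verdict is Verdict.DONE:
            return state
        if verdict is Verdict.NEXT_STEP:
            state.step_index += 1
        else:                                # REPLAN: get a revised plan
            state.plan = planner(state)
            state.step_index = 0
    return state


# Stand-in agents so the sketch runs without a browser or a model.
def demo_planner(state: LoopState) -> List[str]:
    return ["open site", "search product", "read price"]


def demo_browser(step: str) -> str:
    return f"ok: {step}"


def demo_critic(state: LoopState, result: str) -> Verdict:
    is_last = state.step_index == len(state.plan) - 1
    return Verdict.DONE if is_last else Verdict.NEXT_STEP


final = run_loop("find the price", demo_planner, demo_browser, demo_critic)
print(final.history)
```

Passing the three agents in as plain callables keeps the control flow visible. In this sketch a replan restarts from the first step of the revised plan, which is one possible policy among several.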
Agentic Browser uses uv to manage the Python virtual environment and dependencies.
- macOS/Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh
- Windows:
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
You can also install uv via pip.
git clone https://github.com/TheAgenticAI/TheAgenticBrowser
cd TheAgenticBrowser
uv venv --python=3.11
source .venv/bin/activate
# On Windows: .venv\Scripts\activate
uv pip install -r requirements.txt
playwright install
To use your local Chrome browser instead of Playwright, open chrome://version/ in Chrome, copy the path to your profile, and set BROWSER_STORAGE_DIR to that path in .env.
Copy the example env file and fill in your values:
cp .env.example .env
Edit .env and set the following:
# AGENTIC_BROWSER Configuration
AGENTIC_BROWSER_TEXT_MODEL=<text model name, e.g. "gpt-4o">
AGENTIC_BROWSER_TEXT_API_KEY=<your text model API key>
AGENTIC_BROWSER_TEXT_BASE_URL=<text model base URL, e.g. "https://api.openai.com/v1">
# Screenshot Analysis Configuration
AGENTIC_BROWSER_SS_ENABLED=<true/false>
AGENTIC_BROWSER_SS_MODEL=<screenshot model name, e.g. "gpt-4o">
AGENTIC_BROWSER_SS_API_KEY=<your screenshot model API key>
AGENTIC_BROWSER_SS_BASE_URL=<screenshot model base URL, e.g. "https://api.openai.com/v1">
# Logging
LOGFIRE_TOKEN=<your Logfire write token>
# Google Search Configuration
GOOGLE_API_KEY=<your Custom Search JSON API key>
GOOGLE_CX=<your Google Custom Search engine ID>
# Browser Configuration
BROWSER_STORAGE_DIR=<path to browser storage directory, e.g. "./browser_storage">
STEEL_DEV_API_KEY=<optional: enables a remote browser via Steel Dev CDP; only useful when launched as an API, see Step 7>
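Before launching, it can help to confirm that the variables above are actually set. The sketch below is one way to do that, under stated assumptions: the `Settings` class and `load_settings` function are hypothetical and not part of the project, and treating the three `AGENTIC_BROWSER_TEXT_*` variables as required (and the screenshot model as required only when screenshots are enabled) is a guess at sensible defaults rather than documented behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: these three are treated as mandatory for this sketch.
REQUIRED = (
    "AGENTIC_BROWSER_TEXT_MODEL",
    "AGENTIC_BROWSER_TEXT_API_KEY",
    "AGENTIC_BROWSER_TEXT_BASE_URL",
)


@dataclass(frozen=True)
class Settings:
    text_model: str
    text_api_key: str
    text_base_url: str
    ss_enabled: bool
    ss_model: Optional[str]


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env

    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))

    # AGENTIC_BROWSER_SS_ENABLED is documented as true/false.
    ss_enabled = env.get("AGENTIC_BROWSER_SS_ENABLED", "false").strip().lower() == "true"
    ss_model = env.get("AGENTIC_BROWSER_SS_MODEL") or None
    if ss_enabled and ss_model is None:
        raise ValueError("AGENTIC_BROWSER_SS_MODEL must be set when screenshots are enabled")

    return Settings(
        text_model=env["AGENTIC_BROWSER_TEXT_MODEL"],
        text_api_key=env["AGENTIC_BROWSER_TEXT_API_KEY"],
        text_base_url=env["AGENTIC_BROWSER_TEXT_BASE_URL"],
        ss_enabled=ss_enabled,
        ss_model=ss_model,
    )
```

Calling `load_settings()` with no arguments reads from `os.environ`, so the values from `.env` must already be exported into the environment by whatever loads that file.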
Run directly from main.py or spin up an API server:
- Direct:
python3 -m core.main
- API:
uvicorn core.server.api_routes:app --loop asyncio
Example request:
POST http://127.0.0.1:8000/execute_task
{ "command": "Give me the price of RTX 3060ti on amazon.in and give me the latest delivery date." }
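The same request can be sent from Python using only the standard library. This is a sketch under assumptions: the endpoint and the `command` field come from the example above, but sending the body as JSON, the generous timeout, and the helper names `build_task_request` and `send_task` are choices made for this illustration. The response format is not documented here, so the raw response text is returned unparsed.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:8000/execute_task"


def build_task_request(command: str, url: str = API_URL) -> urllib.request.Request:
    """Build a POST request carrying the command as a JSON body (assumed format)."""
    body = json.dumps({"command": command}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_task(command: str, timeout: float = 600.0) -> str:
    """Send the task and return the raw response text.

    Browser tasks can run for a long time, hence the long default timeout.
    """
    request = build_task_request(command)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the API server running, `print(send_task("Give me the price of RTX 3060ti on amazon.in and give me the latest delivery date."))` would submit the example task.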
With host networking (--net=host):
docker build -t agentic_browser .
docker run -it --net=host --env-file .env agentic_browser
With port 8000 published to the host:
docker build -t agentic_browser .
docker run -it -p 8000:8000 --env-file .env agentic_browser