A browser automation agent for web scraping, text extraction, and AI-powered content summarization built with Microsoft Agent Framework patterns and Azure AI Foundry SDK integration.
Based on:
J-browser-agents/
β
βββ Core/ # Core framework modules
β βββ agent_framework.py # Microsoft Agent Framework base classes
β βββ azure_ai_client.py # Azure AI Foundry SDK integration
β βββ browser_automation.py # Playwright browser control
β βββ text_extractor.py # HTML parsing & extraction
β βββ text_summarizer.py # AI-powered summarization
β βββ browser_agent.py # Main orchestrator (extends OpenAIAgent)
β
βββ Demos/ # Example scripts
β βββ demo.py # Interactive demo
β βββ demo_mslearn.py # Microsoft Learn scraper
β
βββ Tests/ # Testing & verification
β βββ test_quick.py # Quick functionality test
β βββ verify_setup.py # Comprehensive verification
β
βββ Scripts/ # Utility scripts
β βββ install_dependencies.bat # One-click installation
β βββ test_framework.bat # Test launcher
β βββ run_demo.bat # Demo launcher
β βββ run_test.bat # Quick test runner
β
βββ Config/ # Configuration
β βββ requirements.txt # Python dependencies
β
βββ .env.example # Environment configuration template
βββ README.md # This file
βββ LICENSE # License information
- π Browser Automation - Automated web browsing using Playwright
- π Text Extraction - Clean HTML parsing and structured content extraction
- π€ AI Summarization - Intelligent summarization using OpenAI or Azure OpenAI
- π― Microsoft Agent Framework - Tool-based agent architecture for extensibility
- βοΈ Azure AI Foundry SDK - Unified access to Azure AI services
- πΎ JSON Export - Save extracted content for later analysis
- π Q&A Support - Ask questions about extracted content
- π§ Multi-Agent Orchestration - Coordinate multiple agents for complex workflows
Automated (Windows):
cd Scripts
install_dependencies.batManual installation:
pip install -r Config/requirements.txt
python -m playwright install chromiumThe framework uses these main packages (see Config/requirements.txt):
- playwright - Browser automation
- beautifulsoup4 + lxml - HTML parsing
- openai - AI summarization (OpenAI or Azure OpenAI)
- azure-ai-projects - Azure AI Foundry SDK
- azure-identity - Azure authentication
- langchain + langchain-openai - LangChain integration
- python-dotenv - Environment configuration
cd Scripts
test_framework.bat
# Choose option 1 for quick testcd Scripts
run_demo.bat
# Choose demo optionfrom Core.browser_agent import BrowserAgent
with BrowserAgent(headless=True) as agent:
# Scrape and extract content
content = agent.scrape_and_extract("https://example.com")
# Access extracted data
print(f"Title: {content['title']}")
print(f"Word count: {content['word_count']}")
print(f"Headings: {len(content['headings'])}")
# Save to JSON
agent.save_extracted_content("output.json")from Core.browser_agent import BrowserAgent
# Set environment: set OPENAI_API_KEY=sk-...
with BrowserAgent(headless=True) as agent:
# Scrape a page
url = "https://learn.microsoft.com/en-us/azure/ai-foundry/agents/overview"
agent.scrape_and_extract(url)
# Generate summary
summary = agent.summarize_current_page(
style="concise",
max_length=200
)
# Extract key points
points = agent.get_key_points(num_points=5)
# Ask questions about the content
answer = agent.ask_question("What is this page about?")from Core.browser_agent import BrowserAgent
# Use Azure AI Foundry project endpoint
azure_endpoint = "https://<resource>.services.ai.azure.com/api/projects/<project>"
with BrowserAgent(
headless=True,
azure_endpoint=azure_endpoint,
model="gpt-4o"
) as agent:
agent.scrape_and_extract("https://example.com")
summary = agent.summarize_current_page()from Core.agent_framework import OpenAIAgent, AgentOrchestrator
from Core.azure_ai_client import AzureAIClient
# Create an Azure AI client
azure_client = AzureAIClient(
endpoint="https://<resource>.services.ai.azure.com/api/projects/<project>"
)
# Create a custom agent
agent = OpenAIAgent(
name="MyAgent",
system_prompt="You are a helpful assistant.",
model="gpt-4o",
azure_client=azure_client
)
# Register custom tools
agent.register_function(
name="my_tool",
description="Does something useful",
function=lambda x: f"Processed: {x}",
parameters={"type": "object", "properties": {"x": {"type": "string"}}}
)
# Invoke the agent
response = agent.invoke("Hello, how can you help me?")
print(response.content)from Core.agent_framework import AgentOrchestrator, OpenAIAgent
# Create orchestrator
orchestrator = AgentOrchestrator(name="MainOrchestrator")
# Register multiple agents
research_agent = OpenAIAgent(name="ResearchAgent", system_prompt="You research topics.")
summary_agent = OpenAIAgent(name="SummaryAgent", system_prompt="You summarize content.")
orchestrator.register_agent(research_agent)
orchestrator.register_agent(summary_agent)
# Invoke specific agents
result = orchestrator.invoke_agent("ResearchAgent", "Find info about AI agents")βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Agent Orchestrator β
β (Core/agent_framework.py) β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β
βββββββββββββββββββββΌββββββββββββββββββββ
βΌ βΌ βΌ
βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ
β Browser Agent β β Custom Agent β β Other Agents β
β (browser_agent) β β (OpenAI) β β ... β
βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ
β
βββ Tools (scrape_url, summarize, etc.)
β
βββββββ΄ββββββ¬ββββββββββββββ
βΌ βΌ βΌ
βββββββββββ βββββββββββ ββββββββββββββββ
β Browser β β Text β β Text β
β Auto β β Extract β β Summarize β
β β β β β β
βPlaywrightβ βBeautifulβ β OpenAI/Azure β
β β β Soup β β AI Foundry β
βββββββββββ βββββββββββ ββββββββββββββββ
β
βββββββββββ΄ββββββββββ
βΌ βΌ
βββββββββββββββ βββββββββββββββ
β OpenAI β β Azure AI β
β Direct β β Foundry β
βββββββββββββββ βββββββββββββββ
- BaseAgent - Abstract base class for all agents
- OpenAIAgent - Agent implementation using OpenAI/Azure OpenAI
- AgentTool - Tool definition with OpenAI function calling format
- AgentOrchestrator - Multi-agent coordination
- AzureAIClient - Azure AI Projects SDK wrapper
- AzureOpenAIDirectClient - Direct Azure OpenAI endpoint access
- Supports DefaultAzureCredential authentication
- Unified project endpoint for Foundry services
- Launch browsers (headless/headed mode)
- Navigate with smart wait strategies
- Capture screenshots
- Extract HTML and text content
- Parse HTML with BeautifulSoup
- Extract titles, headings, paragraphs
- Parse links and code blocks
- Clean and normalize text
- Calculate statistics
- Generate AI-powered summaries using OpenAI or Azure OpenAI
- Supports Azure AI Foundry project endpoint
- Extract key points from content
- Answer questions using context
- Multiple output styles (concise, detailed, bullet points)
- Extends OpenAIAgent from agent framework
- Registers browser tools (scrape, summarize, key points, Q&A)
- Supports Azure AI Foundry and direct OpenAI
- Context manager support
{
"title": "Page Title",
"headings": [
{"level": 1, "text": "Main Heading"},
{"level": 2, "text": "Subheading"}
],
"paragraphs": [
"First paragraph text...",
"Second paragraph text..."
],
"main_content": "Full cleaned text content of the page...",
"links": [
{"text": "Link text", "url": "https://example.com"}
],
"code_blocks": [
"code snippet 1",
"code snippet 2"
],
"word_count": 1234
}| Style | Description | Use Case |
|---|---|---|
concise |
Brief overview | Quick understanding |
detailed |
Comprehensive with key points | In-depth analysis |
bullet_points |
Key takeaways as bullets | Executive summary |
Copy .env.example to .env and configure:
# Option 1: Azure AI Foundry (Recommended for enterprise)
AZURE_AI_PROJECT_ENDPOINT=https://<resource>.services.ai.azure.com/api/projects/<project>
# Option 2: Direct Azure OpenAI
AZURE_OPENAI_ENDPOINT=https://<resource>.openai.azure.com/openai/v1
AZURE_OPENAI_API_KEY=your-key-here # Optional if using Entra ID
# Option 3: Direct OpenAI
OPENAI_API_KEY=sk-your-api-key-here
# Model configuration
DEFAULT_MODEL=gpt-4o-mini| Method | Use Case | Configuration |
|---|---|---|
| Azure AI Foundry + Entra ID | Enterprise (Recommended) | Set AZURE_AI_PROJECT_ENDPOINT, use DefaultAzureCredential |
| Azure OpenAI + API Key | Azure with key auth | Set AZURE_OPENAI_ENDPOINT + AZURE_OPENAI_API_KEY |
| Direct OpenAI | Personal/Development | Set OPENAI_API_KEY |
# Option 1: Azure AI Foundry
agent = BrowserAgent(
headless=True,
azure_endpoint="https://<resource>.services.ai.azure.com/api/projects/<project>",
model="gpt-4o"
)
# Option 2: Direct OpenAI
agent = BrowserAgent(
headless=True,
api_key="sk-your-api-key-here",
model="gpt-4o-mini"
)
# Option 3: Environment variables
import os
os.environ["OPENAI_API_KEY"] = "sk-your-api-key-here"
agent = BrowserAgent(headless=True)- Documentation Scraping - Extract and summarize technical docs
- Content Analysis - Analyze web pages for specific information
- Research Automation - Gather info from multiple sources
- Knowledge Extraction - Build structured data from web content
- Competitive Analysis - Monitor and analyze competitor content
- Tutorial Aggregation - Collect and summarize learning materials
cd Tests
python test_quick.pycd Tests
python verify_setup.pyWhat gets tested:
- β All dependencies installed
- β Playwright browsers available
- β Core modules functional
- β Browser automation working
- β Text extraction accurate
- β AI features (if API key set)
Import Errors
# Use module-style imports from project root
from Core.browser_agent import BrowserAgent"Playwright not found"
python -m playwright install chromium"No module named..."
pip install -r Config/requirements.txt"API key not configured"
- Summarization is optional
- Framework works without API key for text extraction
- Only needed for AI-powered features
Path Issues
- Run scripts from project root directory
- Use module-style imports:
from Core.browser_agent import BrowserAgent
Import errors
# Reinstall all dependencies
pip install --force-reinstall -r Config/requirements.txtBrowser automation can pose security risks. Best practices:
- β Run in isolated/sandboxed environments
- β Use headless mode for production
- β Set appropriate timeouts
- β Implement rate limiting
- β Review extracted content
- β Don't access sensitive sites
- β Don't store credentials in code
- β Don't bypass authentication
class BrowserAgent(headless=True, api_key=None, azure_endpoint=None, model="gpt-4o")Parameters:
headless(bool): Run browser in headless mode (default: True)api_key(str, optional): OpenAI API key for summarizationazure_endpoint(str, optional): Azure AI Foundry project endpointmodel(str): Model to use (default: "gpt-4o")
Methods:
scrape_and_extract(url)β Dict - Scrape URL and return structured contentsummarize_current_page(style, max_length)β str - Summarize current pageget_key_points(num_points)β List[str] - Extract key pointsask_question(question)β str - Answer question about contentsave_extracted_content(filepath)- Save to JSON fileinvoke(input_text)β AgentResponse - Invoke agent with natural languageregister_function(name, description, function, parameters)- Register custom toolstart()- Start the browserclose()- Close the browser
class OpenAIAgent(name, system_prompt, model, api_key=None, azure_client=None)Methods:
invoke(input_text)β AgentResponse - Process input and return responseregister_tool(tool)- Register an AgentToolregister_function(name, description, function, parameters)- Register function as toolexecute_tool(tool_name, arguments)- Execute a registered tooladd_message(role, content)- Add message to conversation historyclear_history()- Clear conversation history
class AzureAIClient(endpoint=None, credential=None, use_azure=True)Methods:
get_openai_client(api_version)- Get OpenAI-compatible clientget_chat_completion(messages, model, temperature, max_tokens)β stris_available()β bool - Check if Azure client is configured
class TextSummarizer(api_key=None, model="gpt-4o-mini", azure_client=None, azure_endpoint=None)Parameters:
api_key(str, optional): OpenAI API key (or setOPENAI_API_KEYenv var)model(str): Model to use for summarization (default:gpt-4o-mini)azure_client(AzureAIClient, optional): Azure AI client for Foundry accessazure_endpoint(str, optional): Azure AI Foundry project endpoint
with BrowserAgent(headless=True) as agent:
content = agent.scrape_and_extract(url)
# Browser automatically closedContributions welcome! Areas for improvement:
- Support for dynamic content (JavaScript-heavy sites)
- Caching and rate limiting
- Additional extraction patterns
- MCP (Model Context Protocol) tool integration
- Microsoft 365 Agents SDK channel support (Teams, Copilot)
- Multi-page navigation
- PDF export
See LICENSE file.
- Microsoft Agent Framework & M365 Agents SDK
- Azure AI Foundry SDK Overview
- Microsoft Azure AI Foundry Browser Automation
Made with β€οΈ for intelligent web automation