An intelligent QA automation system for Xero Payroll using Skyvern Open Source (vision/LLM + DOM) to mimic human QA behavior. The agent can read documentation, search online resources, plan test steps, and execute them autonomously using computer vision and DOM automation.
Note: This project uses Skyvern open source which runs locally. See SKYVERN.md for setup instructions.
- Intelligent Test Planning: Converts high-level test descriptions into detailed, executable steps using LLM
- Documentation RAG System: Retrieves relevant documentation from local sources and online Xero docs
- Computer Vision + DOM Automation: Uses Skyvern for reliable web automation
- Network Monitoring: Tracks network activity to ensure stable test execution
- Visual Stability Checks: Waits for page stability before interactions
- Self-Healing: Automatic retry logic with intelligent error handling
- Comprehensive Reporting: Detailed HTML and JSON reports with screenshots
- Cross-Platform: Works on Windows, Linux, and macOS
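The self-healing retry behaviour listed above can be pictured with a small sketch. This is not the project's implementation: `run_with_retries` is a hypothetical helper, and reading `MAX_RETRIES` as "retries after the first attempt" is an assumption. Its defaults mirror the `MAX_RETRIES` and `RETRY_DELAY_SECONDS` settings described later in this README.

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def run_with_retries(
    action: Callable[[], Awaitable[T]],
    max_retries: int = 3,              # mirrors MAX_RETRIES (assumed: retries after the first try)
    retry_delay_seconds: float = 2.0,  # mirrors RETRY_DELAY_SECONDS
) -> T:
    """Await `action`; on any exception, wait and try again up to `max_retries` times."""
    last_error: Optional[BaseException] = None
    for attempt in range(max_retries + 1):
        try:
            return await action()
        except Exception as exc:  # a real agent would likely catch narrower error types
            last_error = exc
            if attempt < max_retries:
                await asyncio.sleep(retry_delay_seconds)
    if last_error is None:
        raise ValueError("max_retries must be >= 0")
    raise last_error
```

A step executor could wrap each browser action in this helper, so a transient failure is retried before the step is reported as failed.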
┌─────────────────────────────────────────────────────────┐
│                  Test Case Definition                   │
│  (High-level steps like "Create org", "Add employee")   │
└────────────────┬────────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────────┐
│              Documentation Retrieval (RAG)              │
│  - Local docs (Vector DB)                               │
│  - Xero online docs (Web Search)                        │
│  - Execution history (Learning)                         │
└────────────────┬────────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────────┐
│                   Step Planning (LLM)                   │
│  Converts: "Create org" → Detailed Skyvern steps        │
└────────────────┬────────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────────┐
│                Skyvern Execution Engine                 │
│  - Computer Vision + DOM                                │
│  - Network monitoring                                   │
│  - Visual stability checks                              │
│  - Screenshot capture                                   │
└────────────────┬────────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────────┐
│                    Result Validation                    │
│  Compare actual vs expected results                     │
└────────────────┬────────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────────┐
│                 Test Report Generation                  │
│  HTML reports, JSON data, screenshots                   │
└─────────────────────────────────────────────────────────┘
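The Result Validation stage in the diagram compares actual values against expected ones. A minimal sketch of such a comparison follows, assuming the three kinds of expected value that appear in the `TestCase` example in this README: exact values, numbers, and `{"pattern": ...}` entries. The function names and the numeric tolerance are hypothetical, not taken from `result_validator.py`.

```python
import math
import re
from typing import Any, Dict


def matches_expected(actual: Any, expected: Any, abs_tol: float = 0.005) -> bool:
    """Return True if `actual` satisfies a single expected-result entry."""
    # {"pattern": "..."} entries: the whole actual value must match the regex.
    if isinstance(expected, dict) and "pattern" in expected:
        return re.fullmatch(expected["pattern"], str(actual)) is not None
    # Booleans are compared by identity so that 1 does not count as True.
    if isinstance(expected, bool):
        return actual is expected
    # Numbers are compared with a small tolerance (assumed; e.g. currency rounding).
    if isinstance(expected, (int, float)) and isinstance(actual, (int, float)) \
            and not isinstance(actual, bool):
        return math.isclose(actual, expected, abs_tol=abs_tol)
    # Everything else falls back to plain equality.
    return actual == expected


def validate(actual: Dict[str, Any], expected: Dict[str, Any]) -> Dict[str, bool]:
    """Map each expected field name to whether the actual result satisfied it."""
    return {name: matches_expected(actual.get(name), want) for name, want in expected.items()}
```

A field missing from the actual results is treated as `None` here, so it fails any check other than an explicit expected `None`.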
Option A: Docker (Recommended)
- Docker Desktop (macOS/Windows) or Docker Engine (Linux)
- Docker Compose v2.0+
- 4GB+ RAM, 10GB+ disk space
Option B: Manual Setup
- Python 3.10 or higher
- Git
- Skyvern Open Source running locally (see SKYVERN.md)
- Docker and Docker Compose (recommended), OR
- PostgreSQL and Redis (for manual setup)
# Clone repository
git clone <repository-url>
cd agent-carter
# Configure environment
cp .env.example .env
# Edit .env with your API keys
# Start all services (includes Skyvern)
docker-compose up -d
# Run POC test
docker-compose exec xero-qa-agent python examples/poc_test.py
# View results in ./reports/ and ./screenshots/
For detailed Docker instructions, see DOCKER.md
# Clone repository
git clone <repository-url>
cd agent-carter
# Run setup script
chmod +x setup.sh
./setup.sh
# Activate virtual environment
source venv/bin/activate
REM Clone repository
git clone <repository-url>
cd agent-carter
REM Run setup script
setup.bat
REM Activate virtual environment
venv\Scripts\activate.bat
# 1. Setup Skyvern (run in separate terminal/directory)
# See SKYVERN.md for detailed instructions
git clone https://github.com/Skyvern-AI/skyvern.git
cd skyvern
docker-compose up -d # Starts Skyvern at http://localhost:8000
# 2. Setup Xero QA Agent (in this directory)
# Create virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate.bat
# Install dependencies
pip install -r requirements.txt
# Install Playwright browsers
playwright install chromium
# Copy environment file
cp .env.example .env
# Edit .env and add your API keys
Edit .env file with your credentials:
# Required: Skyvern Open Source (must be running locally)
SKYVERN_BASE_URL=http://localhost:8000
SKYVERN_API_TOKEN= # Leave empty for local development
# Required: LLM Provider (choose one)
ANTHROPIC_API_KEY=your_anthropic_key_here
# OR
OPENAI_API_KEY=your_openai_key_here
# Required: Xero Test Credentials
[email protected]
XERO_TEST_PASSWORD=your_test_password
# Optional: Customize settings
LLM_PROVIDER=anthropic # or "openai"
HEADLESS_MODE=false # Set true for headless browser
LOG_LEVEL=INFO
Important: Ensure Skyvern is running at http://localhost:8000 before running tests. See SKYVERN.md for setup.
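To make the "choose one" LLM key requirement concrete, here is a hedged sketch of how these variables might be read and validated. It is not the project's `config.py`: the `Settings` class, `load_settings`, and the default provider of `anthropic` are assumptions for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    skyvern_base_url: str
    llm_provider: str
    llm_api_key: str
    headless: bool
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, failing early on a missing LLM key."""
    env = os.environ if env is None else env
    provider = env.get("LLM_PROVIDER", "anthropic").strip().lower()  # default is assumed
    key_names = {"anthropic": "ANTHROPIC_API_KEY", "openai": "OPENAI_API_KEY"}
    if provider not in key_names:
        raise ValueError(f"LLM_PROVIDER must be one of {sorted(key_names)}, got {provider!r}")
    api_key = env.get(key_names[provider], "").strip()
    if not api_key:
        raise ValueError(f"{key_names[provider]} is required when LLM_PROVIDER={provider}")
    return Settings(
        skyvern_base_url=env.get("SKYVERN_BASE_URL", "http://localhost:8000").rstrip("/"),
        llm_provider=provider,
        llm_api_key=api_key,
        headless=env.get("HEADLESS_MODE", "false").strip().lower() in {"1", "true", "yes"},
        log_level=env.get("LOG_LEVEL", "INFO").strip().upper(),
    )
```

Passing a plain dictionary instead of `os.environ` makes the validation easy to exercise without touching the real environment.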
Create Xero documentation files in the xero_docs/ directory:
xero_docs/
├── payroll/
│   ├── organisation_setup.md
│   ├── employee_management.md
│   ├── leave_types.md
│   └── payslip_generation.md
├── api/
│   └── payroll_api.md
└── troubleshooting/
    └── common_issues.md
python examples/index_docs.py
The agent will also automatically search Xero online documentation when needed.
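Before indexing, it can be useful to see which files and chunks an indexer would pick up from xero_docs/. The sketch below is not `examples/index_docs.py`; `iter_doc_chunks` is a hypothetical helper, and splitting Markdown files on blank lines into chunks of roughly `max_chars` characters is an assumed strategy.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_doc_chunks(docs_dir: str, max_chars: int = 1000) -> Iterator[Tuple[str, str]]:
    """Yield (relative_path, chunk) pairs for every Markdown file under docs_dir."""
    root = Path(docs_dir)
    for path in sorted(root.rglob("*.md")):
        rel = path.relative_to(root).as_posix()
        chunk = ""
        # Paragraphs are kept whole; a chunk is closed once adding the next
        # paragraph would push it past max_chars.
        for paragraph in path.read_text(encoding="utf-8").split("\n\n"):
            paragraph = paragraph.strip()
            if not paragraph:
                continue
            if chunk and len(chunk) + len(paragraph) + 2 > max_chars:
                yield rel, chunk
                chunk = ""
            chunk = f"{chunk}\n\n{paragraph}" if chunk else paragraph
        if chunk:
            yield rel, chunk
```

Each pair could then be embedded and stored in the vector database along with its source path.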
# Run POC test
docker-compose exec xero-qa-agent python examples/poc_test.py
# Run simple test
docker-compose exec xero-qa-agent python examples/simple_test.py
# Run full test suite
docker-compose exec xero-qa-agent python examples/full_test_suite.py
# Index documentation
docker-compose exec xero-qa-agent python examples/index_docs.py
# View logs
docker-compose logs -f xero-qa-agent
# Stop services
docker-compose down
The POC test demonstrates the core functionality:
- Create an AU organisation
- Add an employee with annual leave (paid) and carer leave (unpaid)
- Generate payslip and check number
python examples/poc_test.py
python examples/simple_test.py
python examples/full_test_suite.py
import asyncio
from xero_qa_agent.agent import XeroQAAgent
from xero_qa_agent.core.models import TestCase

async def my_test():
    # Define test case
    test = TestCase(
        id="TC_CUSTOM_001",
        description="My custom test",
        steps=[
            "Navigate to Xero",
            "Create an AU organisation",
            "Add an employee"
        ],
        expected_results={
            "organisation_created": True,
            "employee_created": True
        }
    )
    # Initialize agent
    agent = XeroQAAgent()
    # Execute test
    report = await agent.execute_test_case(test)
    # Check results
    print(f"Status: {report.overall_status.value}")
    print(f"Steps: {report.passed_steps}/{report.total_steps} passed")

asyncio.run(my_test())
After running tests, check the generated reports:
- HTML Reports: ./reports/*.html - Detailed visual reports with screenshots
- JSON Reports: ./reports/*.json - Machine-readable test data
- Screenshots: ./screenshots/*.png - Step-by-step screenshots
- Logs: ./logs/xero_qa_agent.log - Detailed execution logs
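Because the JSON reports are machine-readable, a run can be summarised with a few lines of Python. This sketch assumes each report is a JSON object with an `overall_status` field mirroring `report.overall_status`; the actual schema is not documented here, so treat that field name and the `summarise_reports` helper as hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarise_reports(reports_dir: str = "./reports") -> Counter:
    """Count JSON reports per status; unreadable files are tallied separately."""
    counts: Counter = Counter()
    for path in sorted(Path(reports_dir).glob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            counts["UNREADABLE"] += 1
            continue
        # "overall_status" is an assumed field name, not a documented schema.
        status = data.get("overall_status", "UNKNOWN") if isinstance(data, dict) else "UNKNOWN"
        counts[str(status).upper()] += 1
    return counts
```

For example, `summarise_reports("./reports")` returns a `Counter` that a CI job could inspect to fail the build when any report is not `PASSED`.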
The project includes example test cases demonstrating the agent's capabilities:
# Run POC test case
python examples/poc_test.py
# Run simple test
python examples/simple_test.py
# Run full test suite
python examples/full_test_suite.py
Note: Unit tests coming soon. Current examples serve as integration tests.
This README provides an overview. For detailed guides, see:
- GETTING_STARTED.md - Quick start guide (5 minutes)
- DOCKER.md - Docker setup, commands, and troubleshooting
- SKYVERN.md - Skyvern integration for vision-based automation
- LLM_PROVIDERS.md - LLM provider options and configuration
- ARCHITECTURE.md - Technical architecture and design
Main orchestrator class for test execution.
from xero_qa_agent.agent import XeroQAAgent
agent = XeroQAAgent(
    skyvern_api_key="...",  # Optional, uses .env if not provided
    llm_provider="anthropic",  # "anthropic" or "openai"
    llm_api_key="..."  # Optional, uses .env if not provided
)
# Execute test case (await must be used inside an async function)
report = await agent.execute_test_case(test_case)
# Index documentation
count = await agent.index_documentation("./xero_docs")
# Get documentation stats
stats = agent.get_documentation_stats()
Define test cases with high-level steps.
from xero_qa_agent.core.models import TestCase
test = TestCase(
    id="TC_001",
    description="Test description",
    steps=[
        "High-level step 1",
        "High-level step 2"
    ],
    expected_results={
        "field_name": "expected_value",
        "numeric_field": 25.50,
        "pattern_field": {"pattern": r"PS-\d{4}"}
    },
    tags=["tag1", "tag2"],
    priority="high"
)
Contains test execution results.
# Access report data
print(report.overall_status) # PASSED, FAILED, ERROR
print(report.total_execution_time) # Total time in seconds
print(report.passed_steps) # Number of passed steps
print(report.failed_steps) # Number of failed steps
print(report.validation_results) # List of validation results
print(report.screenshots) # List of screenshot paths
agent-carter/
├── xero_qa_agent/               # Main package
│   ├── core/                    # Core models and types
│   │   ├── config.py            # Configuration management
│   │   ├── models.py            # Data models
│   │   ├── types.py             # Type definitions
│   │   └── documentation.py     # RAG system
│   ├── planners/                # Step planning
│   │   └── step_planner.py      # LLM-based planner
│   ├── executors/               # Execution engines
│   │   └── skyvern_executor.py  # Skyvern integration
│   ├── validators/              # Result validation
│   │   └── result_validator.py
│   ├── reporters/               # Report generation
│   │   └── report_generator.py
│   ├── utils/                   # Utilities
│   │   ├── logger.py            # Logging setup
│   │   └── helpers.py           # Helper functions
│   └── agent.py                 # Main orchestrator
├── examples/                    # Example scripts
│   ├── poc_test.py              # POC test case
│   ├── simple_test.py           # Simple example
│   ├── full_test_suite.py       # Full test suite
│   └── index_docs.py            # Documentation indexer
├── tests/                       # Test files
├── xero_docs/                   # Xero documentation
├── requirements.txt             # Python dependencies
├── pyproject.toml               # Package configuration
├── setup.sh                     # Linux/macOS setup script
├── setup.bat                    # Windows setup script
├── .env.example                 # Environment template
└── README.md                    # This file
Configure network idle timeout and stability checks:
NETWORK_IDLE_TIMEOUT=5000 # ms to wait for network idle
VISUAL_STABILITY_CHECKS=3 # Number of stability checks
VISUAL_STABILITY_DELAY=500 # ms between checks
HEADLESS_MODE=false # Set true to run in headless mode
BROWSER_TIMEOUT=60000 # Browser timeout (ms)
SLOW_MO=100 # Slow down browser actions (ms)
MAX_RETRIES=3 # Maximum retry attempts
RETRY_DELAY_SECONDS=2 # Delay between retries (seconds)
Issue: ModuleNotFoundError: No module named 'xero_qa_agent'
- Solution: Run pip install -e . in the project root
Issue: Playwright browsers not found
- Solution: Run playwright install chromium
Issue: ANTHROPIC_API_KEY is required
- Solution: Edit the .env file and add your API key
Issue: Skyvern connection failed
- Solution: Ensure Skyvern is running: docker-compose ps in the Skyvern directory
- Solution: Check Skyvern logs: docker-compose logs skyvern
- Solution: Restart Skyvern: docker-compose restart
- Solution: See SKYVERN.md for detailed setup
Issue: Tests fall back to Playwright
- Solution: This is normal if Skyvern is not running. The agent automatically uses Playwright as a fallback
- Note: For full CV+DOM capabilities, ensure Skyvern is running
Issue: Tests fail with network timeout
- Solution: Increase NETWORK_IDLE_TIMEOUT in .env
Issue: No documentation found
- Solution: Run python examples/index_docs.py to index docs
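When tuning the timeout and stability settings for the network timeout issue above, it may help to picture the wait as a loop like the one below. This is only a sketch, not the agent's code: `wait_until_stable` and its `snapshot` callback are hypothetical, the defaults mirror `VISUAL_STABILITY_CHECKS` and `VISUAL_STABILITY_DELAY`, and reusing the `NETWORK_IDLE_TIMEOUT` value as an overall time budget is an assumption.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_until_stable(
    snapshot: Callable[[], Awaitable[bytes]],  # e.g. returns a screenshot or DOM hash
    checks: int = 3,         # mirrors VISUAL_STABILITY_CHECKS
    delay_ms: int = 500,     # mirrors VISUAL_STABILITY_DELAY
    timeout_ms: int = 5000,  # assumed overall budget, same value as NETWORK_IDLE_TIMEOUT
) -> bool:
    """Return True once `checks` consecutive snapshots are identical, False on timeout."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_ms / 1000
    previous = await snapshot()
    stable = 1  # the first snapshot counts as one observation
    while stable < checks:
        if loop.time() >= deadline:
            return False
        await asyncio.sleep(delay_ms / 1000)
        current = await snapshot()
        # Any change resets the streak of identical snapshots.
        stable = stable + 1 if current == previous else 1
        previous = current
    return True
```

With the defaults, a page that keeps changing is given up on after about five seconds, which is why a slow page may need a larger timeout rather than more stability checks.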
Enable verbose logging:
LOG_LEVEL=DEBUG
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- Skyvern - Computer vision + DOM automation
- Anthropic Claude - LLM for intelligent planning
- ChromaDB - Vector database for RAG
- Playwright - Browser automation
For issues and questions:
- Create an issue on GitHub
- Check the troubleshooting section
- Review example scripts in examples/
Note: This is a POC (Proof of Concept) implementation. For production use, additional error handling, security measures, and comprehensive testing are recommended.