
shishir/agent-carter


Payroll QA Agent

An intelligent QA automation system for Xero Payroll using Skyvern Open Source (vision/LLM + DOM) to mimic human QA behavior. The agent can read documentation, search online resources, plan test steps, and execute them autonomously using computer vision and DOM automation.

Note: This project uses Skyvern open source which runs locally. See SKYVERN.md for setup instructions.

🎯 Features

  • Intelligent Test Planning: Converts high-level test descriptions into detailed, executable steps using LLM
  • Documentation RAG System: Retrieves relevant documentation from local sources and online Xero docs
  • Computer Vision + DOM Automation: Uses Skyvern for reliable web automation
  • Network Monitoring: Tracks network activity to ensure stable test execution
  • Visual Stability Checks: Waits for page stability before interactions
  • Self-Healing: Automatic retry logic with intelligent error handling
  • Comprehensive Reporting: Detailed HTML and JSON reports with screenshots
  • Cross-Platform: Works on Windows, Linux, and macOS

🏗️ Architecture

┌─────────────────────────────────────────────────────────┐
│              Test Case Definition                       │
│  (High-level steps like "Create org", "Add employee")   │
└────────────────┬────────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────────┐
│          Documentation Retrieval (RAG)                  │
│  - Local docs (Vector DB)                               │
│  - Xero online docs (Web Search)                        │
│  - Execution history (Learning)                         │
└────────────────┬────────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────────┐
│             Step Planning (LLM)                         │
│  Converts: "Create org" → Detailed Skyvern steps        │
└────────────────┬────────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────────┐
│          Skyvern Execution Engine                       │
│  - Computer Vision + DOM                                │
│  - Network monitoring                                   │
│  - Visual stability checks                              │
│  - Screenshot capture                                   │
└────────────────┬────────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────────┐
│           Result Validation                             │
│  Compare actual vs expected results                     │
└────────────────┬────────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────────┐
│          Test Report Generation                         │
│  HTML reports, JSON data, screenshots                   │
└─────────────────────────────────────────────────────────┘
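
The flow in the diagram can be read as a chain of stages that each enrich a shared state. The following is a minimal sketch of that idea only: `RunState` and every stage function are hypothetical stand-ins (no RAG lookup, LLM, or Skyvern call happens here), and the report stage is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class RunState:
    """Hypothetical container passed between stages; not the project's real model."""
    steps: list[str]
    context: dict[str, str] = field(default_factory=dict)
    plan: list[str] = field(default_factory=list)
    actual: dict[str, bool] = field(default_factory=dict)
    passed: bool = False


def retrieve(state: RunState) -> RunState:
    # Stand-in for documentation retrieval: one placeholder snippet per step.
    state.context = {step: f"docs for '{step}'" for step in state.steps}
    return state


def plan(state: RunState) -> RunState:
    # Stand-in for the LLM planner: expand each step into two detailed actions.
    for step in state.steps:
        state.plan += [f"open page for: {step}", f"submit form for: {step}"]
    return state


def execute(state: RunState) -> RunState:
    # Stand-in for the execution engine: pretend every step succeeded.
    state.actual = {step: True for step in state.steps}
    return state


def validate(state: RunState, expected: dict[str, bool]) -> RunState:
    # A run passes only if every expected key is present with the expected value.
    state.passed = all(state.actual.get(k) == v for k, v in expected.items())
    return state


def run_pipeline(steps: list[str], expected: dict[str, bool]) -> RunState:
    state = RunState(steps=list(steps))
    for stage in (retrieve, plan, execute):
        state = stage(state)
    return validate(state, expected)
```

Keeping each stage as a plain function over one state object makes it easy to swap a stand-in for a real implementation one stage at a time.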

📦 Installation

Prerequisites

Option A: Docker (Recommended)

  • Docker Desktop (macOS/Windows) or Docker Engine (Linux)
  • Docker Compose v2.0+
  • 4GB+ RAM, 10GB+ disk space

Option B: Manual Setup

  • Python 3.10 or higher
  • Git
  • Skyvern Open Source running locally (see SKYVERN.md)
    • Docker and Docker Compose (recommended), OR
    • PostgreSQL and Redis (for manual setup)

Quick Start with Docker (Recommended)

# Clone repository
git clone <repository-url>
cd agent-carter

# Configure environment
cp .env.example .env
# Edit .env with your API keys

# Start all services (includes Skyvern)
docker-compose up -d

# Run POC test
docker-compose exec xero-qa-agent python examples/poc_test.py

# View results in ./reports/ and ./screenshots/

For detailed Docker instructions, see DOCKER.md

Quick Start - Manual Setup (Linux/macOS)

# Clone repository
git clone <repository-url>
cd agent-carter

# Run setup script
chmod +x setup.sh
./setup.sh

# Activate virtual environment
source venv/bin/activate

Quick Start - Manual Setup (Windows)

REM Clone repository
git clone <repository-url>
cd agent-carter

REM Run setup script
setup.bat

REM Activate virtual environment
venv\Scripts\activate.bat

Manual Installation

# 1. Setup Skyvern (run in separate terminal/directory)
# See SKYVERN.md for detailed instructions
git clone https://github.com/Skyvern-AI/skyvern.git
cd skyvern
docker-compose up -d  # Starts Skyvern at http://localhost:8000

# 2. Setup Xero QA Agent (in this directory)
# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate.bat

# Install dependencies
pip install -r requirements.txt

# Install Playwright browsers
playwright install chromium

# Copy environment file
cp .env.example .env

# Edit .env and add your API keys

⚙️ Configuration

Edit the .env file with your credentials:

# Required: Skyvern Open Source (must be running locally)
SKYVERN_BASE_URL=http://localhost:8000
SKYVERN_API_TOKEN=  # Leave empty for local development

# Required: LLM Provider (choose one)
ANTHROPIC_API_KEY=your_anthropic_key_here
# OR
OPENAI_API_KEY=your_openai_key_here

# Required: Xero Test Credentials
[email protected]
XERO_TEST_PASSWORD=your_test_password

# Optional: Customize settings
LLM_PROVIDER=anthropic  # or "openai"
HEADLESS_MODE=false     # Set true for headless browser
LOG_LEVEL=INFO

Important: Ensure Skyvern is running at http://localhost:8000 before running tests. See SKYVERN.md for setup.
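
A quick pre-flight check can catch missing settings before a test run starts. The sketch below is an illustration, not the project's own config loader: `check_config` is a hypothetical helper, and treating `anthropic` as the default provider when `LLM_PROVIDER` is unset is an assumption.

```python
import os

# Which API key each provider needs, per the variables listed above.
LLM_KEYS = {"anthropic": "ANTHROPIC_API_KEY", "openai": "OPENAI_API_KEY"}


def check_config(env=None) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    env = os.environ if env is None else env
    problems = []

    if not env.get("SKYVERN_BASE_URL"):
        problems.append("SKYVERN_BASE_URL is not set")

    # Assumption: fall back to "anthropic" when LLM_PROVIDER is absent.
    provider = env.get("LLM_PROVIDER", "anthropic").strip().lower()
    key_name = LLM_KEYS.get(provider)
    if key_name is None:
        problems.append(f"LLM_PROVIDER must be one of {sorted(LLM_KEYS)}")
    elif not env.get(key_name):
        problems.append(f"{key_name} is required for LLM_PROVIDER={provider}")

    if not env.get("XERO_TEST_PASSWORD"):
        problems.append("XERO_TEST_PASSWORD is not set")

    return problems
```

Calling `check_config()` with no argument inspects the current process environment; passing a plain dict makes the check easy to exercise without touching real credentials.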

📚 Documentation Setup

Add Local Documentation

Create Xero documentation files in the xero_docs/ directory:

xero_docs/
├── payroll/
│   ├── organisation_setup.md
│   ├── employee_management.md
│   ├── leave_types.md
│   └── payslip_generation.md
├── api/
│   └── payroll_api.md
└── troubleshooting/
    └── common_issues.md

Index Documentation

python examples/index_docs.py

The agent will also automatically search Xero online documentation when needed.
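
To make the retrieval idea concrete, here is a deliberately simple sketch that loads the Markdown files under `xero_docs/` and ranks them by keyword overlap with a query. It is a toy stand-in: the project describes a vector database, which this does not use, and `load_docs` and `rank_docs` are hypothetical names.

```python
from pathlib import Path


def load_docs(root: str) -> dict[str, str]:
    """Map each Markdown file's relative path to its text."""
    base = Path(root)
    return {
        path.relative_to(base).as_posix(): path.read_text(encoding="utf-8")
        for path in sorted(base.rglob("*.md"))
    }


def rank_docs(query: str, docs: dict[str, str], top_k: int = 3) -> list[str]:
    """Rank documents by how many distinct query words they contain."""
    terms = set(query.lower().split())
    scored = []
    for name, text in docs.items():
        score = len(terms & set(text.lower().split()))
        if score:
            scored.append((score, name))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:top_k]]
```

A real index would typically split files into chunks and compare embeddings rather than raw words, but the interface (query in, ranked sources out) stays the same.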

🚀 Usage

With Docker

# Run POC test
docker-compose exec xero-qa-agent python examples/poc_test.py

# Run simple test
docker-compose exec xero-qa-agent python examples/simple_test.py

# Run full test suite
docker-compose exec xero-qa-agent python examples/full_test_suite.py

# Index documentation
docker-compose exec xero-qa-agent python examples/index_docs.py

# View logs
docker-compose logs -f xero-qa-agent

# Stop services
docker-compose down

Without Docker (Manual Setup)

Run POC Test Case

The POC test demonstrates the core functionality:

  • Create an AU organisation
  • Add an employee with annual leave (paid) and carer leave (unpaid)
  • Generate a payslip and check its number

python examples/poc_test.py

Run Simple Test

python examples/simple_test.py

Run Full Test Suite

python examples/full_test_suite.py

Create Custom Test

import asyncio
from xero_qa_agent.agent import XeroQAAgent
from xero_qa_agent.core.models import TestCase

async def my_test():
    # Define test case
    test = TestCase(
        id="TC_CUSTOM_001",
        description="My custom test",
        steps=[
            "Navigate to Xero",
            "Create an AU organisation",
            "Add an employee"
        ],
        expected_results={
            "organisation_created": True,
            "employee_created": True
        }
    )

    # Initialize agent
    agent = XeroQAAgent()

    # Execute test
    report = await agent.execute_test_case(test)

    # Check results
    print(f"Status: {report.overall_status.value}")
    print(f"Steps: {report.passed_steps}/{report.total_steps} passed")

asyncio.run(my_test())

📊 Test Reports

After running tests, check the generated reports:

  • HTML Reports: ./reports/*.html - Detailed visual reports with screenshots
  • JSON Reports: ./reports/*.json - Machine-readable test data
  • Screenshots: ./screenshots/*.png - Step-by-step screenshots
  • Logs: ./logs/xero_qa_agent.log - Detailed execution logs
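
The JSON reports lend themselves to quick scripted summaries. The sketch below assumes each report is a JSON object whose keys mirror the TestReport attributes (`overall_status`, `passed_steps`, `total_steps`); the actual file schema is not documented here, so missing keys fall back to placeholders. `summarize_reports` is a hypothetical helper.

```python
import json
from pathlib import Path


def summarize_reports(report_dir: str = "./reports") -> list[tuple]:
    """Return (file name, status, passed steps, total steps) per JSON report."""
    rows = []
    for path in sorted(Path(report_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        if not isinstance(data, dict):
            # Skip files that are valid JSON but not a report object.
            continue
        rows.append((
            path.name,
            data.get("overall_status", "unknown"),  # assumed key name
            data.get("passed_steps"),               # assumed key name
            data.get("total_steps"),                # assumed key name
        ))
    return rows
```

For example, `for row in summarize_reports(): print(*row)` prints one line per report.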

🧪 Testing

The project includes example test cases demonstrating the agent's capabilities:

# Run POC test case
python examples/poc_test.py

# Run simple test
python examples/simple_test.py

# Run full test suite
python examples/full_test_suite.py

Note: Unit tests are planned but not yet included. The current examples serve as integration tests.

📚 Documentation

This README provides an overview. For detailed guides, see SKYVERN.md (Skyvern setup) and DOCKER.md (Docker instructions).

📖 API Documentation

XeroQAAgent

Main orchestrator class for test execution.

from xero_qa_agent.agent import XeroQAAgent

agent = XeroQAAgent(
    skyvern_api_key="...",  # Optional, uses .env if not provided
    llm_provider="anthropic",  # "anthropic" or "openai"
    llm_api_key="..."  # Optional, uses .env if not provided
)

# Execute test case (await must run inside an async function; see Create Custom Test)
report = await agent.execute_test_case(test_case)

# Index documentation
count = await agent.index_documentation("./xero_docs")

# Get documentation stats
stats = agent.get_documentation_stats()

TestCase Model

Define test cases with high-level steps.

from xero_qa_agent.core.models import TestCase

test = TestCase(
    id="TC_001",
    description="Test description",
    steps=[
        "High-level step 1",
        "High-level step 2"
    ],
    expected_results={
        "field_name": "expected_value",
        "numeric_field": 25.50,
        "pattern_field": {"pattern": r"PS-\d{4}"}
    },
    tags=["tag1", "tag2"],
    priority="high"
)
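
The three kinds of expected value shown above (exact value, number, and `{"pattern": ...}`) suggest a comparison along the following lines. This is a sketch under assumptions, not the project's `result_validator.py`: full-string regex matching and a small numeric tolerance are choices made here for illustration, and `matches` and `validate` are hypothetical names.

```python
import math
import re


def matches(expected, actual) -> bool:
    """Compare one expected value against the value observed during the run."""
    if isinstance(expected, dict) and "pattern" in expected:
        # Assumption: the whole string must match the pattern.
        return isinstance(actual, str) and re.fullmatch(expected["pattern"], actual) is not None
    if isinstance(expected, bool) or isinstance(actual, bool):
        # Keep booleans strict so that True does not match the number 1.
        return expected is actual
    if isinstance(expected, (int, float)) and isinstance(actual, (int, float)):
        # Assumption: tolerate rounding differences below half a cent.
        return math.isclose(expected, actual, abs_tol=0.005)
    return expected == actual


def validate(expected_results: dict, actual_results: dict) -> dict[str, bool]:
    """Return a pass/fail flag per expected field; missing fields fail."""
    return {
        name: name in actual_results and matches(value, actual_results[name])
        for name, value in expected_results.items()
    }
```

Returning a per-field result, rather than a single boolean, lets a report show which expectation failed.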

TestReport Model

Contains test execution results.

# Access report data
print(report.overall_status)  # PASSED, FAILED, ERROR
print(report.total_execution_time)  # Total time in seconds
print(report.passed_steps)  # Number of passed steps
print(report.failed_steps)  # Number of failed steps
print(report.validation_results)  # List of validation results
print(report.screenshots)  # List of screenshot paths

🏗️ Project Structure

agent-carter/
├── xero_qa_agent/          # Main package
│   ├── core/               # Core models and types
│   │   ├── config.py       # Configuration management
│   │   ├── models.py       # Data models
│   │   ├── types.py        # Type definitions
│   │   └── documentation.py # RAG system
│   ├── planners/           # Step planning
│   │   └── step_planner.py # LLM-based planner
│   ├── executors/          # Execution engines
│   │   └── skyvern_executor.py # Skyvern integration
│   ├── validators/         # Result validation
│   │   └── result_validator.py
│   ├── reporters/          # Report generation
│   │   └── report_generator.py
│   ├── utils/              # Utilities
│   │   ├── logger.py       # Logging setup
│   │   └── helpers.py      # Helper functions
│   └── agent.py            # Main orchestrator
├── examples/               # Example scripts
│   ├── poc_test.py         # POC test case
│   ├── simple_test.py      # Simple example
│   ├── full_test_suite.py  # Full test suite
│   └── index_docs.py       # Documentation indexer
├── tests/                  # Test files
├── xero_docs/              # Xero documentation
├── requirements.txt        # Python dependencies
├── pyproject.toml          # Package configuration
├── setup.sh                # Linux/macOS setup script
├── setup.bat               # Windows setup script
├── .env.example            # Environment template
└── README.md               # This file

🔧 Advanced Configuration

Network Monitoring

Configure network idle timeout and stability checks:

NETWORK_IDLE_TIMEOUT=5000      # ms to wait for network idle
VISUAL_STABILITY_CHECKS=3      # Number of stability checks
VISUAL_STABILITY_DELAY=500     # ms between checks
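
One plausible reading of these two visual settings is "poll the page until N consecutive snapshots are identical, waiting a fixed delay between polls". The sketch below implements that reading only; how the agent actually checks stability is not documented here. `wait_until_stable`, `snapshot`, and `max_polls` are hypothetical names, and `snapshot` can be anything comparable, such as a screenshot hash.

```python
import time
from typing import Callable, Hashable


def wait_until_stable(
    snapshot: Callable[[], Hashable],
    checks: int = 3,          # mirrors VISUAL_STABILITY_CHECKS
    delay_ms: int = 500,      # mirrors VISUAL_STABILITY_DELAY
    max_polls: int = 40,      # hypothetical upper bound to avoid waiting forever
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once `checks` consecutive polls match the previous snapshot."""
    previous = snapshot()
    stable = 0
    for _ in range(max_polls):
        sleep(delay_ms / 1000)
        current = snapshot()
        # Any change resets the counter, so stability must be consecutive.
        stable = stable + 1 if current == previous else 0
        previous = current
        if stable >= checks:
            return True
    return False
```

Injecting `sleep` keeps the function testable without real delays.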

Browser Settings

HEADLESS_MODE=false            # Run in headless mode
BROWSER_TIMEOUT=60000          # Browser timeout (ms)
SLOW_MO=100                    # Slow down browser actions (ms)

Retry Configuration

MAX_RETRIES=3                  # Maximum retry attempts
RETRY_DELAY_SECONDS=2          # Delay between retries
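
As an illustration of how these two values might drive a retry loop, here is a minimal async sketch. It assumes `MAX_RETRIES` counts total attempts and that the delay is fixed rather than exponential; the agent's real retry logic may differ, and `with_retries` is a hypothetical helper.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_retries(
    action: Callable[[], Awaitable[T]],
    max_retries: int = 3,        # mirrors MAX_RETRIES (assumed: total attempts)
    delay_seconds: float = 2.0,  # mirrors RETRY_DELAY_SECONDS
) -> T:
    """Await `action` until it succeeds or the attempts are used up."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(1, max_retries + 1):
        try:
            return await action()
        except Exception:
            if attempt == max_retries:
                # Out of attempts: surface the last error to the caller.
                raise
            await asyncio.sleep(delay_seconds)
    raise AssertionError("unreachable")
```

A call such as `await with_retries(lambda: agent.execute_test_case(test))` would wrap a whole test run, though retrying individual steps is usually cheaper.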

🐛 Troubleshooting

Common Issues

Issue: ModuleNotFoundError: No module named 'xero_qa_agent'

  • Solution: Run pip install -e . in the project root

Issue: Playwright browsers not found

  • Solution: Run playwright install chromium

Issue: ANTHROPIC_API_KEY is required

  • Solution: Edit the .env file and add your API key

Issue: Skyvern connection failed

  • Solution: Ensure Skyvern is running: run docker-compose ps in the Skyvern directory
  • Solution: Check Skyvern logs: docker-compose logs skyvern
  • Solution: Restart Skyvern: docker-compose restart
  • Solution: See SKYVERN.md for detailed setup
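
A small script can confirm whether anything is answering at the Skyvern URL before digging into logs. This sketch uses only the standard library and requests the base URL itself, since no specific health endpoint is documented here; `skyvern_reachable` is a hypothetical helper.

```python
import urllib.error
import urllib.request


def skyvern_reachable(base_url: str = "http://localhost:8000", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server responds at base_url, whatever the status."""
    # An empty ProxyHandler bypasses any system proxy, which suits a local service.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so it is running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return False
```

A `False` result points at the container or port mapping; a `True` result with failing tests points at configuration such as `SKYVERN_BASE_URL` or the API token.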

Issue: Tests fall back to Playwright

  • Solution: This is normal if Skyvern is not running. The agent automatically uses Playwright as a fallback
  • Note: For full CV+DOM capabilities, ensure Skyvern is running

Issue: Tests fail with network timeout

  • Solution: Increase NETWORK_IDLE_TIMEOUT in .env

Issue: No documentation found

  • Solution: Run python examples/index_docs.py to index docs

Debug Mode

Enable verbose logging:

LOG_LEVEL=DEBUG
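
For scripts that sit alongside the agent, the same variable can drive Python's standard logging. The sketch below is an assumption about usage, not the contents of the project's `logger.py`; `configure_logging` is a hypothetical helper that falls back to INFO for unknown values.

```python
import logging
import os


def configure_logging(env=None) -> int:
    """Configure the root logger from LOG_LEVEL and return the numeric level."""
    env = os.environ if env is None else env
    name = env.get("LOG_LEVEL", "INFO").strip().upper()
    level = getattr(logging, name, None)
    if not isinstance(level, int):
        # Unknown names (or non-level attributes) fall back to INFO.
        level = logging.INFO
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace handlers installed by earlier calls
    )
    return level
```

With `LOG_LEVEL=DEBUG` set, `configure_logging()` followed by `logging.getLogger(__name__).debug("...")` emits the message; at the default level it is suppressed.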

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

📞 Support

For issues and questions:

  • Create an issue on GitHub
  • Check the troubleshooting section
  • Review example scripts in examples/

Note: This is a POC (Proof of Concept) implementation. For production use, additional error handling, security measures, and comprehensive testing are recommended.
