v1.0.0

sanju-presidio released this 03 Mar 08:15
7359d94

Release Notes

Built-in support for leading vision-language models and screen-parsing tools:

  • Claude: Anthropic's advanced vision and reasoning model
  • OpenAI: GPT-4o with visual understanding capabilities
  • Gemini: Google's multimodal AI for computer interaction
  • OmniParser: Screen-parsing tool for pure vision-based GUI agents

AI-Powered Computer Control

  • Intelligent element detection and navigation
  • Automated verification and validation
  • Comprehensive test documentation with automated screenshot capture for each step
  • Integrated test case export with visual step-by-step documentation