This repository was archived by the owner on Nov 29, 2025. It is now read-only.

Releases: westonbrown/Cyber-AutoAgent

Release v0.1.3: React Terminal UI, Evaluation System, Architecture Refactor

06 Oct 21:23

Major release introducing a React-based terminal interface, an automated evaluation system, and a comprehensive architecture refactor based on operational experience and community feedback. See the detailed analysis at #41.

Key Changes

React Terminal Interface: Complete UI overhaul with interactive React/Ink terminal as default interface, providing guided setup, real-time operation monitoring, and enhanced user experience across all deployment modes

Evaluation & Observability: Integrated RAGAS evaluation system with 8 automated metrics (including tool selection accuracy, evidence quality, answer relevancy, and context precision), self-hosted Langfuse for complete operation tracing, and automated performance scoring

Architecture Refactor: Modular redesign with organized agents/, config/, handlers/, tools/, and evaluation/ directories for better maintainability and extensibility

Configuration System: Centralized configuration management with single source of truth, standardized output directories, and comprehensive environment variable support

Prompt Management: Advanced prompt system with optimization capabilities, Langfuse integration, automatic prompt rebuilding, and operation-specific tuning

Memory Enhancements: Improved Mem0 integration with per-target memory persistence, cross-operation learning, and enhanced evidence storage

LiteLLM Support: Universal model provider support for 100+ models from OpenAI, Anthropic, Google, Azure, and more through LiteLLM integration

Bearer Token Authentication: AWS Bedrock bearer token support for simplified authentication alongside traditional credentials

What's Changed

Frontend & UI:

  • Unified React terminal as default UI across all deployment modes by @westonbrown in #40
  • Multi-line text input and paste-aware components for enhanced interaction
  • Real-time event streaming with tool execution visibility
  • Theme improvements and banner display optimizations

Architecture & Configuration:

  • Standardized configuration inputs and centralized management by @konradsemsch in #22
  • Modular agent system with dedicated report agent
  • Standardized output directories for better organization by @konradsemsch in #24
  • Enhanced handler system with modular callbacks and event emission
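
The "single source of truth" idea behind the centralized configuration and standardized output directories can be sketched roughly as follows. This is a minimal illustration, not the project's actual config module: the `AgentConfig` class, the `EXAMPLE_*` environment variable names, the default values, and the `operation_dir` helper are all hypothetical.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical settings object; field names and defaults are illustrative."""
    provider: str = "bedrock"
    model_id: str = "example-model"
    output_dir: Path = Path("outputs")


def load_config(env: Optional[Mapping[str, str]] = None) -> AgentConfig:
    """Build one config object, letting environment variables override defaults."""
    env = os.environ if env is None else env
    defaults = AgentConfig()
    return AgentConfig(
        # The EXAMPLE_* variable names are assumptions made for this sketch.
        provider=env.get("EXAMPLE_PROVIDER", defaults.provider),
        model_id=env.get("EXAMPLE_MODEL_ID", defaults.model_id),
        output_dir=Path(env.get("EXAMPLE_OUTPUT_DIR", str(defaults.output_dir))),
    )


def operation_dir(config: AgentConfig, target: str, operation_id: str) -> Path:
    """Derive a per-operation output path from the single config object."""
    return config.output_dir / target / operation_id
```

Every component reads the same frozen object instead of parsing the environment itself, so a path or provider is defined in exactly one place.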

Observability & Evaluation:

  • Integrated self-hosted Langfuse with detailed documentation
  • RAGAS evaluation with 8 automated metrics
  • Comprehensive trace parsing and performance monitoring
  • Automated evaluation pipeline

Testing & Quality:

  • Suite of 245 tests covering config, memory, prompts, and evaluation
  • Extensive integration tests for memory-aware prompts
  • Prompt optimizer and loader test coverage
  • CI improvements and dependency management

Documentation:

  • Complete architecture documentation with diagrams
  • User guide with deployment and troubleshooting
  • Memory system guide with backend configuration
  • Observability and evaluation documentation
  • Prompt management and optimization guides

Tools & Memory:

  • Enhanced Mem0 memory tool with improved error handling
  • Cross-operation memory persistence
  • Per-target memory storage
  • Report builder with filtering capabilities
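
Per-target storage and cross-operation persistence can be pictured with a small file-backed store. The sketch below is only an analogy for the behavior described above and does not use the Mem0 API: the `TargetMemory` class, its file layout, and the entry fields are hypothetical.

```python
import json
from pathlib import Path
from typing import List


class TargetMemory:
    """Hypothetical per-target store; a stand-in for illustration, not Mem0."""

    def __init__(self, root: Path, target: str) -> None:
        # Keep the target name filesystem-safe so each target gets its own folder.
        safe = "".join(c if c.isalnum() or c in "-._" else "_" for c in target)
        self.path = root / safe / "memory.json"

    def load(self) -> List[dict]:
        """Return all entries stored for this target, oldest first."""
        if not self.path.exists():
            return []
        return json.loads(self.path.read_text(encoding="utf-8"))

    def add(self, finding: str, operation_id: str) -> None:
        """Append a finding and write the whole list back to disk."""
        entries = self.load()
        entries.append({"operation": operation_id, "finding": finding})
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
```

Because the store is keyed by target rather than by operation, a later operation against the same target starts with the earlier findings already loaded, while a different target starts empty.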

Release Metrics

  • 183 commits with iterative improvements
  • 5,794 files changed: +1,116,955 insertions, -6,209 deletions
  • 245 tests passing with extensive coverage
  • Significant release reflecting substantial architectural changes

New Contributors

Special thanks to @konradsemsch for extensive architecture improvements and @aggr0cr4g for critical bug fixes.

Full Changelog: v0.1.1...v0.1.3

Release v0.1.1

11 Jul 02:45

Improved Strands framework integration, enhanced memory management, and added Docker support and local model support through Ollama, based on the v0.1 failure mode analysis: https://github.com/westonbrown/Cyber-AutoAgent/discussions/12

Key Changes

  • Local Model Support: Added Ollama integration for fully offline operation with configurable model endpoints
  • New Strands Tools: Integrated swarm tools and migrated to the Mem0 memory system for better agent orchestration. Added a stop tool for explicit agent termination control, with reason tracking for safer operations
  • System Prompts: Overhauled prompts based on failure mode analysis to improve agent reliability
  • CI/CD & Docker: Added GitHub Actions workflows and optimized Docker support for containerized deployments

What's Changed

  • Dockerizes the app and improves packaging, evidence persistence handling, and report generation by @konradsemsch in #1
  • Add essential network tools to Docker container by @konradsemsch in #9
  • Fix Docker networking compatibility with automatic Ollama host detection by @konradsemsch in #10
  • Add initial benchmark testing framework by @aggr0cr4g in #13
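
One plausible way to approach automatic Ollama host detection is to probe a short list of candidate hosts and use the first one that accepts a connection. The sketch below assumes Ollama's default port 11434 and Docker's `host.docker.internal` name for reaching the host from a container. The function names and the candidate list are hypothetical, and the project's actual detection logic may differ.

```python
import socket
from typing import Callable, Iterable, Optional

OLLAMA_PORT = 11434  # Ollama's default API port

# Assumed candidates: localhost for native runs, host.docker.internal for
# containers that need to reach a server running on the Docker host.
CANDIDATE_HOSTS = ("localhost", "host.docker.internal")


def can_connect(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def detect_ollama_host(
    candidates: Iterable[str] = CANDIDATE_HOSTS,
    probe: Callable[[str, int], bool] = can_connect,
) -> Optional[str]:
    """Return the base URL of the first reachable candidate, or None."""
    for host in candidates:
        if probe(host, OLLAMA_PORT):
            return f"http://{host}:{OLLAMA_PORT}"
    return None
```

Passing the probe as a parameter keeps the selection logic testable without a running Ollama server.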

Full Changelog: v0.1...v0.1.1

v0.1

08 Jun 21:26

First release of Cyber-AutoAgent, an autonomous cybersecurity assessment tool powered by AWS Bedrock and the Strands framework.

Key Features

  • Autonomous Security Assessment: Performs comprehensive penetration testing with minimal human intervention
  • Multi-Tool Integration: Built-in support for industry-standard tools (nmap, nikto, sqlmap, gobuster, metasploit, and more)
  • Intelligent Evidence Collection: Automatic documentation and memory management of findings
  • Meta-Tool Creation: Dynamically creates custom tools during assessments when needed

Technical Highlights

  • Built on Strands framework (v0.1.6+) for robust agent orchestration
  • FAISS vector store for efficient evidence retrieval
  • Mem0 integration for persistent memory across operations

Important Notes

  • This is experimental software intended for authorized security testing only