Releases: westonbrown/Cyber-AutoAgent
Release v0.1.3: React Terminal UI, Evaluation System, Architecture Refactor
Major release introducing a React-based terminal interface, an automated evaluation system, and a comprehensive architecture refactor based on operational experience and community feedback. See the detailed analysis in #41.
Key Changes
React Terminal Interface: Complete UI overhaul with an interactive React/Ink terminal as the default interface, providing guided setup, real-time operation monitoring, and an improved user experience across all deployment modes
Evaluation & Observability: Integrated RAGAS evaluation system with 8 automated metrics (including tool selection accuracy, evidence quality, answer relevancy, and context precision), self-hosted Langfuse for complete operation tracing, and automated performance scoring
Architecture Refactor: Modular redesign with organized agents/, config/, handlers/, tools/, and evaluation/ directories for better maintainability and extensibility
Configuration System: Centralized configuration management with single source of truth, standardized output directories, and comprehensive environment variable support
Prompt Management: Advanced prompt system with optimization capabilities, Langfuse integration, automatic prompt rebuilding, and operation-specific tuning
Memory Enhancements: Improved Mem0 integration with per-target memory persistence, cross-operation learning, and enhanced evidence storage
LiteLLM Support: Universal model provider support for 100+ models from OpenAI, Anthropic, Google, Azure, and more through LiteLLM integration
Bearer Token Authentication: AWS Bedrock bearer token support for simplified authentication alongside traditional credentials
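The configuration and authentication changes above can be illustrated with a minimal sketch. It is not the project's actual code: the `AgentConfig` class, the `EXAMPLE_*` environment variables, the defaults, and the per-target directory layout are all hypothetical. `AWS_BEARER_TOKEN_BEDROCK` is assumed here to be the variable that carries the Bedrock bearer token, and the precedence of the token over traditional credentials is likewise an assumption.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical defaults, chosen for illustration only.
DEFAULT_OUTPUT_DIR = "outputs"
DEFAULT_MODEL_ID = "example-provider/example-model"


@dataclass(frozen=True)
class AgentConfig:
    """Single, immutable source of truth for one run (hypothetical shape)."""

    model_id: str
    output_dir: Path
    auth_mode: str  # "bearer_token" or "credentials"

    def operation_dir(self, target: str, operation_id: str) -> Path:
        """Return a per-target, per-operation output directory."""
        # Replace characters that are awkward in directory names.
        safe_target = "".join(
            c if c.isalnum() or c in "-_." else "_" for c in target
        )
        return self.output_dir / safe_target / operation_id


def load_config(env: Optional[Mapping[str, str]] = None) -> AgentConfig:
    """Build the config once from the environment, falling back to defaults."""
    env = os.environ if env is None else env
    # Assumption: a bearer token, when present, takes precedence over
    # traditional AWS credentials.
    has_token = bool(env.get("AWS_BEARER_TOKEN_BEDROCK"))
    return AgentConfig(
        model_id=env.get("EXAMPLE_MODEL_ID", DEFAULT_MODEL_ID),
        output_dir=Path(env.get("EXAMPLE_OUTPUT_DIR", DEFAULT_OUTPUT_DIR)),
        auth_mode="bearer_token" if has_token else "credentials",
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test, and freezing the dataclass prevents other modules from mutating settings after startup.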
What's Changed
Frontend & UI:
- Unified React terminal as default UI across all deployment modes by @westonbrown in #40
- Multi-line text input and paste-aware components for enhanced interaction
- Real-time event streaming with tool execution visibility
- Theme improvements and banner display optimizations
Architecture & Configuration:
- Standardized configuration inputs and centralized management by @konradsemsch in #22
- Modular agent system with dedicated report agent
- Standardized output directories for better organization by @konradsemsch in #24
- Enhanced handler system with modular callbacks and event emission
Observability & Evaluation:
- Integrated self-hosted Langfuse with detailed documentation
- RAGAS evaluation with 8 automated metrics
- Comprehensive trace parsing and performance monitoring
- Automated evaluation pipeline
Testing & Quality:
- Suite of 245 tests covering config, memory, prompts, and evaluation
- Extensive integration tests for memory-aware prompts
- Prompt optimizer and loader test coverage
- CI improvements and dependency management
Documentation:
- Complete architecture documentation with diagrams
- User guide with deployment and troubleshooting
- Memory system guide with backend configuration
- Observability and evaluation documentation
- Prompt management and optimization guides
Tools & Memory:
- Enhanced Mem0 memory tool with improved error handling
- Cross-operation memory persistence
- Per-target memory storage
- Report builder with filtering capabilities
Contributors:
- Container improvements and security documentation by @konradsemsch in #21
- OpenSearch SigV4 service name fix by @aggr0cr4g in #24
- Multiple improvements and refactors by @konradsemsch
- Core framework and evaluation by @westonbrown
Release Metrics
- 183 commits with iterative improvements
- 5,794 files changed: +1,116,955 insertions, -6,209 deletions
- 245 tests passing with extensive coverage
- Substantial architectural changes, shipped as v0.1.3 (previous release: v0.1.1)
New Contributors
Special thanks to @konradsemsch for extensive architecture improvements and @aggr0cr4g for critical bug fixes.
Full Changelog: v0.1.1...v0.1.3
Release v0.1.1
Improved Strands framework integration, enhanced memory management, and added Docker support and local model support through Ollama, based on the v0.1 failure mode analysis: https://github.com/westonbrown/Cyber-AutoAgent/discussions/12
Key Changes
- Local Model Support: Added Ollama integration for fully offline operation with configurable model endpoints
- New Strands Tools: Integrated swarm tools and migrated to the Mem0 memory system for better agent orchestration; added a stop tool for explicit agent termination control, with reason tracking for safer operations
- System Prompts: Overhauled prompts based on failure mode analysis to improve agent reliability
- CI/CD & Docker: Added GitHub Actions workflows and optimized Docker support for containerized deployments
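As a rough illustration of the stop tool idea in the list above, the following sketch shows explicit termination with a recorded reason in plain Python. It does not use the Strands API, and `StopState`, `make_stop_tool`, and `run_steps` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StopState:
    """Records whether the agent was asked to stop, and why."""

    reason: Optional[str] = None

    @property
    def stopped(self) -> bool:
        return self.reason is not None


def make_stop_tool(state: StopState) -> Callable[[str], str]:
    """Return a hypothetical `stop` tool bound to the given state."""

    def stop(reason: str) -> str:
        # Requiring a reason keeps every termination auditable.
        if not reason.strip():
            raise ValueError("a non-empty reason is required to stop")
        state.reason = reason.strip()
        return f"Stopping: {state.reason}"

    return stop


def run_steps(steps: List[Callable[[], None]], state: StopState) -> int:
    """Run steps in order, halting as soon as the stop tool has been called."""
    executed = 0
    for step in steps:
        if state.stopped:
            break
        step()
        executed += 1
    return executed
```

For example, if the second of three steps calls `stop("objective reached")`, the loop executes two steps and `state.reason` holds the explanation for later reporting.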
What's Changed
- Dockerizes the app, improves packaging and handling of evidence persistence as well as report generation by @konradsemsch in #1
- Add essential network tools to Docker container by @konradsemsch in #9
- Fix Docker networking compatibility with automatic Ollama host detection by @konradsemsch in #10
- Add initial benchmark testing framework by @aggr0cr4g in #13
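The automatic Ollama host detection mentioned in the list above could work along these lines. This is a sketch under stated assumptions, not the code from #10: the candidate list, its order, and the function names are hypothetical. Port 11434 is Ollama's default, and `host.docker.internal` is the hostname Docker Desktop exposes for reaching the host machine from a container.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

# Candidate hosts and their order are assumptions for illustration.
CANDIDATE_HOSTS: Tuple[str, ...] = ("localhost", "host.docker.internal")
OLLAMA_PORT = 11434  # Ollama's default listening port


def tcp_reachable(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def detect_ollama_host(
    candidates: Iterable[str] = CANDIDATE_HOSTS,
    port: int = OLLAMA_PORT,
    probe: Callable[[str, int], bool] = tcp_reachable,
) -> Optional[str]:
    """Return the base URL of the first reachable candidate, or None."""
    for host in candidates:
        if probe(host, port):
            return f"http://{host}:{port}"
    return None
```

The `probe` parameter is injectable so the selection logic can be exercised without a running Ollama server.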
New Contributors
- @konradsemsch made their first contribution in #1
- @aggr0cr4g made their first contribution in #13
Full Changelog: v0.1...v0.1.1
v0.1
First release of Cyber-AutoAgent, an autonomous cybersecurity assessment tool powered by AWS Bedrock and the Strands framework.
Key Features
- Autonomous Security Assessment: Performs comprehensive penetration testing with minimal human intervention
- Multi-Tool Integration: Built-in support for industry-standard tools (nmap, nikto, sqlmap, gobuster, metasploit, and more)
- Intelligent Evidence Collection: Automatic documentation and memory management of findings
- Meta-Tool Creation: Dynamically creates custom tools during assessments when needed
Technical Highlights
- Built on Strands framework (v0.1.6+) for robust agent orchestration
- FAISS vector store for efficient evidence retrieval
- Mem0 integration for persistent memory across operations
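To make the evidence retrieval idea concrete, here is a small pure-Python sketch of similarity search over stored findings. It deliberately swaps in brute-force cosine similarity rather than FAISS, the embeddings are made-up two-dimensional vectors, and the function names are hypothetical. A real vector store would index much higher-dimensional embeddings and avoid scanning every entry.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    store: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k stored findings most similar to the query vector."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in store]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy evidence store: (finding text, made-up embedding).
EVIDENCE = [
    ("open port 22", [1.0, 0.0]),
    ("sql injection in login form", [0.0, 1.0]),
    ("ssh banner discloses version", [0.9, 0.1]),
]
```

Calling `top_k([1.0, 0.0], EVIDENCE)` ranks the two SSH-related findings ahead of the SQL injection entry, because their vectors point in nearly the same direction as the query.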
Important Notes
- This is experimental software intended for authorized security testing only