🛡️ AI Safety & Security: Add 3 Elite-Tier Security Projects

This PR adds 3 elite-tier open-source AI security projects to Section 10 (AI Safety, Alignment & Interpretability), all verified to meet the criteria of 1000+ GitHub stars, active development, and OSI-approved licenses.

Projects Added

| Project | Stars | License | Category | Description |
|---------|-------|---------|----------|-------------|
| Rebuff | 1,471+ | Apache-2.0 | Prompt Injection Detection | LLM prompt injection detector with canary word detection. Detects and prevents prompt leakage attacks by embedding invisible canary tokens in prompts and monitoring for their exposure in model outputs. |
| RedAmon | 1,836+ | MIT | Agentic Red Teaming | AI-powered agentic red team framework that automates offensive security operations from reconnaissance to exploitation to post-exploitation with zero human intervention. Integrates multiple security tools for comprehensive penetration testing. |
| CAI | 8,384+ | MIT | Cybersecurity AI Framework | Cybersecurity AI framework for semi- and fully-automating offensive and defensive security tasks. Purpose-built for cybersecurity use cases with agent-based architecture for vulnerability assessment and security operations. |

Elite Criteria Verification

All projects verified:

  • Stars: 1,471+ / 1,836+ / 8,384+ (all above 1,000 threshold)
  • Activity: All actively maintained (last commits within 3 months)
  • License: Apache-2.0 and MIT (OSI-approved)
  • Quality: Production-ready with comprehensive documentation
  • Unique: Not already in the repository
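
For reviewers who want to re-check the measurable criteria above, here is a minimal sketch of the star, activity, and license checks. It is not a script from this repository: the function name `meets_criteria`, the 90-day approximation of "3 months", and the dates in the usage lines are assumptions made for illustration. The star counts and license identifiers are the ones listed in this PR.

```python
from datetime import date, timedelta

# Only the two licenses that appear in this PR; a real check would use the full OSI list.
OSI_LICENSES = {"Apache-2.0", "MIT"}
MIN_STARS = 1000
# Assumption: "within 3 months" is approximated as 90 days.
MAX_COMMIT_AGE = timedelta(days=90)


def meets_criteria(stars: int, license_id: str, last_commit: date, today: date) -> bool:
    """Return True if a project passes the star, license, and activity checks."""
    return (
        stars >= MIN_STARS
        and license_id in OSI_LICENSES
        and today - last_commit <= MAX_COMMIT_AGE
    )


# Hypothetical dates, chosen only to exercise the function.
today = date(2025, 6, 1)
print(meets_criteria(1471, "Apache-2.0", date(2025, 5, 1), today))  # recent commit
print(meets_criteria(8384, "MIT", date(2024, 1, 1), today))         # stale commit
```

The quality and uniqueness criteria are judgment calls and are not covered by this sketch.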

Why These Projects?

These additions fill important gaps in the AI security landscape:

  1. Rebuff adds specialized prompt injection detection via canary words - a unique defense mechanism not covered by existing tools
  2. RedAmon provides fully autonomous agentic red teaming - going beyond semi-automated tools to fully hands-off penetration testing
  3. CAI offers comprehensive cybersecurity automation for both offensive and defensive security tasks; its 8,000+ stars indicate strong community adoption
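
To illustrate the canary-word idea mentioned for Rebuff, here is a minimal sketch of the general technique: embed a random token in the prompt, then check whether it shows up in the model output. This is not Rebuff's actual API; the function names `add_canary` and `is_leaked` and the comment-style wrapper are hypothetical and chosen only for this example.

```python
import secrets


def add_canary(prompt_template: str) -> tuple[str, str]:
    """Prepend a random canary token to a prompt; return (guarded_prompt, canary)."""
    canary = secrets.token_hex(8)  # 16 hex characters, unlikely to occur by chance
    guarded = f"<!-- {canary} -->\n{prompt_template}"
    return guarded, canary


def is_leaked(model_output: str, canary: str) -> bool:
    """Report whether the canary appears in the output, which suggests prompt leakage."""
    return canary in model_output


guarded_prompt, canary = add_canary("You are a helpful assistant.")

# Simulated outputs: one benign, one that echoes the guarded prompt verbatim.
print(is_leaked("Hello! How can I help?", canary))
print(is_leaked(f"My instructions were: {guarded_prompt}", canary))
```

A plain substring check like this only catches verbatim leaks; a paraphrased or re-encoded prompt would pass unnoticed.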

Checklist

  • Verified all projects have 1000+ stars
  • Verified OSI-approved open source licenses
  • Verified active development (commits within last 3 months)
  • Added to the appropriate section (Adversarial & Red-teaming Tools)
  • Follows existing formatting and style
  • Projects not already in repository